Crate embellama


§Embellama

A high-performance Rust library for generating text embeddings using llama-cpp-2.

§Features

  • High-performance embedding generation using llama.cpp backend
  • Support for batch processing with parallel pre/post-processing
  • Thread-safe model management with Arc/RwLock
  • Comprehensive error handling
  • Builder pattern for configuration
  • Optional HTTP server with OpenAI-compatible API

§Example

use embellama::{EmbeddingEngine, EngineConfig};

fn main() -> Result<(), embellama::Error> {
    // Configure and build the engine
    let config = EngineConfig::builder()
        .with_model_path("models/all-MiniLM-L6-v2.gguf")
        .with_model_name("minilm")
        .with_context_size(512)
        .with_n_threads(4)
        .build()?;

    let engine = EmbeddingEngine::new(config)?;

    // Generate a single embedding
    let embedding = engine.embed("minilm", "Hello, world!")?;
    println!("Embedding dimension: {}", embedding.len());

    // Generate batch embeddings
    let texts = vec![
        "First document",
        "Second document",
        "Third document",
    ];
    let embeddings = engine.embed_batch("minilm", texts)?;
    println!("Generated {} embeddings", embeddings.len());

    Ok(())
}

Modules§

cache
Cache module for token and embedding caching
server
Server module for the Embellama HTTP API (feature-gated)

Structs§

BackendInfo
Backend information for diagnostics
BatchProcessor
Represents a batch of texts to be processed.
BatchProcessorBuilder
Builder for creating configured BatchProcessor instances.
CacheConfig
Configuration for caching system
CacheConfigBuilder
Builder for CacheConfig
EmbeddingEngine
The main entry point for the embellama library.
EmbeddingModel
Represents a loaded embedding model.
EngineConfig
Configuration for the embedding engine
EngineConfigBuilder
Builder for creating EngineConfig instances
GGUFMetadata
GGUF model metadata extracted from the file header
ModelConfig
Configuration for a single model
ModelConfigBuilder
Builder for creating ModelConfig instances
ModelInfo
Information about a loaded model.
VersionInfo
Library version information

Enums§

BackendType
Available backend types for llama.cpp
Error
Custom error type for the embellama library
NormalizationMode
Normalization mode for embedding vectors
PoolingStrategy
Pooling strategy for combining token embeddings

Constants§

VERSION
Library version

Functions§

clear_metadata_cache
Clear the GGUF metadata cache
detect_best_backend
Detect the best available backend based on compile-time features and runtime capabilities
extract_gguf_metadata
Extract comprehensive metadata from a GGUF file
get_compiled_backend
Get the currently compiled backend based on build features
init
Initialize the library with default tracing subscriber
init_with_env_filter
Initialize the library with a custom environment filter
metadata_cache_size
Get the current size of the GGUF metadata cache
version_info
Get version information
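The initialization and diagnostics functions listed above can be combined at startup. A minimal sketch, assuming `init`, `detect_best_backend`, and `version_info` take no arguments and that the returned `BackendType` and `VersionInfo` implement `Debug` (signatures not shown on this page):

```rust
use embellama::{detect_best_backend, init, version_info};

fn main() {
    // Install the default tracing subscriber before other library calls.
    init();

    // Report which llama.cpp backend this build will select.
    let backend = detect_best_backend();
    println!("Selected backend: {backend:?}");

    // Capture version details, e.g. for logs or bug reports.
    let info = version_info();
    println!("Version: {info:?}");
}
```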

Type Aliases§

Result
Type alias for Results in this crate
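Because the crate exposes its own `Result` alias alongside the `Error` enum, fallible calls can be propagated with `?` inside helpers and handled with an ordinary `match` at the boundary. A minimal sketch, assuming `Result<T>` aliases `std::result::Result<T, Error>` and that `Error` implements `Display`:

```rust
use embellama::{EmbeddingEngine, EngineConfig};

// Propagate configuration and load errors with `?` via the crate's alias.
fn load_engine(path: &str) -> embellama::Result<EmbeddingEngine> {
    let config = EngineConfig::builder()
        .with_model_path(path)
        .with_model_name("minilm")
        .build()?;
    EmbeddingEngine::new(config)
}

fn main() {
    // Handle the error once, at the application boundary.
    match load_engine("models/all-MiniLM-L6-v2.gguf") {
        Ok(_engine) => { /* call _engine.embed(...) here */ }
        Err(e) => eprintln!("failed to load model: {e}"),
    }
}
```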