# Embellama

A high-performance Rust library for generating text embeddings using llama-cpp-2.

## Features
- High-performance embedding generation using llama.cpp backend
- Support for batch processing with parallel pre/post-processing
- Thread-safe model management with Arc/RwLock
- Comprehensive error handling
- Builder pattern for configuration
- Optional HTTP server with OpenAI-compatible API
## Example

```rust
use embellama::{EmbeddingEngine, EngineConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure and build the engine
    let config = EngineConfig::builder()
        .with_model_path("models/all-MiniLM-L6-v2.gguf")
        .with_model_name("minilm")
        .with_context_size(512)
        .with_n_threads(4)
        .build()?;

    let engine = EmbeddingEngine::new(config)?;

    // Generate a single embedding
    let embedding = engine.embed("minilm", "Hello, world!")?;
    println!("Embedding dimension: {}", embedding.len());

    // Generate batch embeddings
    let texts = vec![
        "First document",
        "Second document",
        "Third document",
    ];
    let embeddings = engine.embed_batch("minilm", texts)?;
    println!("Generated {} embeddings", embeddings.len());

    Ok(())
}
```

## Modules
- `cache` - Cache module for token and embedding caching
- `server` - Server module for the Embellama HTTP API (feature-gated)
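The HTTP server is optional, so it must be enabled at build time. A minimal manifest sketch, assuming the Cargo feature is named `server` (the feature name and version requirement are assumptions, not taken from this page):

```toml
[dependencies]
# Enable the optional OpenAI-compatible HTTP server.
# Feature name and version are assumptions; check the crate's own Cargo.toml.
embellama = { version = "*", features = ["server"] }
```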
## Structs

- `BackendInfo` - Backend information for diagnostics
- `BatchProcessor` - Represents a batch of texts to be processed
- `BatchProcessorBuilder` - Builder for creating configured `BatchProcessor` instances
- `CacheConfig` - Configuration for the caching system
- `CacheConfigBuilder` - Builder for `CacheConfig`
- `EmbeddingEngine` - The main entry point for the embellama library
- `EmbeddingModel` - Represents a loaded embedding model
- `EngineConfig` - Configuration for the embedding engine
- `EngineConfigBuilder` - Builder for creating `EngineConfig` instances
- `GGUFMetadata` - GGUF model metadata extracted from the file header
- `ModelConfig` - Configuration for a single model
- `ModelConfigBuilder` - Builder for creating `ModelConfig` instances
- `ModelInfo` - Information about a loaded model
- `VersionInfo` - Library version information
## Enums

- `BackendType` - Available backend types for llama.cpp
- `Error` - Custom error type for the embellama library
- `NormalizationMode` - Normalization mode for embedding vectors
- `PoolingStrategy` - Pooling strategy for combining token embeddings
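This page does not show how the pooling and normalization enums are wired into configuration; presumably they are set on the engine or model config builder. A hypothetical sketch, where the `with_pooling_strategy` / `with_normalization_mode` setter names and the `Mean` / `L2` variant names are all assumptions, not confirmed API:

```rust
use embellama::{EngineConfig, NormalizationMode, PoolingStrategy};

// Hypothetical: the two setters and both enum variants below are assumed,
// not taken from this documentation page.
let config = EngineConfig::builder()
    .with_model_path("models/all-MiniLM-L6-v2.gguf")
    .with_model_name("minilm")
    .with_pooling_strategy(PoolingStrategy::Mean)   // assumed variant name
    .with_normalization_mode(NormalizationMode::L2) // assumed variant name
    .build()?;
```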
## Constants

- `VERSION` - Library version
## Functions

- `clear_metadata_cache` - Clear the GGUF metadata cache
- `detect_best_backend` - Detect the best available backend based on compile-time features and runtime capabilities
- `extract_gguf_metadata` - Extract comprehensive metadata from a GGUF file
- `get_compiled_backend` - Get the currently compiled backend based on build features
- `init` - Initialize the library with the default tracing subscriber
- `init_with_env_filter` - Initialize the library with a custom environment filter
- `metadata_cache_size` - Get the current size of the GGUF metadata cache
- `version_info` - Get version information
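A short sketch tying the free functions together. The exact signatures are assumptions not confirmed by this page: it assumes `init()` takes no arguments, `version_info()` returns a `Debug`-printable `VersionInfo`, and `extract_gguf_metadata` takes a file path and returns the crate's `Result<GGUFMetadata>`:

```rust
use embellama::{clear_metadata_cache, extract_gguf_metadata, init,
                metadata_cache_size, version_info};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Install the default tracing subscriber (zero-argument signature assumed).
    init();

    // Inspect library/backend versions (Debug formatting assumed).
    println!("{:?}", version_info());

    // Read GGUF header metadata without loading the full model
    // (the path is illustrative).
    let meta = extract_gguf_metadata("models/all-MiniLM-L6-v2.gguf")?;
    println!("{meta:?}");

    // Parsed headers are cached; inspect and clear the cache.
    println!("cached entries: {}", metadata_cache_size());
    clear_metadata_cache();

    Ok(())
}
```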
## Type Aliases

- `Result` - Type alias for `Result`s in this crate