Crate embedcache

EmbedCache - High-performance text embedding library with caching capabilities

This library generates text embeddings with a choice of embedding models and caches the results, so text that has already been embedded does not have to be recomputed.

§Features

  • Multiple embedding models (BGE, MiniLM, Nomic, etc.)
  • Modular text chunking strategies with extensible trait-based architecture
  • SQLite-based caching
  • Asynchronous operation

§Installation

Add this to your Cargo.toml:

[dependencies]
embedcache = "0.1.0"

§Examples

§Using as a library

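use embedcache::{FastEmbedder, Embedder};
use fastembed::{TextEmbedding, InitOptions, EmbeddingModel};

// The `?` and `.await` below require an async function that returns a `Result`.

// Initialize a fastembed model directly, showing download progress.
// Note: `model` is not used further down; `FastEmbedder` is configured
// through its own `options` field.
let model = TextEmbedding::try_new(InitOptions {
    model_name: EmbeddingModel::BGESmallENV15,
    show_download_progress: true,
    ..Default::default()
})?;

// Build the embedder from the options for the chosen model.
let embedder = FastEmbedder {
    options: InitOptions::new(EmbeddingModel::BGESmallENV15),
};

let texts = vec![
    "This is an example sentence.".to_string(),
    "Another example sentence for embedding.".to_string(),
];

// Embed all texts in one call.
let embeddings = embedder.embed(&texts).await?;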

§Implementing custom chunking strategies

use embedcache::ContentChunker;
use async_trait::async_trait;

struct MyCustomChunker;

#[async_trait]
impl ContentChunker for MyCustomChunker {
    async fn chunk(&self, content: &str, size: usize) -> Vec<String> {
        // Your custom chunking logic here.
        // This placeholder ignores `size` and returns the whole input as one chunk.
        vec![content.to_string()]
    }

    fn name(&self) -> &str {
        "my-custom-chunker"
    }
}
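
As a rough illustration of what the body of `chunk` might contain, the sketch below splits text into groups of at most `size` whitespace-separated words. The helper name `chunk_by_words` is hypothetical and not part of embedcache, and reading `size` as a word count is an assumption: the trait signature above does not say whether `size` counts words, characters, or tokens.

```rust
/// Hypothetical helper (not part of embedcache): split `content` into
/// chunks of at most `size` whitespace-separated words.
/// Assumption: `size` is a word count. A `size` of 0 is treated as 1,
/// because `slice::chunks` panics on a chunk size of 0.
fn chunk_by_words(content: &str, size: usize) -> Vec<String> {
    let words: Vec<&str> = content.split_whitespace().collect();
    words
        .chunks(size.max(1))
        .map(|group| group.join(" "))
        .collect()
}

fn main() {
    // Print each chunk of a short sample sentence on its own line.
    for chunk in chunk_by_words("one two three four five", 2) {
        println!("{chunk}");
    }
}
```

A `ContentChunker` implementation could presumably call such a helper from `chunk` and return its result. Splitting on whitespace collapses runs of spaces and newlines, which may or may not be acceptable for a given corpus.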

§Running the embedcache service

The library also includes a binary that can be run as a standalone service:

cargo install embedcache
embedcache

Re-exports§

pub use crate::core::*;

Modules§

core