CLAUDEMEM

Local semantic code search for Claude Code. Index your codebase once, search it with natural language.

Install

# npm
npm install -g claude-codemem

# homebrew (macOS)
brew tap MadAppGang/claude-mem && brew install --cask claudemem

# or just curl it
curl -fsSL https://raw.githubusercontent.com/MadAppGang/claudemem/main/install.sh | bash

Why this exists

Claude Code's built-in search (grep/glob) works fine for exact matches. But when you're trying to find "where do we handle auth tokens" or "error retry logic" — good luck.

claudemem fixes that. It chunks your code using tree-sitter (so it actually understands functions/classes, not just lines), generates embeddings via OpenRouter, and stores everything locally in LanceDB.

The search combines keyword matching with vector similarity. Works surprisingly well for finding stuff you kinda-sorta remember but can't grep for.
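
For intuition, here's a rough sketch of one way to fuse keyword and vector scores. This is illustrative only; the weights, normalization, and function names are assumptions, not claudemem's actual implementation:

```typescript
// Illustrative hybrid scoring: normalize BM25 and cosine scores per query,
// then take a weighted sum. claudemem's real fusion logic may differ.
type Scored = { chunkId: string; score: number };

function normalize(results: Scored[]): Map<string, number> {
  const max = Math.max(...results.map((r) => r.score), 1e-9);
  return new Map(results.map((r) => [r.chunkId, r.score / max]));
}

// alpha balances keyword (BM25) relevance against semantic (vector) relevance.
function hybridRank(bm25: Scored[], vector: Scored[], alpha = 0.5): Scored[] {
  const kw = normalize(bm25);
  const sem = normalize(vector);
  const ids = new Set([...kw.keys(), ...sem.keys()]);
  return [...ids]
    .map((chunkId) => ({
      chunkId,
      score: alpha * (kw.get(chunkId) ?? 0) + (1 - alpha) * (sem.get(chunkId) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```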

Quick start

# first time setup
claudemem init

# index your project
claudemem index

# search
claudemem search "authentication flow"
claudemem search "where do we validate user input"

That's it. Changed some files? Just search again — it auto-reindexes modified files before searching.

Embedding Model Benchmark

Run your own benchmark with claudemem benchmark. Here are results on real code search tasks (NDCG measures ranking quality; higher is better):


| Model | Speed | NDCG | Cost | Notes |
|---|---|---|---|---|
| voyage-code-3 | 4.5s | 175% | $0.007 | Best quality |
| gemini-embedding-001 | 2.9s | 170% | $0.007 | Great free option |
| voyage-3-large | 1.8s | 164% | $0.007 | Fast & accurate |
| voyage-3.5-lite | 1.2s | 163% | $0.001 | Best value |
| voyage-3.5 | 1.2s | 150% | $0.002 | Fastest |
| mistral-embed | 16.6s | 150% | $0.006 | Slow |
| text-embedding-3-small | 3.0s | 141% | $0.001 | Decent |
| text-embedding-3-large | 3.1s | 141% | $0.005 | Not worth it |
| all-minilm-l6-v2 | 2.7s | 128% | $0.0001 | Cheapest (local) |

Summary:

  • 🏆 Best Quality: voyage-code-3 (175% NDCG)
  • Fastest: voyage-3.5 (1.2s)
  • 💰 Cheapest: all-minilm-l6-v2 (local, free)

Embedding providers

claudemem supports three embedding providers:

OpenRouter (cloud, default)

claudemem init  # select "OpenRouter"
# requires API key from https://openrouter.ai/keys
# ~$0.01 per 1M tokens

Ollama (local, free)

# install Ollama first: https://ollama.ai
ollama pull nomic-embed-text

claudemem init  # select "Ollama"

Recommended Ollama models:

  • nomic-embed-text — best quality, 768d, 274MB
  • mxbai-embed-large — large context, 1024d, 670MB
  • all-minilm — fastest, 384d, 46MB

Custom endpoint (local server)

claudemem init  # select "Custom endpoint"
# expects OpenAI-compatible /embeddings endpoint
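
For reference, here's a minimal sketch of the request/response shape an OpenAI-compatible embeddings endpoint uses (the base URL and model name below are placeholders for your local server, not claudemem defaults):

```typescript
// Minimal client for an OpenAI-compatible /embeddings endpoint.
// Base URL and model name are placeholders for your local server.
async function embed(texts: string[]): Promise<number[][]> {
  const res = await fetch("http://localhost:8080/v1/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "my-local-model", input: texts }),
  });
  if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
  const json = await res.json();
  // Response shape: { data: [{ index: number, embedding: number[] }, ...] }
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}
```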

View available models:

claudemem --models           # OpenRouter models
claudemem --models --ollama  # Ollama models

Using with Claude Code

Run it as an MCP server:

claudemem --mcp

Then Claude Code can use these tools:

  • search_code — semantic search (auto-indexes changes)
  • index_codebase — manual full reindex
  • get_status — check what's indexed
  • clear_index — start fresh
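
To register it with Claude Code itself, something like `claude mcp add claudemem -- claudemem --mcp` should work. And if you want to poke at the server outside Claude Code, here's a rough sketch using the MCP TypeScript SDK; the tool names come from the list above, but the `query` argument name is a guess, so check the tool schema if it doesn't match:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn `claudemem --mcp` as a stdio MCP server and call search_code.
const transport = new StdioClientTransport({ command: "claudemem", args: ["--mcp"] });
const client = new Client({ name: "claudemem-demo", version: "0.0.1" });
await client.connect(transport);

// The `query` argument name is an assumption; inspect the tool schema if unsure.
const result = await client.callTool({
  name: "search_code",
  arguments: { query: "where do we refresh auth tokens" },
});
console.log(result);
await client.close();
```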

VS Code autocomplete (experimental)

This repo also contains an experimental VS Code inline completion extension that talks to a persistent claudemem autocomplete server.

  • Autocomplete server: claudemem --autocomplete-server --project .
  • VS Code extension source: extensions/vscode-claudemem-autocomplete/

What it actually does

  1. Parses code with tree-sitter — extracts functions, classes, methods as chunks, not dumb line splits (see the sketch after this list)
  2. Generates embeddings via OpenRouter (default: voyage-3.5-lite, best value)
  3. Stores locally in LanceDB — everything stays in .claudemem/ in your project
  4. Hybrid search — BM25 for exact matches + vector similarity for semantic. Combines both.
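
A stripped-down version of step 1 might look like this, using the tree-sitter node bindings. Node type names vary per grammar, and claudemem's real chunker handles far more cases; this is just a sketch of the idea:

```typescript
import Parser from "tree-sitter";
import TypeScript from "tree-sitter-typescript";

const CHUNK_TYPES = new Set(["function_declaration", "class_declaration", "method_definition"]);

// Walk the syntax tree and emit one chunk per function/class/method node.
function chunk(source: string): { type: string; startLine: number; text: string }[] {
  const parser = new Parser();
  parser.setLanguage(TypeScript.typescript);
  const tree = parser.parse(source);
  const chunks: { type: string; startLine: number; text: string }[] = [];

  const visit = (node: Parser.SyntaxNode) => {
    if (CHUNK_TYPES.has(node.type)) {
      chunks.push({ type: node.type, startLine: node.startPosition.row + 1, text: node.text });
    }
    for (const child of node.namedChildren) visit(child);
  };
  visit(tree.rootNode);
  return chunks;
}
```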

Supported languages

TypeScript, JavaScript, Python, Go, Rust, C, C++, Java.

If your language isn't here, it falls back to line-based chunking. Works, but not as clean.
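
Conceptually, that fallback is just a sliding window over lines. The window size and overlap below are made-up numbers, not claudemem's defaults:

```typescript
// Naive line-window fallback: fixed-size chunks with overlap so context
// isn't lost at chunk boundaries. Sizes are illustrative only.
function lineChunks(source: string, size = 40, overlap = 10): string[] {
  const lines = source.split("\n");
  const chunks: string[] = [];
  for (let start = 0; start < lines.length; start += size - overlap) {
    chunks.push(lines.slice(start, start + size).join("\n"));
    if (start + size >= lines.length) break;
  }
  return chunks;
}
```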

CLI reference

claudemem init              # setup wizard
claudemem index [path]      # index codebase
claudemem search <query>    # search (auto-reindexes changed files)
claudemem status            # what's indexed
claudemem clear             # nuke the index
claudemem models            # list embedding models
claudemem benchmark         # benchmark embedding models
claudemem --mcp             # run as MCP server

Search flags:

-n, --limit <n>       # max results (default: 10)
-l, --language <lang> # filter by language
-y, --yes             # auto-create index without asking
--no-reindex          # skip auto-reindex

Config

Env vars:

  • OPENROUTER_API_KEY — for OpenRouter provider
  • CLAUDEMEM_MODEL — override embedding model

Files:

  • ~/.claudemem/config.json — global config (provider, model, endpoints)
  • .claudemem/ — project index (add to .gitignore)
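
The global config presumably looks something like this. The field names are guesses from the description above, not the actual schema; check the file after `claudemem init` for the real keys:

```typescript
// Hypothetical shape of ~/.claudemem/config.json; field names are assumptions.
interface ClaudememConfig {
  provider: "openrouter" | "ollama" | "custom"; // which embedding backend
  model: string;                                // embedding model id
  endpoint?: string;                            // base URL for ollama/custom servers
}
```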

Limitations

  • First index takes a minute on large codebases
  • Ollama is slower than cloud (runs locally, no batching)
  • Embedding quality depends on the model you pick
  • Not magic — sometimes grep is still faster for exact strings

License

MIT


GitHub · npm · OpenRouter
