Local semantic code search for Claude Code. Index your codebase once, search it with natural language.
# npm
npm install -g claude-codemem
# homebrew (macOS)
brew tap MadAppGang/claude-mem && brew install --cask claudemem
# or just curl it
curl -fsSL https://raw.githubusercontent.com/MadAppGang/claudemem/main/install.sh | bash

Claude Code's built-in search (grep/glob) works fine for exact matches. But when you're trying to find "where do we handle auth tokens" or "error retry logic" — good luck.
claudemem fixes that. It chunks your code using tree-sitter (so it actually understands functions/classes, not just lines), generates embeddings via OpenRouter, and stores everything locally in LanceDB.
The search combines keyword matching with vector similarity. Works surprisingly well for finding stuff you kinda-sorta remember but can't grep for.
# first time setup
claudemem init
# index your project
claudemem index
# search
claudemem search "authentication flow"
claudemem search "where do we validate user input"That's it. Changed some files? Just search again — it auto-reindexes modified files before searching.
Run your own benchmark with `claudemem benchmark`. Here are results on real code search tasks:
| Model | Speed | NDCG | Cost | Notes |
|---|---|---|---|---|
| voyage-code-3 | 4.5s | 175% | $0.007 | Best quality |
| gemini-embedding-001 | 2.9s | 170% | $0.007 | Great free option |
| voyage-3-large | 1.8s | 164% | $0.007 | Fast & accurate |
| voyage-3.5-lite | 1.2s | 163% | $0.001 | Best value |
| voyage-3.5 | 1.2s | 150% | $0.002 | Fastest |
| mistral-embed | 16.6s | 150% | $0.006 | Slow |
| text-embedding-3-small | 3.0s | 141% | $0.001 | Decent |
| text-embedding-3-large | 3.1s | 141% | $0.005 | Not worth it |
| all-minilm-l6-v2 | 2.7s | 128% | $0.0001 | Cheapest (local) |
Summary:
- 🏆 Best Quality: voyage-code-3 (175% NDCG)
- ⚡ Fastest: voyage-3.5 (1.2s)
- 💰 Cheapest: all-minilm-l6-v2 (local, free)
claudemem supports three embedding providers:
claudemem init # select "OpenRouter"
# requires API key from https://openrouter.ai/keys
# ~$0.01 per 1M tokens

# install Ollama first: https://ollama.ai
ollama pull nomic-embed-text
claudemem init # select "Ollama"Recommended Ollama models:
- `nomic-embed-text` — best quality, 768d, 274MB
- `mxbai-embed-large` — large context, 1024d, 670MB
- `all-minilm` — fastest, 384d, 46MB
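To try a different local model, the flow should look roughly like this; note that using `CLAUDEMEM_MODEL` to switch models for a single command is an assumption based on the env var documented below (re-running `claudemem init` and picking the model there works too):

```bash
# pull another local embedding model
ollama pull mxbai-embed-large

# embeddings from different models aren't comparable, so rebuild the index
claudemem clear
CLAUDEMEM_MODEL=mxbai-embed-large claudemem index
```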
claudemem init # select "Custom endpoint"
# expects OpenAI-compatible /embeddings endpoint
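If you're unsure whether your endpoint qualifies, it just needs to accept the standard OpenAI-style embeddings request. The URL, path, and model name below are placeholders for your own setup:

```bash
curl -s http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "your-embedding-model", "input": ["where do we validate user input"]}'

# expected response shape:
# {"data": [{"index": 0, "embedding": [0.012, -0.034, ...]}], "model": "...", "usage": {...}}
```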
View available models:

claudemem --models # OpenRouter models
claudemem --models --ollama # Ollama models

Run it as an MCP server:
claudemem --mcp

Then Claude Code can use these tools:
- `search_code` — semantic search (auto-indexes changes)
- `index_codebase` — manual full reindex
- `get_status` — check what's indexed
- `clear_index` — start fresh
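Registering it with Claude Code is the usual one-liner; the exact `claude mcp add` syntax can vary between Claude Code versions, so treat this as a sketch:

```bash
# register claudemem as a stdio MCP server for Claude Code
claude mcp add claudemem -- claudemem --mcp
```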
This repo also contains an experimental VS Code inline completion extension that talks to a persistent claudemem autocomplete server.
- Autocomplete server: `claudemem --autocomplete-server --project .`
- VS Code extension source: `extensions/vscode-claudemem-autocomplete/`
- Parses code with tree-sitter — extracts functions, classes, methods as chunks (not dumb line splits)
- Generates embeddings via OpenRouter (default: voyage-3.5-lite, best value)
- Stores locally in LanceDB — everything stays in `.claudemem/` in your project
- Hybrid search — BM25 for exact matches + vector similarity for semantic. Combines both (see the example below).
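As an illustration of what the hybrid ranking buys you (the identifier below is made up), an exact-symbol query and a fuzzy description of the same code should both surface it:

```bash
# keyword-heavy query: BM25 term overlap does most of the work
claudemem search "refreshAccessToken"

# natural-language query: vector similarity does most of the work
claudemem search "where do we renew expired auth tokens"
```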
Supported languages: TypeScript, JavaScript, Python, Go, Rust, C, C++, Java.
If your language isn't here, it falls back to line-based chunking. Works, but not as clean.
claudemem init # setup wizard
claudemem index [path] # index codebase
claudemem search <query> # search (auto-reindexes changed files)
claudemem status # what's indexed
claudemem clear # nuke the index
claudemem models # list embedding models
claudemem benchmark # benchmark embedding models
claudemem --mcp # run as MCP server
Search flags:
-n, --limit <n> # max results (default: 10)
-l, --language <lang> # filter by language
-y, --yes # auto-create index without asking
--no-reindex # skip auto-reindex
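Combining the flags looks like you'd expect:

```bash
# top 5 Go results, skip the auto-reindex
claudemem search "retry with exponential backoff" -l go -n 5 --no-reindex
```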
Env vars:
- `OPENROUTER_API_KEY` — for OpenRouter provider
- `CLAUDEMEM_MODEL` — override embedding model
Files:
- `~/.claudemem/config.json` — global config (provider, model, endpoints)
- `.claudemem/` — project index (add to .gitignore)
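A typical non-interactive setup with OpenRouter might look like this (the key and model name are placeholders; `voyage-3.5-lite` is just the documented default):

```bash
export OPENROUTER_API_KEY="sk-or-..."
export CLAUDEMEM_MODEL="voyage-3.5-lite"   # optional override

# keep the per-project index out of version control
echo ".claudemem/" >> .gitignore
```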
- First index takes a minute on large codebases
- Ollama is slower than cloud (runs locally, no batching)
- Embedding quality depends on the model you pick
- Not magic — sometimes grep is still faster for exact strings
MIT
GitHub · npm · OpenRouter

