Local-first semantic code search for AI coding agents.
v0.5.2 - Now with hybrid search, intelligent re-ranking, and context savings metrics!
Pommel maintains a vector database of your code, enabling fast semantic search without loading files into context. Designed to complement AI coding assistants by providing targeted code discovery.
- Hybrid search - Combines semantic vector search with keyword search (FTS5) using Reciprocal Rank Fusion for best-of-both-worlds results.
- Intelligent re-ranking - Heuristic signals boost results based on name matches, exact phrases, file paths, recency, and code structure.
- Semantic code search - Find code by meaning, not just keywords. Search for "rate limiting logic" and find relevant implementations regardless of naming conventions.
- Always-fresh file watching - Automatic file system monitoring keeps your index synchronized with code changes. No manual reindexing required.
- Multi-level chunks - Search at file, class/module, or method/function granularity for precise results.
- Low latency local embeddings - All processing happens locally via Ollama with Jina Code Embeddings v2 (768-dim vectors).
- Context savings metrics - See how much context window you're saving compared to grep-based approaches with
--metrics. - JSON output for agents - All commands support
--jsonflag for structured output, optimized for AI agent consumption.
macOS / Linux:
curl -fsSL https://raw.githubusercontent.com/dbinky/Pommel/main/scripts/install.sh | bashWindows (PowerShell):
irm https://raw.githubusercontent.com/dbinky/Pommel/main/scripts/install.ps1 | iexThis will:
- Download pre-built binaries (or build from source on Unix)
- Install
pmandpommeldto your PATH - Install Ollama if not present
- Pull the embedding model (~300MB)
Ollama is required for generating embeddings. The install scripts handle this automatically, but you can install manually:
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh# Windows (winget)
winget install Ollama.OllamaDownload binaries from releases:
| Platform | Architecture | CLI | Daemon |
|---|---|---|---|
| macOS | Intel | pm-darwin-amd64 | pommeld-darwin-amd64 |
| macOS | Apple Silicon | pm-darwin-arm64 | pommeld-darwin-arm64 |
| Linux | x64 | pm-linux-amd64 | pommeld-linux-amd64 |
| Windows | x64 | pm-windows-amd64.exe | pommeld-windows-amd64.exe |
Then pull the embedding model:
ollama pull unclemusclez/jina-embeddings-v2-base-code# Clone and build
git clone https://github.com/dbinky/Pommel.git
cd Pommel
make build
# Install to PATH (Unix)
cp bin/pm bin/pommeld ~/.local/bin/# Navigate to your project
cd your-project
# Initialize Pommel
pm init
# Start the daemon (begins indexing automatically)
pm start
# Search for code semantically
pm search "user authentication"
# Check indexing status
pm statusInitialize Pommel in the current directory. Creates .pommel/ directory with configuration files.
pm init # Initialize with defaults
pm init --auto # Auto-detect languages and configure
pm init --claude # Also add usage instructions to CLAUDE.md
pm init --start # Initialize and start daemon immediatelyControl the Pommel daemon for the current project.
pm start # Start daemon in background
pm start --foreground # Start in foreground (for debugging)
pm stop # Stop the running daemonHybrid search across the codebase. Combines semantic vector search with keyword matching, then re-ranks results using code-aware heuristics.
# Basic search
pm search "authentication middleware"
# Limit results
pm search "database connection" --limit 20
# Filter by chunk level
pm search "error handling" --level method
pm search "service classes" --level class
# Filter by path
pm search "api handler" --path src/api/
# JSON output (for agents)
pm search "user validation" --json --limit 5
# Verbose output with match reasons and score breakdown
pm search "rate limiting" --verbose
# Show context savings metrics
pm search "database queries" --metrics
# Disable hybrid search (vector-only)
pm search "config parsing" --no-hybrid
# Disable re-ranking stage
pm search "utility functions" --no-rerankOptions:
| Flag | Short | Description |
|---|---|---|
--limit |
-n |
Maximum number of results (default: 10) |
--level |
-l |
Chunk level filter: file, class, method |
--path |
-p |
Path prefix filter |
--json |
-j |
Output as JSON (agent-friendly) |
--verbose |
-v |
Show detailed match reasons and score breakdown |
--metrics |
Show context savings vs grep baseline | |
--no-hybrid |
Disable hybrid search (vector-only mode) | |
--no-rerank |
Disable re-ranking stage |
Example JSON Output:
{
"query": "user authentication",
"results": [
{
"id": "chunk-abc123",
"file": "src/auth/middleware.py",
"start_line": 15,
"end_line": 45,
"level": "class",
"language": "python",
"name": "AuthMiddleware",
"score": 0.89,
"content": "class AuthMiddleware:\n ...",
"match_source": "both",
"match_reasons": ["semantic similarity", "keyword match via BM25", "contains 'auth' in name"],
"score_details": {
"vector_score": 0.85,
"keyword_score": 0.72,
"rrf_score": 0.89
},
"parent": {
"id": "chunk-parent123",
"name": "auth.middleware",
"level": "file"
}
}
],
"total_results": 1,
"search_time_ms": 42,
"hybrid_enabled": true,
"rerank_enabled": true
}Show daemon status and indexing statistics.
pm status # Human-readable output
pm status --json # JSON outputExample Output:
{
"daemon": {
"running": true,
"pid": 12345,
"uptime_seconds": 3600
},
"index": {
"total_files": 342,
"total_chunks": 4521,
"last_indexed": "2025-01-15T10:30:00Z",
"pending_changes": 0
},
"health": {
"status": "healthy",
"embedding_model": "loaded",
"database": "connected"
}
}Force a full re-index of the project. Useful after major refactors or if the index becomes corrupted.
pm reindex # Reindex all files
pm reindex --path src/ # Reindex specific path onlyView or modify project configuration.
pm config # Show current configuration
pm config get embedding.ollama_url # Get specific setting
pm config set watcher.debounce_ms 1000 # Update setting
pm config set daemon.port 7421 # Change daemon port
pm config set search.default_levels method,class,file # Set search levelsConfiguration is stored in .pommel/config.yaml:
version: 1
# Chunk levels to generate
chunk_levels:
- method
- class
- file
# File patterns to include
include_patterns:
- "**/*.cs"
- "**/*.py"
- "**/*.js"
- "**/*.ts"
- "**/*.jsx"
- "**/*.tsx"
# File patterns to exclude
exclude_patterns:
- "**/node_modules/**"
- "**/bin/**"
- "**/obj/**"
- "**/__pycache__/**"
- "**/.git/**"
- "**/.pommel/**"
# File watcher settings
watcher:
debounce_ms: 500 # Debounce delay for file changes
max_file_size: 1048576 # Skip files larger than this (bytes)
# Daemon settings
daemon:
host: "127.0.0.1"
port: 7420
log_level: "info"
# Embedding settings
embedding:
model: "unclemusclez/jina-embeddings-v2-base-code"
ollama_url: "http://localhost:11434"
batch_size: 32
cache_size: 1000
# Search defaults
search:
default_limit: 10
default_levels:
- method
- class
# Hybrid search settings (v0.5.0+)
hybrid_search:
enabled: true # Enable hybrid vector + keyword search
rrf_k: 60 # RRF constant (higher = more weight to lower ranks)
vector_weight: 1.0 # Weight for vector search results
keyword_weight: 1.0 # Weight for keyword search results
# Re-ranker settings (v0.5.0+)
reranker:
enabled: true # Enable heuristic re-ranking
model: "heuristic" # Re-ranking model (currently only "heuristic")
timeout_ms: 100 # Timeout for re-ranking
candidates: 50 # Number of candidates to re-rankCreate .pommelignore in your project root using gitignore syntax:
# Dependencies
node_modules/
vendor/
.venv/
packages/
# Build outputs
dist/
build/
bin/
obj/
*.min.js
*.min.css
# Generated files
*.generated.cs
*.g.cs
__pycache__/
# IDE and editor files
.idea/
.vscode/
*.swp
# Test fixtures
**/testdata/
**/fixtures/Pommel also respects your existing .gitignore by default.
Pommel is designed specifically for AI coding agents. It provides ~18x token savings compared to traditional exploration.
Use pm search FIRST for:
- Finding specific implementations ("where is X implemented")
- Quick code lookups when you know what you're looking for
- Iterative exploration (multiple related searches)
- Cost/time-sensitive tasks
Fall back to Explorer/Grep when:
- Verifying something does NOT exist (Pommel may return false positives)
- Understanding architecture or code flow relationships
- Need full context around matches (not just snippets)
- Searching for exact string literals
Decision rule: Start with pm search. If results seem off-topic or you need to confirm absence, use Explorer.
| Use Case | Recommended Tool |
|---|---|
| Quick code lookup | Pommel |
| Understanding architecture | Explorer |
| Finding specific implementations | Pommel |
| Verifying if feature exists | Explorer |
| Iterative exploration | Pommel |
| Comprehensive documentation | Explorer |
| Cost-sensitive workflows | Pommel (18x fewer tokens) |
| Time-sensitive tasks | Pommel (1000x+ faster) |
Run pm init --claude to automatically add Pommel instructions to your project's CLAUDE.md. Or add manually:
## Code Search
This project uses Pommel for semantic code search.
\`\`\`bash
# Find code related to a concept
pm search "rate limiting logic" --json --limit 5
# Find implementations of a pattern
pm search "retry with exponential backoff" --level method --json
# Search within a specific area
pm search "validation" --path "src/Api/" --json
\`\`\`
**Tip:** Low scores (< 0.5) suggest weak matches - use Explorer to confirm.Without Pommel:
Agent needs to understand authentication...
-> Reads src/Auth/ (15 files, 2000 lines)
-> Reads src/Middleware/ (8 files)
-> Reads src/Services/ (12 files)
-> Context window significantly consumed
With Pommel:
Agent needs to understand authentication...
-> pm search "authentication flow" --json --limit 5
-> Receives 5 targeted results with file:line references
-> Reads only the 3 most relevant sections
-> Context window minimally impacted
| Language | Extensions | Chunk Levels |
|---|---|---|
| C# | .cs |
file, class/struct/interface, method/property |
| Dart | .dart |
file, class/mixin, method/function |
| Elixir | .ex, .exs |
file, module, function |
| Go | .go |
file, struct/interface, function/method |
| Java | .java |
file, class/interface/enum, method |
| JavaScript | .js, .jsx, .mjs, .cjs |
file, class, function |
| Kotlin | .kt, .kts |
file, class/object, function |
| PHP | .php |
file, class/trait, method/function |
| Python | .py, .pyi |
file, class, method/function |
| Rust | .rs |
file, struct/enum/trait/impl, function |
| Solidity | .sol |
file, contract/interface, function |
| Swift | .swift |
file, class/struct/protocol, function |
| TypeScript | .ts, .tsx, .mts, .cts |
file, class/interface, function |
Other file types are indexed at file-level only (fallback chunking).
- Pommel runs natively on Windows (no WSL required)
- PowerShell 5.1+ required for install script
- Ollama installed via winget (Windows 10 1709+)
- Data stored in project
.pommel/directory - Daemon runs as background process (not Windows Service)
- See Windows Troubleshooting for Windows-specific issues
- Standard Unix process management
- Install script uses curl + bash
- Daemon managed via PID files
Symptom: Error message about Ollama connection failure.
Solution:
# Check if Ollama is running
ollama list
# Start Ollama if needed
ollama serveSymptom: Error about embedding model not being installed.
Solution:
# Install the embedding model
ollama pull unclemusclez/jina-embeddings-v2-base-code
# Verify it's installed
ollama listSymptom: pm start fails or daemon exits immediately.
Solution:
# Check for existing daemon process
pm status
# Check daemon logs
cat .pommel/daemon.log
# Try running in foreground to see errors
pm start --foregroundSymptom: First-time indexing takes very long.
Explanation: Initial indexing requires generating embeddings for all files, which can take 2-5 minutes for ~1000 files. Subsequent updates are incremental and much faster (< 500ms per file).
Tips:
- Add large generated directories to
.pommelignore - Exclude test fixtures and vendor dependencies
- Run initial indexing when you can let it complete
Symptom: Searches return empty results even for known code.
Solution:
# Check if daemon is running and indexing complete
pm status --json
# Wait for pending_changes to reach 0
# Then retry your search
# If needed, force a reindex
pm reindexSymptom: Errors about database or corrupted index.
Solution:
# Stop the daemon
pm stop
# Remove the database (will be rebuilt)
rm .pommel/index.db
# Restart and reindex
pm start
pm reindexPommel supports running across multiple unrelated repositories simultaneously. Each repository gets its own daemon instance with an automatically-assigned port.
# Initialize Pommel in each repository
cd ~/repos/project-a
pm init --auto --start
cd ~/repos/project-b
pm init --auto --start
cd ~/repos/project-c
pm init --auto --startEach project's daemon runs independently on its own port (calculated from a hash of the project path). The CLI automatically connects to the correct daemon based on your current directory.
# Check status of current project's daemon
pm status
# Each project has independent state
cd ~/repos/project-a && pm status # Shows project-a's index
cd ~/repos/project-b && pm status # Shows project-b's index- Port assignment: Each project gets a unique port based on a hash of its absolute path
- Independent indexes: Each
.pommel/directory contains its ownindex.db - No conflicts: Daemons don't interfere with each other
- Auto-discovery: The CLI finds the daemon by reading
.pommel/pommel.pid
For monorepos with multiple sub-projects (detected via markers like package.json, go.mod, etc.):
# Initialize with monorepo detection
cd ~/repos/monorepo
pm init --auto --monorepo --start
# Search defaults to current sub-project
cd ~/repos/monorepo/packages/frontend
pm search "component state"
# Search across entire monorepo
pm search "shared utilities" --all
# List detected sub-projects
pm subprojects| Operation | Typical Time |
|---|---|
| Initial indexing (1000 files) | 2-5 minutes |
| File change re-index | < 500ms |
| Search query (10 results) | < 100ms |
| Daemon memory usage | ~50-100 MB |
AI Agent / Developer
|
v
Pommel CLI (pm) ------> search, status, config
|
v
Pommel Daemon (pommeld)
├── File watcher (debounced)
├── Multi-level chunker (Tree-sitter)
├── Embedding generator (Ollama)
└── Search Pipeline:
├── Vector search (sqlite-vec)
├── Keyword search (FTS5)
├── RRF merge (k=60)
└── Heuristic re-ranker
|
v
SQLite Database
├── sqlite-vec (vector embeddings)
└── FTS5 (full-text index)
^
|
Jina Code Embeddings v2
(768-dim, via Ollama)
Pommel uses a multi-stage search pipeline for optimal result quality:
- Vector Search: Finds semantically similar code using embedding similarity
- Keyword Search: Finds exact keyword matches using SQLite FTS5 with BM25 scoring
- Results are merged using Reciprocal Rank Fusion (RRF) with k=60
Heuristic signals boost results based on:
- Name match: Query terms appearing in function/class names
- Exact phrase: Complete query phrase found in content
- Path match: Query terms in file path
- Recency: Recently modified files get a small boost
- Test penalty: Test files ranked slightly lower (configurable)
Each result includes:
match_source: Whether it matched via "vector", "keyword", or "both"match_reasons: Human-readable explanations of why it matchedscore_details: Breakdown of vector, keyword, and RRF scores
Pommel includes a dogfooding script that tests the system on its own codebase. This validates search quality and performance with real Go code.
# Run dogfooding tests (requires Ollama running)
./scripts/dogfood.sh
# Output results as JSON
./scripts/dogfood.sh --json
# Save results to file
./scripts/dogfood.sh --results-file results.json --json
# Keep .pommel directory after run (for debugging)
./scripts/dogfood.sh --skip-cleanupPrerequisites:
- Ollama running locally (
ollama serve) - Embedding model installed (
ollama pull unclemusclez/jina-embeddings-v2-base-code)
Exit Codes:
| Code | Meaning |
|---|---|
| 0 | All tests passed |
| 1 | Build failed |
| 2 | Ollama not available (skipped gracefully) |
| 3 | Daemon failed to start |
| 4 | Search tests failed |
The script cleans up the .pommel directory after each run unless --skip-cleanup is specified. Results are documented in docs/dogfood-results.md.
Real-world comparison of Pommel semantic search vs traditional code exploration (grep/glob/file reading). Tested on Pommel's own codebase (136 files, 2,111 chunks).
- 10 search queries: 7 expected to find matches, 3 expected to find nothing
- Pommel: Single
pm searchcall per query - Explorer Agent: Autonomous agent using grep, glob, and file reads to find relevant code
| Query | Pommel Tokens | Explorer Tokens | Savings |
|---|---|---|---|
| Expected Matches | |||
| hybrid search implementation | 157 | ~18,000 | 99.1% |
| file watcher debouncing | 1,471 | ~8,500 | 82.7% |
| tree-sitter code chunking | 227 | ~15,000 | 98.5% |
| vector similarity search | 789 | ~12,000 | 93.4% |
| CLI command parsing | 275 | ~14,000 | 98.0% |
| embedding generation with ollama | 545 | ~10,000 | 94.6% |
| database schema migrations | 169 | ~11,000 | 98.5% |
| Expected Non-Matches | |||
| credit card payment processing | 342 | ~6,500 | 94.7% |
| stripe integration webhooks | 342 | ~5,000 | 93.2% |
| kubernetes deployment config | 57 | ~7,000 | 99.2% |
┌─────────────────────────────────────────────────────────────┐
│ POMMEL (10 searches) EXPLORER AGENTS (10 searches)│
│ ═══════════════════ ═════════════════════════════│
│ Total tokens: 4,374 Total tokens: ~107,000 │
│ Avg per search: 437 Avg per search: ~10,700 │
│ Avg time: ~14ms Avg time: ~30-60s │
├─────────────────────────────────────────────────────────────┤
│ 💰 TOKENS SAVED: 102,626 (95.9% reduction) │
│ ⚡ SPEED: ~2000-4000x faster │
│ 📊 EFFICIENCY: 24x fewer tokens per search │
└─────────────────────────────────────────────────────────────┘
Even for non-matches (code that doesn't exist), explorer agents consume 5,000-7,000 tokens just to exhaustively search and conclude "nothing found." Pommel returns a low-confidence result in <100 tokens with a score of 0.40-0.49, allowing agents to quickly recognize weak matches and move on.
MIT