Intelligent Function/Tool Routing using Fine-tuned FunctionGemma
FuncRoute is a production-ready Python package for intelligent task routing in agentic AI systems. Fine-tune Google's FunctionGemma (270M parameters) to route user queries to the appropriate function or tool with high accuracy and low latency.
Problem: Modern AI agents need to route user queries to the right tool/function among dozens of options. Traditional approaches using massive LLMs are:
- πΈ Expensive ($0.10+ per 1000 queries)
- π Slow (1-3 seconds per query)
- π― Inconsistent (hallucinations, wrong tools)
FuncRoute Solution:
- π° 99% cheaper (fine-tuned 270M model vs GPT-4)
- β‘ 10-100x faster (50-200ms per query)
- π― More accurate (98%+ with proper training)
- π Self-hosted (no API costs, full control)
- β Easy Training: Fine-tune FunctionGemma with your own data in minutes
- β Synthetic Data: Generate 1000s of training samples automatically
- β Anti-Leakage: Pattern group splitting prevents overfitting
- β Data Validation: Automatic format checking and quality validation
- β Efficient Training: LoRA + 4-bit quantization (runs on 8GB GPU)
- β Fast Inference: Batch prediction, streaming, async support
- β Production Ready: REST API, caching (10x speedup), monitoring
- Synthetic Data Generation: Rule-based pattern expansion (like train.py)
- Pattern Group Splitting: Prevents data leakage between train/val/test
- Data Validation: Format checking, leakage detection, quality metrics
- Flexible Input: JSONL, CSV, pandas DataFrame, Hugging Face Datasets
- Memory Efficient: 4-bit quantization + LoRA (8GB GPU sufficient)
- Batch Processing: Parallel predictions with progress tracking
- Streaming: Process results as they arrive
- Async Support: Native asyncio for web frameworks
- Caching: LRU + TTL caching (5-10x speedup)
- REST API: FastAPI server with OpenAPI docs
- CLI: Complete command-line interface
- Metrics: Accuracy, precision, recall, F1 per tool
- Visualization: Confusion matrices, performance charts
- Cross-Validation: K-fold validation support
- Latency Tracking: Per-query timing and statistics
pip install funcroutegit clone https://github.com/yourusername/funcroute.git
cd funcroute
pip install -e .- Python 3.9+
- PyTorch 2.0+
- CUDA GPU (recommended, 8GB+ VRAM)
- CPU supported but 10x slower
from funcroute import FuncRoute, TrainingConfig
from funcroute.core.config import ToolDefinition
from funcroute.data.generator import SyntheticDataGenerator
from funcroute.data.splitter import PatternGroupSplitter
# Step 1: Define your tools
tools = [
ToolDefinition(
name="manage_order",
signature="manage_order(order_id: str) -> dict",
description="Track and manage customer orders, check delivery status",
examples=["Where is my order?", "Track package #12345"],
keywords=["order", "track", "delivery", "shipping"],
),
ToolDefinition(
name="search_products",
signature="search_products(query: str) -> list",
description="Search for products in the catalog",
examples=["Show me red dresses", "Find laptops under $1000"],
keywords=["search", "find", "show", "products"],
),
ToolDefinition(
name="process_return",
signature="process_return(order_id: str, reason: str) -> dict",
description="Process product returns and refunds",
examples=["Return this item", "I want a refund"],
keywords=["return", "refund", "exchange"],
),
]
# Step 2: Generate synthetic training data
generator = SyntheticDataGenerator(method="rule_based")
data = generator.generate(
tools=tools,
num_variations=50, # Creates 50 variations per pattern
num_samples=5000, # Target ~5000 total samples
)
# Step 3: Split with anti-leakage (prevents overfitting)
splitter = PatternGroupSplitter(seed=42)
train_data, val_data, test_data = splitter.split(
data,
train_ratio=0.7,
val_ratio=0.15,
test_ratio=0.15,
verify_no_leakage=True, # Automatic verification
)
# Step 4: Train the model
router = FuncRoute()
router.train(
train_data=train_data,
val_data=val_data,
tools=tools, # CRITICAL: Must provide tool definitions!
config=TrainingConfig(
output_dir="./my_router",
num_epochs=3, # 3 epochs for good accuracy
batch_size=4, # Adjust based on GPU memory
learning_rate=2e-4, # Standard for fine-tuning
eval_strategy="epoch", # Evaluate at end of each epoch
),
)
# Step 5: Save and load
# Model automatically saved to ./my_router with tool_definitions.json
loaded_router = FuncRoute.load("./my_router")
# Step 6: Make predictions
result = loaded_router.route("Where is my package?")
print(f"Tool: {result.tool}") # manage_order
print(f"Confidence: {result.confidence:.1%}") # 98.5%
print(f"Latency: {result.latency_ms:.1f}ms") # 150msfrom funcroute import FuncRoute
# Load from Hugging Face Hub
router = FuncRoute.from_pretrained("scionoftech/functiongemma-e-commerce-tool-calling")
# Route queries
result = router.route("Where is my order?")
print(f"Tool: {result.tool}") # manage_orderfrom funcroute.inference import Predictor, RouteCache
# Load trained model
router = FuncRoute.load("./my_router")
# Add caching for 10x speedup
cache = RouteCache(max_size=1000, ttl_seconds=3600)
predictor = Predictor(router, cache=cache)
# Batch prediction
queries = [
"Where is my order?",
"Show me laptops",
"Return this item",
# ... 100s more
]
results = predictor.predict_batch(queries, max_workers=4, show_progress=True)
# Async prediction (for web apps)
import asyncio
result = await predictor.predict_async("Where is my order?")# Start server
funcroute serve --model ./my_router --port 8000
# Or in Python
python examples/server_example.py# Make requests
curl -X POST http://localhost:8000/route \
-H "Content-Type: application/json" \
-d '{"query": "Where is my order?"}'
# Batch requests
curl -X POST http://localhost:8000/route/batch \
-H "Content-Type: application/json" \
-d '{"queries": ["Where is my order?", "Show me laptops"]}'
# Health check
curl http://localhost:8000/health# Generate synthetic data
funcroute generate \
--tools tools.json \
--output synthetic.jsonl \
--num-samples 5000
# Train model
funcroute train \
--train-data train.jsonl \
--val-data val.jsonl \
--tools tools.json \
--output-dir ./my_router \
--num-epochs 3 \
--batch-size 4
# Train with synthetic data generation
funcroute train \
--tools tools.json \
--output-dir ./my_router \
--generate-data \
--num-samples 5000# Evaluate model
funcroute evaluate \
--model ./my_router \
--test-data test.jsonl \
--output metrics.json
# With visualizations
funcroute evaluate \
--model ./my_router \
--test-data test.jsonl \
--plot \
--output-dir ./eval_results# Single prediction
funcroute predict \
--model ./my_router \
--query "Where is my order?"
# Batch prediction
funcroute predict \
--model ./my_router \
--file queries.txt \
--output results.jsonl
# Interactive mode
funcroute interactive --model ./my_router# Start REST API server
funcroute serve \
--model ./my_router \
--port 8000 \
--cache-size 1000
# With custom host
funcroute serve \
--model ./my_router \
--host 0.0.0.0 \
--port 8000We provide 9 comprehensive examples demonstrating all features:
- simple_example.py - Complete workflow with 5000 samples
- batch_prediction_example.py - 7 batch processing patterns
- streaming_prediction_example.py - 7 streaming patterns
- async_prediction_example.py - 9 async/await patterns
- caching_example.py - 8 caching strategies
- evaluation_example.py - Metrics and cross-validation
- synthetic_data_example.py - Data generation
- server_example.py - REST API deployment
- test_imports.py - Import verification
# Complete workflow (recommended first example)
cd examples
python simple_example.py
# Batch processing (creates model for other examples)
python batch_prediction_example.py
# Then run dependent examples
python streaming_prediction_example.py
python async_prediction_example.py
python caching_example.py
# Standalone examples
python evaluation_example.py
python synthetic_data_example.py
python server_example.pySee examples/README.md for detailed documentation.
funcroute/
βββ funcroute/
β βββ __init__.py # Main exports
β βββ cli.py # CLI interface
β βββ core/
β β βββ config.py # Configurations (TrainingConfig, ToolDefinition, etc.)
β β βββ router.py # FuncRoute main class
β βββ training/
β β βββ trainer.py # Training orchestration
β βββ inference/
β β βββ predictor.py # Batch, streaming, async prediction
β β βββ cache.py # LRU cache with TTL
β β βββ server.py # FastAPI REST server
β βββ evaluation/
β β βββ evaluator.py # Metrics computation
β β βββ metrics.py # Metric functions
β β βββ visualizer.py # Plotting and charts
β βββ data/
β βββ loader.py # Data loading (JSONL, CSV, DataFrame)
β βββ formatter.py # FunctionGemma format conversion
β βββ generator.py # Synthetic data generation
β βββ splitter.py # Pattern group splitting
β βββ validator.py # Data validation
βββ examples/ # 9 comprehensive examples
βββ tests/ # Test suite
βββ docs/ # Documentation
{"query": "Where is my order?", "tool": "manage_order"}
{"query": "Show me red dresses", "tool": "search_products"}
{"query": "Return this item", "tool": "process_return"}from funcroute.core.config import ToolDefinition
tools = [
ToolDefinition(
name="search_products",
signature="search_products(query: str, category: str = None) -> list",
description="Search for products in the catalog by name, category, or attributes",
examples=[
"Show me red dresses",
"Find laptops under $1000",
"Do you have iPhone 15?",
],
keywords=["search", "find", "show", "looking", "browse"],
parameters={
"query": {"type": "string", "description": "Search query"},
"category": {"type": "string", "description": "Product category", "required": False},
},
),
]{
"tools": [
{
"name": "search_products",
"signature": "search_products(query: str) -> list",
"description": "Search for products",
"examples": ["Show me laptops", "Find red shoes"],
"keywords": ["show", "find", "search"],
"parameters": {
"query": {"type": "string", "description": "Search query"}
}
}
]
}Problem: Random splitting can leak pattern variations between train/test, causing inflated accuracy.
Solution: FuncRoute groups similar queries and splits by groups, not individual samples.
from funcroute.data.splitter import PatternGroupSplitter
# Data with pattern groups (from SyntheticDataGenerator)
data = [
{"query": "Where is my order?", "tool": "manage_order", "base_pattern": "order_status_1"},
{"query": "Hi, where is my order?", "tool": "manage_order", "base_pattern": "order_status_1"},
{"query": "Track my package", "tool": "manage_order", "base_pattern": "track_package_1"},
# ...
]
splitter = PatternGroupSplitter(seed=42)
train, val, test = splitter.split(
data,
train_ratio=0.7,
val_ratio=0.15,
test_ratio=0.15,
verify_no_leakage=True, # Raises error if leakage detected
)
# Output:
# Pattern Group Splitting:
# Total groups: 150
# Train groups: 105 (3500 samples)
# Val groups: 22 (750 samples)
# Test groups: 23 (750 samples)
# NO DATA LEAKAGE - Splits are clean!from funcroute.data.validator import DataValidator
validator = DataValidator()
# Validate data quality
report = validator.validate(
train_data,
min_samples_per_tool=100,
warn_duplicates=True,
warn_imbalance=True,
)
if not report['is_valid']:
print("Validation failed:")
for error in report['errors']:
print(f" - {error}")
else:
print(" Data is valid")
# Check for leakage
no_leakage = validator.check_leakage(train_data, test_data)
if not no_leakage:
print(" Data leakage detected!")from funcroute.inference import RouteCache, WarmupCache
# LRU cache with TTL
cache = RouteCache(max_size=1000, ttl_seconds=3600)
predictor = Predictor(router, cache=cache)
# Pre-warm cache with common queries
warmup = WarmupCache(predictor)
common_queries = ["Where is my order?", "Track package", "Return item"]
warmup.warmup(common_queries)
# Cache statistics
stats = cache.get_stats()
print(f"Hit rate: {stats['hit_rate']:.1%}")
print(f"Hits: {stats['hits']}, Misses: {stats['misses']}")from funcroute.evaluation import Evaluator, Visualizer
# Evaluate on test set
evaluator = Evaluator(router, test_data)
metrics = evaluator.evaluate()
print(f"Overall Accuracy: {metrics['overall']['accuracy']:.2%}")
print(f"Per-tool metrics:")
for tool, tool_metrics in metrics['per_tool'].items():
print(f" {tool}: {tool_metrics['f1']:.2%} F1")
# Visualize results
visualizer = Visualizer(evaluator)
visualizer.plot_confusion_matrix(save_path="confusion.png")
visualizer.plot_per_tool_metrics(save_path="per_tool.png")# Clone repository
git clone https://github.com/yourusername/funcroute.git
cd funcroute
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
# Run linting
black funcroute/
flake8 funcroute/- π Documentation improvements
- π Bug fixes and issue reports
- β¨ New examples and use cases
- π§ Performance optimizations
- π Framework integrations (LangChain, etc.)
- π Visualization enhancements
MIT License - see LICENSE for details.
- Google FunctionGemma: Base model (google/functiongemma-270m-it)
Upcoming v0.2:
- π Streaming inference improvements
- π Enhanced visualization dashboard
- π LangChain integration
- π Deployment guides (Docker, K8s)
- π More examples and tutorials
- Issues: GitHub Issues
- Discussions: GitHub Discussions
If you find FuncRoute useful, please consider starring the repository!