A zero-code, configuration-driven LLM task gateway. Simply submit a YAML configuration file to automatically generate a production-grade API endpoint with input validation, structured output, automatic retry, and multi-model routing capabilities.
## Features

- Zero-Code Configuration: Define tasks in YAML, no coding required
- Dynamic Type Generation: JSON Schema → Pydantic models for validation (see the sketch after this list)
- Structured Output: Enforced output schema using Instructor
- Multi-Model Support: OpenAI, Anthropic, vLLM, Ollama, and more
- Built-in Security: API key authentication, rate limiting, concurrency control
- Observability: Prometheus metrics integration
- Startup Validation: Config validation before the service starts
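To make the dynamic type generation concrete, here is a minimal sketch of how a JSON Schema fragment can be turned into a Pydantic model with `create_model`. This is illustrative only, not the project's actual `src/models.py`, and it ignores constraint keywords such as `minLength` and `enum` for brevity:

```python
from pydantic import create_model

# Map JSON Schema primitive types to Python types (illustrative subset).
TYPE_MAP = {"string": str, "number": float, "integer": int, "boolean": bool}

def model_from_schema(name: str, schema: dict):
    """Build a Pydantic model from a flat JSON Schema object node."""
    required = set(schema.get("required", []))
    fields = {}
    for prop, spec in schema.get("properties", {}).items():
        py_type = TYPE_MAP.get(spec.get("type"), str)
        # Ellipsis marks a field as required; None makes it optional.
        fields[prop] = (py_type, ... if prop in required else None)
    return create_model(name, **fields)

# Example: the input schema of the sentiment task defined below.
SentimentInput = model_from_schema(
    "SentimentInput",
    {
        "type": "object",
        "properties": {"content": {"type": "string"}},
        "required": ["content"],
    },
)

SentimentInput(content="This product is amazing!")  # passes validation
# SentimentInput() would raise a ValidationError: `content` is required.
```

A model generated this way from a task's `output_schema` is the kind of object a library like Instructor accepts as its `response_model` to enforce structured output.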
## Quick Start

```bash
# Clone the repository
git clone <repository-url>
cd llm-task-v3

# Install dependencies using uv
uv sync

# Copy environment variables
cp .env.example .env

# Edit .env with your API keys
nano .env
```

- Configure Models (`config/models.yaml`):
```yaml
model_list:
  - model_name: gpt-4o-mini
    provider: openai
    model: openai/gpt-4o-mini
    api_key: os.environ/OPENAI_API_KEY
```
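The `os.environ/OPENAI_API_KEY` value follows the LiteLLM-style convention of referencing an environment variable rather than embedding the secret in the file. A minimal sketch of how such a reference could be resolved (the helper name is hypothetical; the gateway's actual logic lives in its config loader):

```python
import os

def resolve_secret(value: str) -> str:
    """Resolve `os.environ/VAR` references to the variable's value.

    Hypothetical helper, shown only to illustrate the convention.
    """
    prefix = "os.environ/"
    if value.startswith(prefix):
        return os.environ[value[len(prefix):]]
    return value  # literal keys pass through unchanged

# resolve_secret("os.environ/OPENAI_API_KEY") -> contents of $OPENAI_API_KEY
```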
- Define a Task (`tasks/sentiment_analysis.yaml`):
```yaml
meta:
  id: "sentiment_analysis"
  version: "1.0.0"
  description: "Analyze sentiment of text"

input_schema:
  type: object
  properties:
    content:
      type: string
      minLength: 10
  required: ["content"]

output_schema:
  type: object
  properties:
    sentiment:
      type: string
      enum: ["POSITIVE", "NEGATIVE", "NEUTRAL"]
    confidence:
      type: number
      minimum: 0.0
      maximum: 1.0
  required: ["sentiment", "confidence"]

strategy:
  primary_model: "gpt-4o-mini"
  max_retries: 2
  timeout: 15

prompt:
  system: "You are a sentiment analyst."
  user: "Analyze: {{ content }}"
```

- Start the server:

```bash
uv run uvicorn src.main:app --reload
```

The API will be available at http://localhost:8000.
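A note on the task file above: the `{{ content }}` placeholder indicates template-based prompt rendering. Assuming Jinja2-style templating, which the syntax suggests, fields from the request payload are substituted into the user prompt at request time:

```python
from jinja2 import Template

# Illustrative only -- assumes Jinja2-style templating, which the
# `{{ content }}` syntax in the task file suggests.
task_prompt = "Analyze: {{ content }}"
payload = {"content": "This product is amazing!"}

print(Template(task_prompt).render(**payload))  # Analyze: This product is amazing!
```

The `strategy` fields then bound execution: presumably `max_retries` caps retry attempts on provider or validation failures and `timeout` caps each attempt in seconds; the exact semantics are defined by the executor, not this sketch.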
### Example Request

```bash
curl -X POST "http://localhost:8000/api/v1/run" \
  -H "X-API-Key: your-secret-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "sentiment_analysis",
    "payload": {
      "content": "This product is amazing!"
    }
  }'
```

Response:

```json
{
  "code": 200,
  "status": "success",
  "data": {
    "sentiment": "POSITIVE",
    "confidence": 0.95
  },
  "meta": {
    "task_id": "sentiment_analysis",
    "model_used": "gpt-4o-mini",
    "latency_ms": 850
  }
}
```

## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/api/v1/tasks` | GET | List all tasks |
| `/api/v1/tasks/{task_id}` | GET | Get task details |
| `/api/v1/run` | POST | Execute a task |
| `/metrics` | GET | Prometheus metrics |
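Any HTTP client works; for example, the same request from Python with `httpx` (the API key value is a placeholder, as in the curl example above):

```python
import httpx

response = httpx.post(
    "http://localhost:8000/api/v1/run",
    headers={"X-API-Key": "your-secret-api-key-here"},
    json={
        "task_id": "sentiment_analysis",
        "payload": {"content": "This product is amazing!"},
    },
    timeout=30.0,
)
response.raise_for_status()
print(response.json()["data"])  # {'sentiment': 'POSITIVE', 'confidence': 0.95}
```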
## Testing

```bash
# Run all tests
uv run pytest

# Run unit tests only
uv run pytest tests/unit/ -v

# Run with coverage
uv run pytest --cov=src --cov-report=html
```
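As a starting point for new tests, here is a minimal sketch using FastAPI's `TestClient` against the health endpoint (assuming `src.main:app` as in the run command above, and that `/health` is not behind API-key auth):

```python
from fastapi.testclient import TestClient

from src.main import app

client = TestClient(app)

def test_health():
    # Assumes /health is unauthenticated; adjust if the gateway protects it.
    response = client.get("/health")
    assert response.status_code == 200
```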
## Development

```bash
# Linting
uv run ruff check .

# Format code
uv run ruff format .
```

## Project Structure

```
llm-task-v3/
├── config/
│   └── models.yaml              # Model configurations
├── tasks/
│   └── *.yaml                   # Task definitions
├── src/
│   ├── main.py                  # FastAPI application
│   ├── config.py                # Configuration loading
│   ├── models.py                # Dynamic model builder
│   ├── executor.py              # Task executor
│   ├── gateway.py               # Auth, rate limiting, concurrency
│   ├── metrics.py               # Prometheus metrics
│   └── llm/
│       └── provider_registry.py # LLM client registry
├── tests/
│   ├── unit/                    # Unit tests
│   ├── integration/             # Integration tests
│   └── e2e/                     # End-to-end tests
└── doc/
    ├── prd.md                   # Product requirements
    └── research.md              # Dependency research
```
## License

MIT License