An opinionated, production-shaped MLOps starter: a Dockerized FastAPI inference service (sklearn) with CI and a Cloud Run deploy workflow.
- Show end-to-end “ship a model as a service” basics (validation, readiness, logging, Docker, CI/CD) without heavyweight infra.
- Be small enough to understand in one sitting, but structured enough to look like real work in a 2026 MLOps interview.
- API: FastAPI (`src/app.py`) with `/health` (readiness-gated) and `/predict`.
- Model lifecycle: trains an Iris `StandardScaler` + `LogisticRegression` pipeline on startup and stores it on `app.state.model` (`src/model.py`).
- Contracts: request/response schemas and validation behavior are test-backed (`tests/test_app.py`).
- Observability: JSON logs + request-id propagation middleware (`src/logging_util.py`).
- Packaging: `uv` + `pyproject.toml`/`uv.lock`; common tasks via the `Makefile`.
- Container: the `Dockerfile` runs `uvicorn` and honors Cloud Run's `PORT`.
- Docker
- uv (modern Python package manager)
- Install uv (once, if not installed):

  ```sh
  curl -LsSf https://astral.sh/uv/install.sh | sh
  ```

- Set up the project (automatically handles Python version + dependencies):

  ```sh
  make setup
  ```

- Run locally:

  ```sh
  make dev
  ```

- Or start the full stack with Docker:

  ```sh
  # Optional: copy and edit `.env` if you want to override defaults
  cp .env.example .env
  make up
  ```

- Test the API:

  ```sh
  curl -X POST "http://localhost:8080/predict" \
    -H "Content-Type: application/json" \
    -d '{"values": [5.1, 3.5, 1.4, 0.2]}'
  ```
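The same call can be made from Python with only the stdlib. This sketch assumes the service is running locally on port 8080 (via `make dev` or `make up`); the endpoint and payload shape match the curl example:

```python
import json
import urllib.request


def build_predict_request(
    values: list[float], base_url: str = "http://localhost:8080"
) -> urllib.request.Request:
    """Build the POST /predict request (same JSON payload as the curl example)."""
    return urllib.request.Request(
        f"{base_url}/predict",
        data=json.dumps({"values": values}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(values: list[float]) -> dict:
    """Send the request; requires the service to be up locally."""
    with urllib.request.urlopen(build_predict_request(values)) as resp:
        return json.loads(resp.read())
```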
- `make setup` – install project + dev deps with uv
- `make dev` – run the FastAPI app with reload
- `make test` – run pytest
- `make lint` / `make fmt` – ruff check / format
- `make typecheck` – mypy
- `make up` / `make down` – docker compose up / down
- `make logs` – tail the API logs
`.env` is optional. Start from `.env.example` (or `env.example`) and override as needed.
- `api`: builds from the `Dockerfile` and serves HTTP on container port `8080` (mapped to host `${APP_PORT:-8080}`).
- `db`: optional Postgres 16 placeholder for future examples (the current demo app does not use a database).
- Structured logs are NDJSON on stdout (Datadog-friendly), tagged with `service`, `env`, `version`, and `request_id`.
- An incoming `X-Request-ID` is honored (or one is generated); responses echo it, and each request log includes path/method/status/`duration_ms`.
- Control verbosity with `LOG_LEVEL` (default `INFO`); 4xx responses log at WARN, 5xx at ERROR with stack traces; no raw payloads are logged by default.
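The actual middleware lives in `src/logging_util.py`; the core idea can be sketched with the stdlib alone. Field names here mirror the list above, but the service tag and helper names are illustrative:

```python
import json
import logging
import os
import uuid


class NdjsonFormatter(logging.Formatter):
    """Emit one JSON object per line (NDJSON), suitable for stdout log shipping."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "service": "mlops-starter",  # illustrative service tag
            "env": os.environ.get("ENV", "dev"),
            "request_id": getattr(record, "request_id", None),
            "msg": record.getMessage(),
        }
        return json.dumps(payload)


def request_logger() -> logging.LoggerAdapter:
    """Attach a per-request id so every log line can be correlated."""
    logger = logging.getLogger("api")
    handler = logging.StreamHandler()
    handler.setFormatter(NdjsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(os.environ.get("LOG_LEVEL", "INFO"))
    return logging.LoggerAdapter(logger, {"request_id": str(uuid.uuid4())})
```

In the real app, the request id comes from the incoming `X-Request-ID` header (or a freshly generated UUID) rather than being minted per logger.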
This repo ships with a GitHub Actions workflow (`.github/workflows/deploy-cloudrun.yml`) that builds a container (Cloud Build), pushes it to Artifact Registry, and deploys to Cloud Run after CI succeeds on `main`.
Required GitHub secrets:
- `GCP_PROJECT_ID`
- `GCP_REGION` (example: `us-central1`)
- `GCP_ARTIFACT_REPO` (Artifact Registry Docker repo name)
- `CLOUD_RUN_SERVICE` (Cloud Run service name)
- `GCP_WIF_PROVIDER` and `GCP_WIF_SERVICE_ACCOUNT` (recommended: Workload Identity Federation, no long-lived keys)
- KISS: keep modules small and focused; avoid magic config when defaults suffice.
- Clean code: prefer clear names, short functions, and single-purpose modules.
- DDD-lite: treat `src/` as the domain boundary; keep infra concerns (I/O, Docker, CI) at the edges.
- TDD: add/adjust tests in `tests/` alongside behavior changes; keep tests fast and deterministic.
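In that spirit, a contract-style test might pin down the request schema's validation behavior. The schema name and feature arity here are hypothetical stand-ins, assuming Pydantic v2; the real contracts live in `tests/test_app.py`:

```python
import pytest
from pydantic import BaseModel, Field, ValidationError


class PredictRequest(BaseModel):
    """Hypothetical stand-in for the service's request schema."""

    values: list[float] = Field(min_length=4, max_length=4)


def test_rejects_wrong_feature_count() -> None:
    # Iris expects exactly four features; fewer must fail validation.
    with pytest.raises(ValidationError):
        PredictRequest(values=[5.1, 3.5])


def test_accepts_four_features() -> None:
    req = PredictRequest(values=[5.1, 3.5, 1.4, 0.2])
    assert req.values == [5.1, 3.5, 1.4, 0.2]
```

Keeping validation in the schema keeps these tests fast and deterministic: no model, no network, no fixtures.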
`.github/workflows/test.yml` runs: `uv sync --frozen --group dev` → `uv run pytest -q` → `uv run ruff check .` → `uv run mypy .`.
- Health/readiness: `GET /health` returns `200 {"status":"ok"}` only when the model is loaded; otherwise `503`.
- Logs: `make logs` (Docker) or the Cloud Run Logs Explorer; correlate requests via `X-Request-ID`.
- Why Cloud Run: minimal ops surface area, fast iteration, and “production enough” for an inference microservice.
- Why not Kubernetes: this repo is intentionally small; adding GKE/Helm would add complexity without a demonstrated requirement.