Local dev stack for the SEC Visualizer MVP. This repo bootstraps the runtime services (Postgres, Prefect, API, Next.js frontend) and a Python workspace managed by uv.
- Copy the env template:
  ```bash
  cp .env.example .env
  ```
- Install deps and create the local virtualenv:
  ```bash
  uv sync
  ```
- Start the stack:
  ```bash
  docker compose up --build -d
  ```
- Apply migrations (uses `.env` for DB settings):
  ```bash
  uv run alembic upgrade head
  ```
- Verify services:
  - API health: http://localhost:8000/healthz
  - Frontend UI: http://localhost:3000
  - Prefect UI: http://localhost:4200
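If you want to script the verification instead of opening a browser, a minimal sketch (assuming the default ports above):

```bash
# Poll the API health endpoint until the stack is up (assumes the default
# ports from the compose file; adjust if you changed them).
for i in $(seq 1 30); do
  curl -fsS http://localhost:8000/healthz && break
  sleep 2
done
```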
Docs:
- State machines: docs/state_machine.md
- Mapping pipeline walkthrough: docs/mapping_pipeline.md
- Design notes: docs/design.md
The following steps run the backfill flow via Prefect against the Docker-hosted Prefect server.
- Point the CLI + worker at the local Prefect API:
  ```bash
  export PREFECT_API_URL=http://localhost:4200/api
  ```
- Ensure the SEC user agent is set (required for ticker lookup):
  ```bash
  export SEC_USER_AGENT="your@email.com"
  ```
- Start a local process worker (runs the flow code from your repo):
  ```bash
  uv run prefect worker start --pool default --type process --work-queue default
  ```
- Create the deployment (one-time):
  ```bash
  uv run prefect deploy api/app/flows/filings.py:backfill_company_flow -n backfill-company
  ```
- Run the AMZN backfill:
  ```bash
  uv run prefect deployment run "backfill-company-flow/backfill-company" \
    -p cik=null \
    -p ticker="AMZN" \
    -p start_date="2015-01-01" \
    -p submission_types='["10-K","10-Q","8-K"]' \
    -p ingest_simple_xbrl=true \
    -p max_xbrl_accessions=null \
    -p fetch_fundamentals_snapshot=true
  ```
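To confirm the run was picked up, watch it in the Prefect UI at http://localhost:4200, or list recent runs from the CLI (a Prefect 2.x sketch; flag availability may vary by version):

```bash
# List the most recent flow runs registered with the local Prefect server.
uv run prefect flow-run ls --limit 5
```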
- Ensure FMP and SEC env vars are set:
  ```bash
  export FMP_API_KEY="your_key"
  export SEC_USER_AGENT="your@email.com"
  ```
- Run the reconciliation script:
  ```bash
  uv run python scripts/reconcile_income_statement.py --ticker AMZN
  ```
  Use `--no-fetch-fmp` to reuse cached FMP rows from the DB.
- Check reconciliation version gaps. After a new mapping version runs, compare against the previous version:
  ```bash
  psql postgresql://postgres:postgres@localhost:54322/postgres -c \
    "select * from reconciliation_version_checks where ticker='AMZN' and statement='IS' order by period_end_date;"
  ```
  If the query returns no rows, the new mapping version did not drop any metrics.
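For a quick per-statement summary instead of row-level detail, a variation on the same query (assuming the same columns as above):

```bash
# Count dropped metrics per statement for the ticker.
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
  "select statement, count(*) from reconciliation_version_checks where ticker='AMZN' group by statement;"
```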
- Set OpenRouter + LiteLLM proxy env vars:
  ```bash
  export OPENROUTER_API_KEY="your_key"
  export LITELLM_MASTER_KEY="local-dev-key"
  export LITELLM_PROXY_URL="http://localhost:4000/v1"
  export LITELLM_PROXY_KEY="local-dev-key"
  export MAPPING_AGENT_MAX_TOKENS=2048
  export MAPPING_AGENT_TEMPERATURE=0.2
  export MAPPING_AGENT_TIMEOUT=120
  export MAPPING_AGENT_OUTPUT_RETRIES=2
  ```
- Start the proxy (if not already running):
  ```bash
  docker compose up -d litellm-proxy
  ```
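To sanity-check the proxy before running the agent, LiteLLM serves an OpenAI-compatible API, so listing models should work (a sketch assuming the env vars above; `LITELLM_PROXY_URL` already includes the `/v1` prefix):

```bash
# List the models the proxy knows about; entries come from litellm.yaml.
curl -s -H "Authorization: Bearer $LITELLM_PROXY_KEY" "$LITELLM_PROXY_URL/models"
```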
- Run the mapping agent comparison (lightweight, no reconciliation context):
  ```bash
  PYTHONPATH=. uv run python scripts/run_mapping_agent.py \
    --ticker AMZN \
    --models openrouter/anthropic/claude-3.5-sonnet,openrouter/openai/gpt-4o-mini
  ```
  Results are stored in `mapping_agent_batches`, `mapping_agent_runs`, and as draft rows in `reconciliation_mappings` with `is_active=false`.
  Note: model names must exist in `litellm.yaml` (add entries for any new models you want to compare).
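To eyeball the drafts before activating anything, a count of inactive mapping rows is a quick check (only `is_active` is confirmed above; other columns may differ):

```bash
# Draft overrides land as inactive rows; none affect reconciliation until activated.
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
  "select count(*) from reconciliation_mappings where is_active = false;"
```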
This pipeline reconciles XBRL vs FMP, pulls RAG snippets, computes candidate tag values from local XBRL facts, and runs the mapping agent in comparison mode. After each accepted override, it re-runs reconciliation so the model sees fresh mismatches with XBRL/FMP values. It writes draft overrides (`reconciliation_mappings` rows with `is_active=false`) and reconciliation results for both the base mapping and candidate overrides.
See docs/mapping_pipeline.md for a full walkthrough.
- Ensure filings, XBRL, and filing blocks are ingested (see sections above).
- Run the mapping pipeline:
  ```bash
  PYTHONPATH=. uv run python scripts/run_mapping_pipeline.py \
    --ticker AMZN \
    --models openrouter/openai/gpt-oss-120b,openrouter/moonshotai/kimi-k2 \
    --max-accessions 10 \
    --max-rounds 7
  ```
  Optional flags (see the combined example below):
  - `--no-fetch-fmp` to use cached FMP rows
  - `--no-rag` to skip RAG context
  - `--accessions 000101872421000004` to target a specific filing
  - `--max-accessions 10` to cap the years used for mapping
  - `--max-rounds 7` to set the number of recursive improvement rounds
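For example, several of these flags combine into a targeted re-run (a sketch using only the flags documented above):

```bash
# Re-run the pipeline against one filing, reusing cached FMP rows and
# skipping RAG context.
PYTHONPATH=. uv run python scripts/run_mapping_pipeline.py \
  --ticker AMZN \
  --models openrouter/openai/gpt-oss-120b \
  --accessions 000101872421000004 \
  --no-fetch-fmp \
  --no-rag
```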
Use the pipeline output to grab each candidate `mapping_version`, then compare reconciliation status counts by mapping version:

```bash
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
  "select mapping_version, status, count(*) \
   from reconciliation_results \
   where ticker='AMZN' and statement='IS' \
   group by mapping_version, status \
   order by mapping_version, status;"
```

Compare model coverage deltas from the mapping agent runs:

```bash
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
  "select model, (quality_json->>'coverage_delta')::float as coverage_delta, mapping_id, created_at \
   from mapping_agent_runs \
   where batch_id = ( \
     select max(id) from mapping_agent_batches where ticker='AMZN' and statement='IS' \
   ) \
   order by coverage_delta desc nulls last;"
```
- Apply migrations (adds the `vector` extension + `filing_blocks` table):
  ```bash
  uv run alembic upgrade head
  ```
- Ingest filing blocks for a company:
  ```bash
  PYTHONPATH=. uv run python scripts/ingest_filing_blocks.py --ticker AMZN --limit 1
  ```
  Or target a specific accession:
  ```bash
  PYTHONPATH=. uv run python scripts/ingest_filing_blocks.py --accessions 000101872421000004
  ```
- Verify rows:
  ```bash
  psql postgresql://postgres:postgres@localhost:54322/postgres -c \
    "select accession_number, block_type, section_title, length(content_text) as chars from filing_blocks limit 5;"
  ```
Run lint + coverage locally:

```bash
scripts/quality_checks.sh
```

Install git hooks to run checks before push:

```bash
scripts/install_git_hooks.sh
```

Or run the commands directly:

```bash
uv run ruff check .
uv run --extra dev radon cc -n D -s api/app
uv run --extra dev radon mi -n C -s api/app
uv run pytest --cov=api/app/ingest --cov=api/app/reconciliation --cov-report=term-missing
```
- Alembic migrations live in `alembic/` and should be applied before backfills.
- Environment variables are centralized in `.env` for local dev.
- The Prefect worker above runs locally; Docker workers need the repo mounted or an image with the flow code baked in.
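For reference, a minimal local `.env` might look like the sketch below; the variable names are the ones used in the steps above, the values are placeholders, and the real defaults live in `.env.example`:

```bash
# Sketch of a local .env; copy real defaults from .env.example.
SEC_USER_AGENT="your@email.com"              # required for SEC ticker lookup
FMP_API_KEY="your_key"                       # FMP reconciliation
OPENROUTER_API_KEY="your_key"                # upstream LLM provider
LITELLM_MASTER_KEY="local-dev-key"
LITELLM_PROXY_URL="http://localhost:4000/v1"
LITELLM_PROXY_KEY="local-dev-key"
# Database and Prefect settings also live here; see .env.example for the names.
```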