Local dev stack for the SEC Visualizer MVP. This repo bootstraps the runtime services (Postgres, Prefect, API, Next.js frontend) and a Python workspace managed by uv.
- Copy the env template:
  ```bash
  cp .env.example .env
  ```
- Install deps and create the local virtualenv:
  ```bash
  uv sync
  ```
- Start the stack:
  ```bash
  docker compose up --build -d
  ```
- Apply migrations (uses `.env` for DB settings):
  ```bash
  uv run alembic upgrade head
  ```
- Verify services:
  - API health: http://localhost:8000/healthz
  - Frontend UI: http://localhost:3000
  - Prefect UI: http://localhost:4200
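If you want to script the verification instead of opening a browser, a minimal sketch (assuming the default ports above):

```bash
# Poll the API health endpoint until the stack is up (assumes the default
# ports from the compose file; adjust if you changed them).
for i in $(seq 1 30); do
  curl -fsS http://localhost:8000/healthz && break
  sleep 2
done
```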
Docs:
- State machines: docs/state_machine.md
- Mapping pipeline walkthrough: docs/mapping_pipeline.md
- Design notes: docs/design.md
The following steps run the backfill flow via Prefect against the Docker-hosted Prefect server.
- Point the CLI + worker at the local Prefect API:
  ```bash
  export PREFECT_API_URL=http://localhost:4200/api
  ```
- Ensure the SEC user agent is set (required for ticker lookup):
  ```bash
  export SEC_USER_AGENT="your@email.com"
  ```
- Start a local process worker (runs the flow code from your repo):
  ```bash
  uv run prefect worker start --pool default --type process --work-queue default
  ```
- Create the deployment (one-time):
  ```bash
  uv run prefect deploy api/app/flows/filings.py:backfill_company_flow -n backfill-company
  ```
- Run the AMZN backfill:
  ```bash
  uv run prefect deployment run "backfill-company-flow/backfill-company" \
    -p cik=null \
    -p ticker="AMZN" \
    -p start_date="2015-01-01" \
    -p submission_types='["10-K","10-Q","8-K"]' \
    -p ingest_simple_xbrl=true \
    -p max_xbrl_accessions=null \
    -p fetch_fundamentals_snapshot=true
  ```
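To confirm the run was picked up, watch it in the Prefect UI at http://localhost:4200, or list recent runs from the CLI (a Prefect 2.x sketch; flag availability may vary by version):

```bash
# List the most recent flow runs registered with the local Prefect server.
uv run prefect flow-run ls --limit 5
```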
- Ensure FMP and SEC env vars are set:
  ```bash
  export FMP_API_KEY="your_key"
  export SEC_USER_AGENT="your@email.com"
  ```
- Run the reconciliation script:
  ```bash
  uv run python scripts/reconcile_income_statement.py --ticker AMZN
  ```
  Use `--no-fetch-fmp` to reuse cached FMP rows from the DB.
- Check reconciliation version gaps. After a new mapping version runs, compare against the previous version:
  ```bash
  psql postgresql://postgres:postgres@localhost:54322/postgres -c \
    "select * from reconciliation_version_checks where ticker='AMZN' and statement='IS' order by period_end_date;"
  ```
  If the query returns no rows, the new mapping version did not drop any metrics.
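For a quick per-statement summary instead of row-level detail, a variation on the same query (assuming the same columns as above):

```bash
# Count dropped metrics per statement for the ticker.
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
  "select statement, count(*) from reconciliation_version_checks where ticker='AMZN' group by statement;"
```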
- Set OpenRouter + LiteLLM proxy env vars:
  ```bash
  export OPENROUTER_API_KEY="your_key"
  export LITELLM_MASTER_KEY="local-dev-key"
  export LITELLM_PROXY_URL="http://localhost:4000/v1"
  export LITELLM_PROXY_KEY="local-dev-key"
  export MAPPING_AGENT_MAX_TOKENS=2048
  export MAPPING_AGENT_TEMPERATURE=0.2
  export MAPPING_AGENT_TIMEOUT=120
  export MAPPING_AGENT_OUTPUT_RETRIES=2
  ```
- Start the proxy (if not already running):
  ```bash
  docker compose up -d litellm-proxy
  ```
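To sanity-check the proxy before running the agent, LiteLLM serves an OpenAI-compatible API, so listing models should work (a sketch assuming the env vars above; `LITELLM_PROXY_URL` already includes the `/v1` prefix):

```bash
# List the models the proxy knows about; entries come from litellm.yaml.
curl -s -H "Authorization: Bearer $LITELLM_PROXY_KEY" "$LITELLM_PROXY_URL/models"
```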
- Run the mapping agent comparison (lightweight, no reconciliation context):
  ```bash
  PYTHONPATH=. uv run python scripts/run_mapping_agent.py \
    --ticker AMZN \
    --models openrouter/anthropic/claude-3.5-sonnet,openrouter/openai/gpt-4o-mini
  ```
  Results are stored in `mapping_agent_batches`, `mapping_agent_runs`, and as draft rows in `reconciliation_mappings` with `is_active=false`.
  Note: model names must exist in `litellm.yaml` (add entries for any new models you want to compare).
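To eyeball the drafts before activating anything, a count of inactive mapping rows is a quick check (only `is_active` is confirmed above; other columns may differ):

```bash
# Draft overrides land as inactive rows; none affect reconciliation until activated.
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
  "select count(*) from reconciliation_mappings where is_active = false;"
```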
This pipeline reconciles XBRL vs FMP, pulls RAG snippets, computes candidate tag values from local XBRL facts, and runs the mapping agent in comparison mode. After each accepted override, it re-runs reconciliation so the model sees fresh mismatches with XBRL/FMP values. It writes draft overrides (`reconciliation_mappings` rows with `is_active=false`) and reconciliation results for both the base mapping and candidate overrides.
See docs/mapping_pipeline.md for a full walkthrough.
- Ensure filings, XBRL, and filing blocks are ingested (see sections above).
- Run the mapping pipeline:
  ```bash
  PYTHONPATH=. uv run python scripts/run_mapping_pipeline.py \
    --ticker AMZN \
    --models openrouter/openai/gpt-oss-120b,openrouter/moonshotai/kimi-k2 \
    --max-accessions 10 \
    --max-rounds 7
  ```
  Optional flags (see the combined example below):
  - `--no-fetch-fmp` to use cached FMP rows
  - `--no-rag` to skip RAG context
  - `--accessions 000101872421000004` to target a specific filing
  - `--max-accessions 10` to cap the years used for mapping
  - `--max-rounds 7` to set the number of recursive improvement rounds
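For example, several of these flags combine into a targeted re-run (a sketch using only the flags documented above):

```bash
# Re-run the pipeline against one filing, reusing cached FMP rows and
# skipping RAG context.
PYTHONPATH=. uv run python scripts/run_mapping_pipeline.py \
  --ticker AMZN \
  --models openrouter/openai/gpt-oss-120b \
  --accessions 000101872421000004 \
  --no-fetch-fmp \
  --no-rag
```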
Use the pipeline output to grab each candidate `mapping_version`, then compare reconciliation status counts by mapping version:

```bash
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
  "select mapping_version, status, count(*) \
   from reconciliation_results \
   where ticker='AMZN' and statement='IS' \
   group by mapping_version, status \
   order by mapping_version, status;"
```

Compare model coverage deltas from the mapping agent runs:

```bash
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
  "select model, (quality_json->>'coverage_delta')::float as coverage_delta, mapping_id, created_at \
   from mapping_agent_runs \
   where batch_id = ( \
     select max(id) from mapping_agent_batches where ticker='AMZN' and statement='IS' \
   ) \
   order by coverage_delta desc nulls last;"
```
- Apply migrations (adds the `vector` extension + `filing_blocks` table):
  ```bash
  uv run alembic upgrade head
  ```
- Ingest filing blocks for a company:
  ```bash
  PYTHONPATH=. uv run python scripts/ingest_filing_blocks.py --ticker AMZN --limit 1
  ```
  Or target a specific accession:
  ```bash
  PYTHONPATH=. uv run python scripts/ingest_filing_blocks.py --accessions 000101872421000004
  ```
- Verify rows:
  ```bash
  psql postgresql://postgres:postgres@localhost:54322/postgres -c \
    "select accession_number, block_type, section_title, length(content_text) as chars from filing_blocks limit 5;"
  ```
Run lint + coverage locally:

```bash
scripts/quality_checks.sh
```

Install git hooks to run checks before push:

```bash
scripts/install_git_hooks.sh
```

Or run the commands directly:

```bash
uv run ruff check .
uv run --extra dev radon cc -n D -s api/app
uv run --extra dev radon mi -n C -s api/app
uv run pytest --cov=api/app/ingest --cov=api/app/reconciliation --cov-report=term-missing
```
- Alembic migrations live in `alembic/` and should be applied before backfills.
- Environment variables are centralized in `.env` for local dev.
- The Prefect worker above runs locally; Docker workers need the repo mounted or an image with the flow code baked in.
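For reference, a minimal local `.env` might look like the sketch below; the variable names are the ones used in the steps above, the values are placeholders, and the real defaults live in `.env.example`:

```bash
# Sketch of a local .env; copy real defaults from .env.example.
SEC_USER_AGENT="your@email.com"              # required for SEC ticker lookup
FMP_API_KEY="your_key"                       # FMP reconciliation
OPENROUTER_API_KEY="your_key"                # upstream LLM provider
LITELLM_MASTER_KEY="local-dev-key"
LITELLM_PROXY_URL="http://localhost:4000/v1"
LITELLM_PROXY_KEY="local-dev-key"
# Database and Prefect settings also live here; see .env.example for the names.
```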