Warren SEC Visualizer

Local dev stack for the SEC Visualizer MVP. This repo bootstraps the runtime services (Postgres, Prefect, API, Next.js frontend) and a Python workspace managed by uv.

Quickstart

  1. Copy env template
cp .env.example .env
  2. Install deps and create the local virtualenv
uv sync
  3. Start the stack
docker compose up --build -d
  4. Apply migrations (uses .env for DB settings)
uv run alembic upgrade head
  5. Verify services
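A minimal spot check: list the containers and hit the Prefect server health endpoint (port 4200 matches the PREFECT_API_URL used later in this README; adjust if your .env differs):
docker compose ps
curl http://localhost:4200/api/health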

Docs

  • State machines: docs/state_machine.md
  • Mapping pipeline walkthrough: docs/mapping_pipeline.md
  • Design notes: docs/design.md

Backfill AMZN (Prefect v3)

These steps run the backfill flow via the Prefect CLI against the Docker-hosted Prefect server.

  1. Point CLI + worker at the local Prefect API
export PREFECT_API_URL=http://localhost:4200/api
  2. Ensure SEC user agent is set (required for ticker lookup)
export SEC_USER_AGENT="your@email.com"
  3. Start a local process worker (runs the flow code from your repo)
uv run prefect worker start --pool default --type process --work-queue default
  4. Create the deployment (one-time)
uv run prefect deploy api/app/flows/filings.py:backfill_company_flow -n backfill-company
  5. Run AMZN backfill
uv run prefect deployment run "backfill-company-flow/backfill-company" \
  -p cik=null \
  -p ticker="AMZN" \
  -p start_date="2015-01-01" \
  -p submission_types='["10-K","10-Q","8-K"]' \
  -p ingest_simple_xbrl=true \
  -p max_xbrl_accessions=null \
  -p fetch_fundamentals_snapshot=true
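
To confirm the worker picked up the run, list recent flow runs (flag names may vary slightly across Prefect versions) or open the Prefect UI, which is typically served at http://localhost:4200:

uv run prefect flow-run ls --limit 5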

Income statement reconciliation (annual)

  1. Ensure FMP and SEC env vars are set
export FMP_API_KEY="your_key"
export SEC_USER_AGENT="your@email.com"
  2. Run the reconciliation script
uv run python scripts/reconcile_income_statement.py --ticker AMZN

Use --no-fetch-fmp to reuse cached FMP rows from the DB.

  3. Check reconciliation version gaps

After a new mapping version runs, compare against the previous version:

psql postgresql://postgres:postgres@localhost:54322/postgres -c \
"select * from reconciliation_version_checks where ticker='AMZN' and statement='IS' order by period_end_date;"

If the query returns no rows, the new mapping version did not drop any metrics.
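
If the reconciliation script persists per-metric results to the reconciliation_results table used later in this README (an assumption; skip this check if your run writes elsewhere), a quick status breakdown looks like:

psql postgresql://postgres:postgres@localhost:54322/postgres -c \
"select status, count(*) from reconciliation_results where ticker='AMZN' and statement='IS' group by status order by status;"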

Mapping agent (LLM comparison)

  1. Set OpenRouter + LiteLLM proxy env vars
export OPENROUTER_API_KEY="your_key"
export LITELLM_MASTER_KEY="local-dev-key"
export LITELLM_PROXY_URL="http://localhost:4000/v1"
export LITELLM_PROXY_KEY="local-dev-key"
export MAPPING_AGENT_MAX_TOKENS=2048
export MAPPING_AGENT_TEMPERATURE=0.2
export MAPPING_AGENT_TIMEOUT=120
export MAPPING_AGENT_OUTPUT_RETRIES=2
  2. Start the proxy (if not already running)
docker compose up -d litellm-proxy
  3. Run the mapping agent comparison (lightweight, no reconciliation context)
PYTHONPATH=. uv run python scripts/run_mapping_agent.py \
  --ticker AMZN \
  --models openrouter/anthropic/claude-3.5-sonnet,openrouter/openai/gpt-4o-mini

Results are stored in mapping_agent_batches, mapping_agent_runs, and draft rows in reconciliation_mappings with is_active=false.

Note: model names must exist in litellm.yaml (add entries for any new models you want to compare).
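
Two optional sanity checks: hit the proxy's OpenAI-compatible model list (the /v1 base path comes from LITELLM_PROXY_URL above) and peek at the latest rows in mapping_agent_runs:

curl -H "Authorization: Bearer $LITELLM_PROXY_KEY" http://localhost:4000/v1/models
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
"select model, created_at from mapping_agent_runs order by created_at desc limit 5;"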

Mapping pipeline (LLM + RAG + reconciliation)

This pipeline reconciles XBRL vs FMP, pulls RAG snippets, computes candidate tag values from local XBRL facts, and runs the mapping agent in comparison mode. After each accepted override, it re-runs reconciliation so the model sees fresh mismatches with XBRL/FMP values. It writes draft overrides (reconciliation_mappings with is_active=false) and reconciliation results for both the base mapping and candidate overrides.

See docs/mapping_pipeline.md for a full walkthrough.

  1. Ensure filings, XBRL, and filing blocks are ingested (see the backfill steps above and the filing blocks section below).

  2. Run the mapping pipeline

PYTHONPATH=. uv run python scripts/run_mapping_pipeline.py \
  --ticker AMZN \
  --models openrouter/openai/gpt-oss-120b,openrouter/moonshotai/kimi-k2 \
  --max-accessions 10 \
  --max-rounds 7

Optional flags:

  • --no-fetch-fmp to use cached FMP rows
  • --no-rag to skip RAG context
  • --accessions 000101872421000004 to target a specific filing
  • --max-accessions 10 to cap years used for mapping
  • --max-rounds 7 to set recursive improvement rounds
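
After a run, the draft overrides it wrote can be counted directly; this relies only on the is_active flag described above (add further filters such as ticker if your schema has those columns):

psql postgresql://postgres:postgres@localhost:54322/postgres -c \
"select count(*) from reconciliation_mappings where is_active = false;"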

Validate and compare mappings across models

Use the pipeline output to grab each candidate mapping_version, then compare reconciliation status counts by mapping version:

psql postgresql://postgres:postgres@localhost:54322/postgres -c \
"select mapping_version, status, count(*) \
 from reconciliation_results \
 where ticker='AMZN' and statement='IS' \
 group by mapping_version, status \
 order by mapping_version, status;"

Compare model coverage deltas from the mapping agent runs:

psql postgresql://postgres:postgres@localhost:54322/postgres -c \
"select model, (quality_json->>'coverage_delta')::float as coverage_delta, mapping_id, created_at \
 from mapping_agent_runs \
 where batch_id = ( \
   select max(id) from mapping_agent_batches where ticker='AMZN' and statement='IS' \
 ) \
 order by coverage_delta desc nulls last;"

Filing blocks RAG (SEC parsers)

  1. Apply migrations (adds the vector extension + filing_blocks table)
uv run alembic upgrade head
  2. Ingest filing blocks for a company
PYTHONPATH=. uv run python scripts/ingest_filing_blocks.py --ticker AMZN --limit 1

Or target a specific accession:

PYTHONPATH=. uv run python scripts/ingest_filing_blocks.py --accessions 000101872421000004
  3. Verify rows
psql postgresql://postgres:postgres@localhost:54322/postgres -c \
"select accession_number, block_type, section_title, length(content_text) as chars from filing_blocks limit 5;"

Quality checks

Run lint + coverage locally:

scripts/quality_checks.sh

Install git hooks to run checks before push:

scripts/install_git_hooks.sh

Or run commands directly:

uv run ruff check .
uv run --extra dev radon cc -n D -s api/app
uv run --extra dev radon mi -n C -s api/app
uv run pytest --cov=api/app/ingest --cov=api/app/reconciliation --cov-report=term-missing

Notes

  • Alembic migrations live in alembic/ and should be applied before backfills.
  • Environment variables are centralized in .env for local dev.
  • The Prefect worker above runs locally; Docker workers need the repo mounted or an image with the flow code baked in.
