An analytics platform I built to turn messy business data into something useful. Upload a CSV, get metrics, see trends, ask questions in plain English.
Small businesses have data everywhere - spreadsheets, exports, random CSVs. They need answers but don't have time to wrestle with formulas or learn BI tools. I wanted to build something that handles the messy parts automatically.
Data Pipeline: Ingest raw files, detect schemas, clean the mess (currency symbols, date formats, inconsistent booleans), validate quality, then calculate metrics.
Analytics Layer: 20+ business metrics computed deterministically. Revenue trends, cohort retention, customer segmentation, funnel analysis. All tested, all reproducible.
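As a flavor of that layer, here is a minimal sketch of a metric written as a pure, testable function (pandas-based; the function and column names are illustrative, not the actual service API):

```python
import pandas as pd

def monthly_revenue(transactions: pd.DataFrame) -> pd.DataFrame:
    """Aggregate revenue by calendar month - same input, same output, easy to test."""
    df = transactions.copy()
    # assumes transaction_date is already datetime64 (upstream cleaning handles parsing)
    df["month"] = df["transaction_date"].dt.to_period("M").dt.to_timestamp()
    out = df.groupby("month", as_index=False)["amount"].sum()
    return out.rename(columns={"amount": "revenue"})
```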
Two Interfaces:
- Streamlit Dashboard - KPIs, charts, drill-downs for BI-style analysis
- Next.js Web App - Chat interface where you ask questions in plain English, powered by an LLM that explains the numbers (but never calculates them - that's handled by tested Python code; see the sketch below)
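The split is strict: Python computes, the model narrates. A minimal sketch of the pattern, assuming an OpenAI-compatible client (the model name and prompt are illustrative; a DeepSeek key works the same way via a compatible base_url):

```python
from openai import OpenAI

def explain_metrics(question: str, computed: dict) -> str:
    """Hand pre-computed numbers to the LLM; it interprets, never recalculates."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    context = "\n".join(f"{name}: {value}" for name, value in computed.items())
    messages = [
        {"role": "system",
         "content": "Explain these pre-computed metrics plainly. Do not perform arithmetic."},
        {"role": "user", "content": f"Metrics:\n{context}\n\nQuestion: {question}"},
    ]
    # model name is a placeholder, not the one the app actually uses
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content
```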
| Overview | Revenue Analysis | Customer Segmentation |
|---|---|---|
| ![]() | ![]() | ![]() |

| Metrics View | Chat Interface | Reports |
|---|---|---|
| ![]() | ![]() | ![]() |
SQL Portfolio - 10+ queries in sql/analytics/ covering:
- CTEs and window functions (LAG, LEAD, NTILE, ROW_NUMBER)
- Cohort retention matrices
- RFM customer segmentation
- Funnel conversion analysis
- Time series with moving averages and anomaly detection

```sql
-- Month-over-month revenue growth
WITH monthly AS (
    SELECT DATE_TRUNC('month', transaction_date) AS month,
           SUM(amount) AS revenue
    FROM transactions
    GROUP BY 1
)
SELECT month, revenue,
       LAG(revenue) OVER (ORDER BY month) AS prev_month,
       ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) /
             NULLIF(LAG(revenue) OVER (ORDER BY month), 0) * 100, 2) AS growth_pct
FROM monthly;
```

Jupyter Notebooks - Four analysis notebooks in notebooks/:
- Cohort retention with heatmaps and survival curves
- Revenue forecasting with time series decomposition
- Customer segmentation using RFM and K-means
- A/B test analysis with z-tests and confidence intervals (sketched just below)
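As a taste of the A/B notebook, a two-proportion z-test plus a confidence interval in statsmodels (the counts here are made up for illustration, not the notebook's actual data):

```python
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

# hypothetical conversions / visitors for control vs. variant
conversions = [310, 355]
visitors = [5000, 5000]

z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)
ci_low, ci_high = proportion_confint(count=conversions[1], nobs=visitors[1], alpha=0.05)

print(f"z={z_stat:.3f}, p={p_value:.4f}")
print(f"variant conversion 95% CI: [{ci_low:.4f}, {ci_high:.4f}]")
```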
Streamlit Dashboard - Multi-page app with KPI cards, revenue trends with 7-day moving averages, customer segment breakdowns, and interactive filters.
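The moving average itself is plain pandas underneath; a minimal sketch (names illustrative):

```python
import pandas as pd

def revenue_trend(daily_revenue: pd.Series, window: int = 7) -> pd.Series:
    """7-day moving average over a date-indexed revenue series; smooths weekly seasonality."""
    return daily_revenue.sort_index().rolling(window=window, min_periods=1).mean()
```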
ETL with Prefect - Orchestrated flows for daily metric computation, incremental loads, and error handling.
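The flows follow the standard Prefect 2.x task/flow pattern; a stripped-down sketch with placeholder task bodies (names and retry settings are illustrative):

```python
from prefect import flow, task

@task(retries=2, retry_delay_seconds=60)
def extract() -> list[dict]:
    # placeholder: pull new rows since the last successful run
    return [{"amount": 120.0}, {"amount": 80.5}]

@task
def compute_metrics(rows: list[dict]) -> dict:
    # placeholder: the deterministic metric calculations
    return {"revenue": sum(r["amount"] for r in rows)}

@flow(log_prints=True)
def daily_metrics_flow():
    metrics = compute_metrics(extract())
    print(f"computed: {metrics}")

if __name__ == "__main__":
    daily_metrics_flow()
```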
dbt Models - Transformations organized into staging, intermediate, and mart layers. Incremental MRR calculations, customer lifetime aggregations.
Data Quality with Great Expectations - 26 validation rules catching nulls, duplicates, referential integrity issues, and schema drift before bad data hits the warehouse.
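Two of those rule types, sketched with the legacy from_pandas API that GE 0.18 still ships (the real suite lives in data_quality/ and is far more thorough):

```python
import great_expectations as ge
import pandas as pd

df = pd.DataFrame({"customer_id": [1, 2, 2], "amount": [10.0, None, 5.0]})
gdf = ge.from_pandas(df)

# null and uniqueness guards - two of the 26 rule types in the suite
print(gdf.expect_column_values_to_not_be_null("amount").success)     # False
print(gdf.expect_column_values_to_be_unique("customer_id").success)  # False
```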
Data Cleaning - The DataAutoFixer service normalizes column names, parses mixed date formats, strips currency symbols, standardizes booleans, and flags outliers.
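A stripped-down sketch of that kind of normalization (the real DataAutoFixer does considerably more; the column names here are illustrative):

```python
import pandas as pd

BOOL_MAP = {"yes": True, "y": True, "true": True, "1": True,
            "no": False, "n": False, "false": False, "0": False}

def autofix(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # snake_case the column names
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    # strip currency symbols and thousands separators, then cast to float
    if "amount" in df:
        df["amount"] = (df["amount"].astype(str)
                        .str.replace(r"[$€£,]", "", regex=True)
                        .astype(float))
    # parse mixed date formats, coercing garbage to NaT (format="mixed" needs pandas >= 2.0)
    if "transaction_date" in df:
        df["transaction_date"] = pd.to_datetime(df["transaction_date"],
                                                errors="coerce", format="mixed")
    # standardize yes/no/1/0 style booleans
    if "is_active" in df:
        df["is_active"] = df["is_active"].astype(str).str.lower().map(BOOL_MAP)
    return df
```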
FastAPI Backend - REST API with structured routers for ingestion, metrics, chat, reports, experiments, and analytics. Request validation with Pydantic, async where it matters.
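The router pattern, reduced to a single endpoint (the schema and route are illustrative, not the actual API surface):

```python
from fastapi import APIRouter
from pydantic import BaseModel, Field

router = APIRouter(prefix="/api/v1/metrics", tags=["metrics"])

class MetricRequest(BaseModel):
    dataset_id: int = Field(gt=0)
    metric: str = Field(min_length=1)

class MetricResponse(BaseModel):
    metric: str
    value: float

@router.post("/compute", response_model=MetricResponse)
async def compute_metric(req: MetricRequest) -> MetricResponse:
    # Pydantic has already validated the payload by the time we get here
    value = 0.0  # placeholder for the call into the metrics service
    return MetricResponse(metric=req.metric, value=value)
```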
Next.js Frontend - TypeScript, Tailwind CSS, drag-and-drop file upload, real-time metric display, and a chat interface for conversational analytics.
PostgreSQL + Redis - Postgres for persistence, Redis for caching expensive metric calculations.
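Caching follows a read-through pattern; a minimal sketch with redis-py (the key scheme and TTL are illustrative):

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def cached_metric(dataset_id: int, metric: str, compute) -> dict:
    """Return the cached value if present, otherwise compute and store it."""
    key = f"metrics:{dataset_id}:{metric}"
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)
    result = compute(dataset_id, metric)    # the expensive deterministic calculation
    r.setex(key, 3600, json.dumps(result))  # expire after an hour
    return result
```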
238 Tests, 78% Coverage - Unit tests for metrics calculations, integration tests for API endpoints, fixture-based test data.
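A representative fixture-based test, sketched (the data and assertions are illustrative):

```python
import pandas as pd
import pytest

@pytest.fixture
def transactions() -> pd.DataFrame:
    return pd.DataFrame({
        "transaction_date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-03"]),
        "amount": [100.0, 50.0, 75.0],
    })

def test_monthly_revenue_sums_by_month(transactions):
    monthly = transactions.groupby(
        transactions["transaction_date"].dt.to_period("M"))["amount"].sum()
    assert monthly.loc["2024-01"] == 150.0
    assert monthly.loc["2024-02"] == 75.0
```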
Docker Compose - One command to spin up the full stack locally.
```
echo/
├── app/                      # FastAPI backend
│   ├── api/v1/               # REST endpoints
│   ├── services/             # Business logic
│   │   ├── metrics/          # Metric calculations
│   │   ├── experiments/      # A/B testing
│   │   └── data_profiler.py  # Data profiling
│   └── models/               # SQLAlchemy models
├── frontend/                 # Next.js web app (chat, uploads, reports)
├── dashboard/                # Streamlit BI dashboard
├── sql/                      # SQL query portfolio
├── notebooks/                # Analysis notebooks
├── orchestration/            # Prefect ETL flows
├── dbt/                      # dbt transformations
├── data_quality/             # Great Expectations
└── tests/                    # Test suite
```
| Layer | Tech |
|---|---|
| API | FastAPI, Python 3.11 |
| Database | PostgreSQL 15, Redis 7 |
| Web App | Next.js 15, TypeScript, Tailwind |
| Dashboard | Streamlit, Plotly |
| ETL | Prefect 2.14 |
| Transformations | dbt 1.7 |
| Data Quality | Great Expectations 0.18 |
Live at echo-analytics.streamlit.app
Or run locally:
Dashboard only:

```bash
pipx install streamlit --include-deps
pipx inject streamlit plotly
streamlit run dashboard/app.py
```

Full stack:

```bash
git clone https://github.com/Hussain0327/Echo_Data_Scientist.git
cd Echo_Data_Scientist
cp .env.example .env
# Add your DEEPSEEK_API_KEY or OPENAI_API_KEY
docker-compose up -d

# Frontend
cd frontend && npm install && npm run dev
# Open http://localhost:3000
```

Notebooks:

```bash
jupyter notebook notebooks/
```

Data cleaning is most of the work. The DataAutoFixer went through five rewrites. Real data is messy in ways you don't expect until you see it.
LLMs are bad at math. I learned this the hard way. Now Python handles all calculations, and the LLM just explains results it's given. Much more reliable.
Testing saves time. 238 tests sounds like a lot, but they caught regressions constantly. Especially when refactoring the metrics engine.
License: MIT





