Analytics API

A high-performance, self-hosted web analytics API built with FastAPI and TimescaleDB for tracking and analyzing website visitor events in real time.

🎯 Overview

Analytics API provides a lightweight, privacy-focused alternative to traditional analytics platforms. It captures page views, user interactions, and engagement metrics while leveraging TimescaleDB's powerful time-series capabilities for efficient data storage and querying.

Live Demo: https://analytics-api-brown.vercel.app

✨ Features

  • Event Tracking: Capture page views with comprehensive metadata

    • Page paths and URLs
    • User agent detection (browser, device, OS)
    • IP addresses for geolocation
    • Referrer tracking
    • Session management
    • Duration metrics
  • Time-Series Optimization:

    • Automatic data chunking (1-day intervals)
    • Automatic data retention (3-month rolling window)
    • Efficient querying with TimescaleDB hyperfunctions
  • Analytics Endpoints:

    • Aggregated metrics with customizable time buckets
    • Operating system detection and classification
    • Page-level statistics
    • Average duration calculations
    • Flexible filtering by pages and time ranges
  • Production Ready:

    • Docker and Docker Compose support
    • CORS middleware configured
    • Health check endpoint
    • Comprehensive test suite

🏗️ Architecture

analyticsApi/
├── src/
│   ├── main.py                 # FastAPI application entry point
│   └── api/
│       ├── events/
│       │   ├── models.py       # Event data models & schemas
│       │   └── routing.py      # API route handlers
│       └── db/
│           ├── config.py       # Database configuration
│           └── session.py      # Database session management
├── tests/
│   └── test_events.py         # Unit and integration tests
├── boot/                      # Initialization scripts
├── compose.yaml              # Docker Compose configuration
├── Dockerfile.web           # Production Dockerfile
├── requirements.txt         # Python dependencies
└── railway.json            # Railway deployment config

🚀 Getting Started

Prerequisites

  • Python 3.8+
  • Docker & Docker Compose (for containerized deployment)
  • PostgreSQL with TimescaleDB extension

Local Development

  1. Clone the repository
git clone https://github.com/observer04/analyticsApi.git
cd analyticsApi
  2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies
pip install -r requirements.txt
  4. Configure environment variables

Create a .env file in the root directory:

DATABASE_URL=postgresql+psycopg://time-user:time-pw@localhost:5432/timescaledb
DB_TIMEZONE=UTC
PORT=8080
  5. Run with Docker Compose (Recommended)
docker compose up --watch

The API will be available at http://localhost:8080

Manual Setup

If running without Docker:

  1. Install and start TimescaleDB
  2. Set up the database connection in .env
  3. Run the application:
cd src
uvicorn main:app --reload --host 0.0.0.0 --port 8080

📚 API Documentation

Endpoints

Health Check

GET /healthz

Returns API health status.

Create Event

POST /api/events/
Content-Type: application/json

{
  "page": "/pricing",
  "user_agent": "Mozilla/5.0...",
  "ip_address": "192.168.1.1",
  "referrer": "https://google.com",
  "session_id": "abc123",
  "duration": 45000
}
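
As a minimal sketch of calling this endpoint from Python, the snippet below builds the same request with only the standard library. `BASE_URL` assumes the local instance from Getting Started, and `build_event_request` is a hypothetical helper, not part of this project. The request is built but not sent, so the snippet runs without a live server.

```python
import json
import urllib.request

# Assumption: the API is running locally as described in Getting Started.
BASE_URL = "http://localhost:8080"


def build_event_request(page, user_agent, ip_address, referrer, session_id, duration):
    """Build (but do not send) a POST request for /api/events/."""
    payload = {
        "page": page,
        "user_agent": user_agent,
        "ip_address": ip_address,
        "referrer": referrer,
        "session_id": session_id,
        "duration": duration,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/events/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_event_request(
    page="/pricing",
    user_agent="Mozilla/5.0...",
    ip_address="192.168.1.1",
    referrer="https://google.com",
    session_id="abc123",
    duration=45000,
)

# To actually send it (requires the API to be running):
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status, response.read().decode("utf-8"))
```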

Get Aggregated Analytics

GET /api/events/?duration=1%20day&pages=/,/about,/pricing

Query Parameters:

  • duration (optional): Time bucket size (e.g., "1 hour", "1 day", "1 week")
  • pages (optional): List of pages to filter (comma-separated)
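
The query string above can be assembled programmatically. This is a small sketch using the standard library; `BASE_URL` and the helper name `build_analytics_url` are assumptions for illustration, and the comma-separated `pages` format follows the description above.

```python
from urllib.parse import quote, urlencode

# Assumption: the API is running locally as described in Getting Started.
BASE_URL = "http://localhost:8080"


def build_analytics_url(duration="1 day", pages=None):
    """Return the GET URL for aggregated analytics."""
    params = {"duration": duration}
    if pages:
        # The README describes `pages` as a comma-separated list.
        params["pages"] = ",".join(pages)
    # quote_via=quote encodes spaces as %20; "/" and "," are kept readable.
    query = urlencode(params, safe="/,", quote_via=quote)
    return f"{BASE_URL}/api/events/?{query}"


url = build_analytics_url("1 day", ["/", "/about", "/pricing"])
print(url)
```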

Response:

[
  {
    "bucket": "2026-01-18T00:00:00Z",
    "page": "/pricing",
    "operating_system": "Windows",
    "avg_duration": 42500.5,
    "count": 156
  }
]
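
To make the shape of this response concrete, here is a pure-Python illustration of the same grouping: events are floored to a one-day bucket, classified by operating system, and reduced to an average duration and a count. It is not the project's implementation, which presumably performs this inside TimescaleDB. The `time` field name, the keyword rules in `classify_os`, and the sample events are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone


def classify_os(user_agent):
    """Very rough OS guess from a user-agent string (illustrative only)."""
    ua = user_agent.lower()
    # Order matters: Android user agents also contain "linux",
    # and iPhone/iPad user agents also contain "mac os x".
    if "windows" in ua:
        return "Windows"
    if "android" in ua:
        return "Android"
    if "iphone" in ua or "ipad" in ua:
        return "iOS"
    if "macintosh" in ua or "mac os" in ua:
        return "MacOS"
    if "linux" in ua:
        return "Linux"
    return "Other"


def day_bucket(ts):
    """Floor a timezone-aware timestamp to the start of its UTC day."""
    return ts.astimezone(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)


def aggregate(events):
    """Group events by (day bucket, page, OS) and summarise durations."""
    groups = defaultdict(list)
    for event in events:
        key = (day_bucket(event["time"]), event["page"], classify_os(event["user_agent"]))
        groups[key].append(event["duration"])
    return [
        {
            "bucket": bucket.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "page": page,
            "operating_system": os_name,
            "avg_duration": sum(durations) / len(durations),
            "count": len(durations),
        }
        for (bucket, page, os_name), durations in sorted(groups.items())
    ]


# Hypothetical sample events, not real data.
WINDOWS_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
LINUX_UA = "Mozilla/5.0 (X11; Linux x86_64)"
events = [
    {"time": datetime(2026, 1, 18, 9, 15, tzinfo=timezone.utc), "page": "/pricing",
     "user_agent": WINDOWS_UA, "duration": 40000},
    {"time": datetime(2026, 1, 18, 17, 40, tzinfo=timezone.utc), "page": "/pricing",
     "user_agent": WINDOWS_UA, "duration": 45000},
    {"time": datetime(2026, 1, 18, 20, 5, tzinfo=timezone.utc), "page": "/pricing",
     "user_agent": LINUX_UA, "duration": 30000},
]
rows = aggregate(events)
```

With these sample events, `rows` holds two entries for the `2026-01-18` bucket: one for Linux and one for Windows.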

Get Single Event

GET /api/events/{event_id}

Interactive Documentation

Once running, visit:

  • Swagger UI: http://localhost:8080/docs
  • ReDoc: http://localhost:8080/redoc

🧪 Testing

Run the test suite:

pytest tests/

Run with coverage:

pytest --cov=src tests/

🔧 Configuration

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| DATABASE_URL | PostgreSQL connection string | Required |
| DB_TIMEZONE | Database timezone | UTC |
| PORT | API server port | 8080 |
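
The defaults above could be applied along these lines. This is only a sketch using `os.environ`; the project's `src/api/db/config.py` may load its settings differently, and `load_settings` is a hypothetical name.

```python
import os


def load_settings(env=None):
    """Read the three documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    database_url = env.get("DATABASE_URL")
    if not database_url:
        # DATABASE_URL has no default, so fail early with a clear message.
        raise RuntimeError("DATABASE_URL is required")
    return {
        "database_url": database_url,
        "db_timezone": env.get("DB_TIMEZONE", "UTC"),
        "port": int(env.get("PORT", "8080")),
    }


# Passing a dict instead of os.environ keeps the example self-contained.
settings = load_settings(
    {"DATABASE_URL": "postgresql+psycopg://time-user:time-pw@localhost:5432/timescaledb"}
)
```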

TimescaleDB Settings

The EventModel includes TimescaleDB-specific configurations:

  • __chunk_time_interval__: "INTERVAL 1 day"
  • __drop_after__: "INTERVAL 3 months"

Adjust these in src/api/events/models.py based on your retention needs.
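
For orientation, these two settings roughly correspond to TimescaleDB's `create_hypertable` and `add_retention_policy` SQL functions. The sketch below only formats such statements as strings. The table name `eventmodel` and the time column `time` are assumptions, and the library that reads the model attributes may issue different statements.

```python
def hypertable_statements(table, time_column, chunk_interval, drop_after):
    """Return SQL roughly equivalent to the model's chunking and retention settings.

    Values are interpolated directly, so use trusted constants only.
    """
    return [
        # Partition the table into chunks covering `chunk_interval` each.
        f"SELECT create_hypertable('{table}', '{time_column}', "
        f"chunk_time_interval => INTERVAL '{chunk_interval}', if_not_exists => TRUE);",
        # Drop chunks whose data is older than `drop_after`.
        f"SELECT add_retention_policy('{table}', INTERVAL '{drop_after}', "
        f"if_not_exists => TRUE);",
    ]


# Assumed table and column names; intervals are taken from the settings above.
statements = hypertable_statements("eventmodel", "time", "1 day", "3 months")
for statement in statements:
    print(statement)
```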

📦 Deployment

Docker

Build and run the production image:

docker build -t analytics-api -f Dockerfile.web .
docker run -p 8080:8080 --env-file .env analytics-api

Railway

The project includes railway.json for easy deployment to Railway:

  1. Connect your GitHub repository to Railway
  2. Set environment variables in Railway dashboard
  3. Deploy automatically on push

Vercel

Currently deployed at: https://analytics-api-brown.vercel.app

🛠️ Technology Stack

  • FastAPI - Modern, fast web framework
  • TimescaleDB - PostgreSQL extension for time-series data
  • SQLModel - SQL databases with Python type annotations
  • Pydantic - Data validation using Python type hints
  • Uvicorn - ASGI server implementation
  • Gunicorn - Production process manager (a WSGI server; serving an ASGI app such as FastAPI requires Uvicorn workers)
  • pytest - Testing framework

🔮 Future Scope

Near-Term Enhancements

  1. Authentication & Authorization

    • API key authentication
    • Multi-tenant support
    • Role-based access control (RBAC)
    • JWT token-based authentication
  2. Advanced Analytics

    • Geolocation tracking and IP-to-location mapping
    • Browser and device fingerprinting
    • Funnel analysis
    • Conversion tracking
    • A/B testing support
    • Custom event tracking (clicks, forms, downloads)
  3. Data Visualization

    • Built-in dashboard interface
    • Real-time analytics with WebSocket support
    • Exportable reports (PDF, CSV, JSON)
    • Customizable charts and graphs
  4. Performance Optimization

    • Redis caching layer
    • Query result caching
    • Connection pooling optimization
    • Batch event ingestion
    • Rate limiting and throttling

Mid-Term Features

  1. Privacy & Compliance

    • GDPR compliance tools
    • Cookie consent management
    • Data anonymization options
    • User data deletion endpoints
    • Privacy-first mode (no IP tracking)
  2. Integration & Extensibility

    • Webhook notifications for events
    • Third-party integrations (Slack, Discord, email)
    • Plugin system for custom analytics
    • Export to data warehouses (BigQuery, Snowflake)
    • GraphQL API support
  3. Monitoring & Observability

    • Prometheus metrics export
    • OpenTelemetry integration
    • Error tracking (Sentry integration)
    • Performance monitoring
    • Audit logs
  4. Data Management

    • Configurable retention policies
    • Data archival to cold storage
    • Backup and restore functionality
    • Data migration tools

Long-Term Vision

  1. Machine Learning & AI

    • Anomaly detection in traffic patterns
    • Predictive analytics
    • User behavior clustering
    • Automated insights and recommendations
    • Bot detection and filtering
  2. Enterprise Features

    • High availability (HA) setup
    • Multi-region deployment
    • Horizontal scaling support
    • Advanced security features (encryption at rest)
    • SLA monitoring
    • White-label solutions
  3. Developer Experience

    • SDKs for multiple languages (JavaScript, Python, Go, Ruby)
    • Browser tracking snippet generator
    • CLI tool for management
    • GraphQL playground
    • API versioning strategy
  4. Advanced Querying

    • Custom SQL query builder
    • Saved queries and alerts
    • Scheduled reports
    • Data streaming with Apache Kafka
    • Real-time aggregations

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Please ensure:

  • Code follows PEP 8 style guidelines
  • Tests pass (pytest)
  • New features include tests
  • Documentation is updated

📄 License

This project is open source and available under the MIT License.

🙏 Acknowledgments

  • FastAPI for the excellent web framework
  • TimescaleDB for time-series database capabilities
  • The open-source community for inspiration and tools

📞 Support


Built with ❤️ by observer04

Star ⭐ this repository if you find it helpful!