AI Document Analyzer

Upload any PDF document and get instant AI-powered structured analysis: summaries, key entities, dates, obligations, risk flags, and action items.

Live Demo: https://ai-document-analyzer-ea.vercel.app

Next.js | TypeScript | Claude API | Tailwind CSS


🎯 Features

  • Executive Summary: AI-generated 2-3 sentence overview with document type detection
  • Entity Extraction: Automatically identifies key parties and their roles
  • Date Detection: Extracts all dates with contextual descriptions
  • Financial Analysis: Identifies payments, fees, penalties, taxes, and totals with color coding
  • Obligation Mapping: Clear table of who owes what, by when
  • Risk Flagging: Highlights concerning terms, unfavorable clauses, and missing information (severity: high/medium/low)
  • Key Terms Glossary: Legal and technical jargon explained in plain English
  • Action Items: Extracted next steps and required actions
  • JSON Export: Download full analysis results for integration or record-keeping
  • Real-time Processing: No database, no login required, instant results

🛠 Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | Next.js 16 (App Router + Turbopack) |
| Language | TypeScript (strict mode) |
| AI Model | Claude Sonnet 4.5 (Anthropic API) |
| UI Components | shadcn/ui + Tailwind CSS |
| PDF Processing | unpdf |
| Icons | lucide-react |
| Deployment | Vercel |
| Testing | Vitest + React Testing Library (667 tests) |

🚀 Quick Start

Prerequisites

Installation

# Clone the repository
git clone https://github.com/eangao/ai-document-analyzer.git
cd ai-document-analyzer

# Install dependencies
npm install

# Create environment file
cp .env.example .env.local

Environment Variables

Edit the .env.local file in the project root (created by the cp command above) and set:

ANTHROPIC_API_KEY=your_anthropic_api_key_here
DEMO_MODE=true  # Enable cost control (5K char limit vs 80K)

Environment Variables Explained:

  • ANTHROPIC_API_KEY: Required. Your Anthropic API key from console.anthropic.com
  • DEMO_MODE: Optional. Set to true to enable cost protection (reduces API token usage by ~80%)
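
As a rough illustration of how these two variables might be consumed on the server, here is a minimal sketch. The names `loadConfig`, `truncateForAnalysis`, and `AnalyzerConfig` are hypothetical and are not taken from the project's lib/demo-mode.ts. The 5,000 and 80,000 character limits are the ones quoted in this README.

```typescript
// Hypothetical sketch: names and structure are assumptions, not the project's actual code.
interface AnalyzerConfig {
  apiKey: string;
  demoMode: boolean;
  maxChars: number;
}

const DEMO_CHAR_LIMIT = 5000; // limit quoted for DEMO_MODE=true
const FULL_CHAR_LIMIT = 80000; // limit quoted for DEMO_MODE=false

// Takes a plain record so it can be called with process.env or a test fixture.
function loadConfig(env: Record<string, string | undefined>): AnalyzerConfig {
  const apiKey = (env["ANTHROPIC_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("ANTHROPIC_API_KEY is required");
  }
  // Assumption: only the literal string "true" enables demo mode.
  const demoMode = env["DEMO_MODE"] === "true";
  return {
    apiKey,
    demoMode,
    maxChars: demoMode ? DEMO_CHAR_LIMIT : FULL_CHAR_LIMIT,
  };
}

// Cuts the extracted text to the configured limit before it is sent for analysis.
function truncateForAnalysis(text: string, config: AnalyzerConfig): string {
  return text.length > config.maxChars ? text.slice(0, config.maxChars) : text;
}
```

In a Next.js route handler this could be called as `loadConfig(process.env)`. Failing fast on a missing key tends to surface configuration mistakes at the first request rather than as an opaque API error.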

Run Development Server

npm run dev

Open http://localhost:3000 to see the app.

Build for Production

npm run build
npm start

Run Tests

# Run all tests
npm test

# Run tests in watch mode
npm run test:watch

# Run tests with coverage
npm run test:coverage

📄 Sample Documents for Testing

The repository includes 7 sample PDFs in docs/sample documents/ to test the analyzer:

| Document Type | Filename | What It Tests |
| --- | --- | --- |
| Contract | sample-contract.pdf | Multi-party agreements, obligations, dates, risk flags |
| Invoice | sample-invoice.pdf | Financial items, payment terms, line items, totals |
| Report | sample-report.pdf | Executive summaries, data analysis, findings |
| Legal (NDA) | sample-nda-legal.pdf | Legal terminology, confidentiality clauses, penalties |
| Financial Statement | sample-financial-statement.pdf | Complex financial data, accounting terms, balance sheets |
| Business Letter | sample-business-letter.pdf | Simple document structure, minimal entities |
| Employee Handbook | sample-employee-handbook.pdf | Policies, procedures, obligations, multi-section documents |

How to Use:

  1. Visit the live demo
  2. Upload any sample document from the docs/sample documents/ folder in this repository
  3. Observe how the AI extracts entities, dates, obligations, risk flags, and financial items
  4. Compare results across different document types

πŸ“ Project Structure

ai-document-analyzer/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ api/
β”‚   β”‚   β”œβ”€β”€ extract/route.ts       # PDF text extraction endpoint
β”‚   β”‚   └── analyze/route.ts       # Claude API analysis endpoint
β”‚   β”œβ”€β”€ layout.tsx                 # Root layout with metadata
β”‚   β”œβ”€β”€ page.tsx                   # Main application page
β”‚   └── globals.css                # Global styles
β”œβ”€β”€ components/
β”‚   β”œβ”€β”€ FileUpload.tsx             # Drag & drop PDF upload
β”‚   β”œβ”€β”€ AnalysisLoader.tsx         # Step-by-step loading animation
β”‚   β”œβ”€β”€ ExportButton.tsx           # JSON export functionality
β”‚   β”œβ”€β”€ RateLimitAlert.tsx         # Countdown timer for rate-limited users
β”‚   β”œβ”€β”€ ui/                        # shadcn/ui components (managed by CLI)
β”‚   └── dashboard/
β”‚       β”œβ”€β”€ AnalysisDashboard.tsx  # Main dashboard layout
β”‚       β”œβ”€β”€ SummaryCard.tsx        # Document summary + type badge
β”‚       β”œβ”€β”€ RiskFlags.tsx          # Risk severity alerts
β”‚       β”œβ”€β”€ EntitiesAndDates.tsx   # Key parties and dates grid
β”‚       β”œβ”€β”€ FinancialItems.tsx     # Financial breakdown
β”‚       β”œβ”€β”€ ObligationsTable.tsx   # Obligations table
β”‚       β”œβ”€β”€ KeyTerms.tsx           # Terms glossary
β”‚       └── ActionItems.tsx        # Action items checklist
β”œβ”€β”€ types/
β”‚   └── analysis.ts                # TypeScript interfaces
β”œβ”€β”€ lib/
β”‚   β”œβ”€β”€ rate-limit.ts              # Dual-layer rate limiter (3 req/hour per IP + 50 req/day global)
β”‚   β”œβ”€β”€ error-messages.ts          # User-friendly error message generation
β”‚   β”œβ”€β”€ error-parser.ts            # Frontend error type discrimination
β”‚   └── demo-mode.ts               # Cost control configuration
β”œβ”€β”€ docs/
β”‚   └── sample documents/          # 7 sample PDFs for testing (contracts, invoices, reports, etc.)
└── next.config.ts                 # Next.js 16 TypeScript config

🎨 Features Deep Dive

Document Types Supported

  • Contracts
  • Invoices
  • Reports
  • Legal documents
  • Financial statements
  • Letters
  • General business documents

Analysis Components

Risk Severity Color Coding:

  • 🔴 High: Critical issues requiring immediate attention
  • 🟡 Medium: Notable concerns to review
  • 🔵 Low: Minor observations

Financial Item Types:

  • 💰 Payment: Regular payments (green)
  • 💵 Fee: Service fees (blue)
  • ⚠️ Penalty: Late fees, penalties (red)
  • 📊 Total: Summary amounts (bold)
  • 🏦 Tax: Tax amounts (yellow)
  • 🎁 Discount: Discounts applied (purple)
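
The severity levels and financial item types above map naturally onto TypeScript string-literal unions. The sketch below is inferred from these lists only. The real interfaces live in types/analysis.ts and may be shaped differently, so every type and function name here is an assumption.

```typescript
// Hypothetical shapes inferred from the lists above; not copied from types/analysis.ts.
type RiskSeverity = "high" | "medium" | "low";
type FinancialItemType = "payment" | "fee" | "penalty" | "total" | "tax" | "discount";

interface RiskFlag {
  severity: RiskSeverity;
  description: string;
}

// Lower number means more urgent; used to show high-severity flags first.
const SEVERITY_ORDER: Record<RiskSeverity, number> = { high: 0, medium: 1, low: 2 };

// Styling described in the list above (color names or emphasis).
const FINANCIAL_STYLE: Record<FinancialItemType, string> = {
  payment: "green",
  fee: "blue",
  penalty: "red",
  total: "bold",
  tax: "yellow",
  discount: "purple",
};

// Returns a new array sorted from most to least severe, leaving the input untouched.
function sortBySeverity(flags: readonly RiskFlag[]): RiskFlag[] {
  return [...flags].sort(
    (a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity],
  );
}
```

Using `Record<RiskSeverity, ...>` means the compiler reports a missing entry if a new severity or item type is added later.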

🧪 Testing

The project includes 667 tests with 93.6% statement coverage:

  • Component Tests: All React components including RateLimitAlert countdown timer
  • API Tests: Dual-layer rate limiting, error message generation, DEMO_MODE truncation
  • Utility Tests: Error parser (39 tests), demo mode (21 tests), rate limiter (39 tests)
  • Integration Tests: Full user workflows, error handling, export functionality
  • Accessibility Tests: WCAG 2.1 Level A/AA compliance across all components

Test Distribution:

  • Rate limiting: 39 tests (per-IP hourly + global daily limits)
  • Error messages: 53 tests (100% coverage on user-facing messages)
  • Error parser: 39 tests (type-safe error discrimination)
  • RateLimitAlert: 30 tests (countdown timer, styling, accessibility)
  • Demo mode: 21 tests (cost control configuration)

Coverage: 93.6% statements | 87% branches | 92.7% functions


🔒 Security & Cost Protection

Multi-Layer Rate Limiting

This application implements three layers of cost protection to prevent unexpected API charges:

  1. Per-IP Hourly Limit: 3 requests per hour per IP address

    • Prevents individual users from exhausting API quota
    • Resets on a rolling 1-hour window
    • Returns friendly error with countdown timer
  2. Global Daily Cap: 50 requests per day across all users

    • Hard limit to prevent runaway costs
    • Resets daily at UTC midnight
    • Displays "high traffic" message when reached
  3. Demo Mode Toggle: Configurable text truncation

    • DEMO_MODE=true: 5,000 characters (~80% cost reduction)
    • DEMO_MODE=false: 80,000 characters (full analysis)
    • Applied before Claude API call to save tokens
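
The two request limits described above can be sketched as a small in-memory limiter. This is an illustration under stated assumptions, not the contents of lib/rate-limit.ts: the class name, method name, and result shape are hypothetical, and state held in memory would not be shared across serverless instances.

```typescript
// In-memory sketch; all names are hypothetical and not taken from lib/rate-limit.ts.
type LimitResult =
  | { allowed: true }
  | { allowed: false; reason: "per-ip" | "global"; retryAfterMs: number };

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

class DualLayerRateLimiter {
  private readonly perIpLimit: number;
  private readonly globalLimit: number;
  private readonly hitsByIp = new Map<string, number[]>();
  private globalDay = -1; // index of the current UTC day
  private globalCount = 0;

  constructor(perIpLimit: number = 3, globalLimit: number = 50) {
    this.perIpLimit = perIpLimit;
    this.globalLimit = globalLimit;
  }

  check(ip: string, now: number = Date.now()): LimitResult {
    // Global daily cap: epoch milliseconds are UTC-based, so the day index
    // changes exactly at UTC midnight.
    const day = Math.floor(now / DAY_MS);
    if (day !== this.globalDay) {
      this.globalDay = day;
      this.globalCount = 0;
    }
    if (this.globalCount >= this.globalLimit) {
      return { allowed: false, reason: "global", retryAfterMs: (day + 1) * DAY_MS - now };
    }

    // Per-IP rolling window: keep only the hits from the last hour.
    const recent = (this.hitsByIp.get(ip) ?? []).filter((t) => now - t < HOUR_MS);
    if (recent.length >= this.perIpLimit) {
      this.hitsByIp.set(ip, recent);
      const oldest = recent[0] ?? now;
      return { allowed: false, reason: "per-ip", retryAfterMs: oldest + HOUR_MS - now };
    }

    // Request accepted: record it against both layers.
    recent.push(now);
    this.hitsByIp.set(ip, recent);
    this.globalCount += 1;
    return { allowed: true };
  }
}
```

Passing `now` as a parameter keeps the limiter deterministic in tests. The `retryAfterMs` value is what a countdown timer would consume.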

User Experience

When rate limits are reached, users see:

  • Clear error messages explaining what happened and why
  • Countdown timers showing exact retry time (MM:SS format)
  • Portfolio context acknowledging this is a demo limitation
  • Different styling for per-IP (amber) vs global (blue) limits

Example Messages:

  • Per-IP: "You've reached your personal limit of 3 documents per hour. Retry available in 47 minutes."
  • Global: "This demo is experiencing high traffic. Try again tomorrow morning (UTC)."
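
A minimal sketch of the countdown formatting and message selection described above follows. The function names are hypothetical, and the message strings are taken from the example messages in this section. The project's lib/error-messages.ts may build them differently.

```typescript
// Hypothetical helpers; the wording comes from the example messages above.
type RateLimitReason = "per-ip" | "global";

// Pads a non-negative integer to at least two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + String(n) : String(n);
}

// Formats a remaining duration as MM:SS, rounding up so the timer
// never shows 00:00 while time is still left.
function formatCountdown(remainingMs: number): string {
  const totalSeconds = Math.max(0, Math.ceil(remainingMs / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return pad2(minutes) + ":" + pad2(seconds);
}

// Picks the user-facing message for the limit that was hit.
function rateLimitMessage(reason: RateLimitReason, remainingMs: number): string {
  if (reason === "global") {
    return "This demo is experiencing high traffic. Try again tomorrow morning (UTC).";
  }
  const minutes = Math.max(1, Math.ceil(remainingMs / 60000));
  const unit = minutes === 1 ? "minute" : "minutes";
  return (
    "You've reached your personal limit of 3 documents per hour. " +
    "Retry available in " + String(minutes) + " " + unit + "."
  );
}
```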

Security Features

  • File Validation: PDF-only, 10MB maximum file size
  • Input Sanitization: All user inputs validated and sanitized
  • No Data Persistence: No database, no stored files; privacy by design
  • Type-Safe Errors: Discriminated union types for frontend error handling
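
The file checks and the discriminated-union error style mentioned above could look roughly like this. All type and function names are hypothetical, "10MB" is read here as 10 × 1024 × 1024 bytes, and the empty-file check is an extra assumption not stated in this README.

```typescript
// Hypothetical sketch of upload validation with a discriminated union for errors.
const MAX_FILE_BYTES = 10 * 1024 * 1024; // "10MB" interpreted as 10 MiB (assumption)

// Structural subset of the browser File object, so a File can be passed directly.
interface UploadCandidate {
  name: string;
  type: string;
  size: number;
}

type UploadError =
  | { kind: "invalid-type"; received: string }
  | { kind: "too-large"; maxBytes: number; actualBytes: number }
  | { kind: "empty" };

type UploadCheck = { ok: true } | { ok: false; error: UploadError };

function validateUpload(file: UploadCandidate): UploadCheck {
  const looksLikePdf =
    file.type === "application/pdf" && file.name.toLowerCase().endsWith(".pdf");
  if (!looksLikePdf) {
    return { ok: false, error: { kind: "invalid-type", received: file.type || "unknown" } };
  }
  if (file.size === 0) {
    return { ok: false, error: { kind: "empty" } };
  }
  if (file.size > MAX_FILE_BYTES) {
    return {
      ok: false,
      error: { kind: "too-large", maxBytes: MAX_FILE_BYTES, actualBytes: file.size },
    };
  }
  return { ok: true };
}

// The "kind" field lets the compiler narrow each branch to the right payload.
function describeUploadError(error: UploadError): string {
  if (error.kind === "invalid-type") {
    return "Only PDF files are supported (received " + error.received + ").";
  }
  if (error.kind === "too-large") {
    const mb = (error.actualBytes / (1024 * 1024)).toFixed(1);
    return "The file is " + mb + " MB; the maximum is 10 MB.";
  }
  return "The file is empty.";
}
```

Checks like these only inspect metadata supplied by the client, so the server-side extraction step would still need to handle files that are not valid PDFs.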

Estimated Monthly Cost

With current limits and DEMO_MODE=true:

  • Per-IP limit: 3 req/hour = 96% reduction vs unlimited
  • Global cap: 50 req/day maximum
  • Demo mode: 5K chars = ~80% token reduction per request
  • Total: ~$0.45/month estimated maximum (vs. potentially unbounded spend without these limits)
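
The ceiling implied by a daily request cap is requests per month multiplied by cost per request. The sketch below only shows that arithmetic. The field names are hypothetical, and the token counts and per-token prices must be supplied by the reader from Anthropic's current pricing, since none are stated in this README.

```typescript
// Hypothetical cost-ceiling calculator; every input is an assumption supplied by the caller.
interface CostAssumptions {
  requestsPerDay: number; // e.g. the global daily cap
  daysPerMonth: number;
  inputTokensPerRequest: number; // depends on the character limit and tokenizer
  outputTokensPerRequest: number; // depends on the size of the analysis returned
  inputUsdPerMillionTokens: number; // look up current pricing
  outputUsdPerMillionTokens: number; // look up current pricing
}

// Worst case: every allowed request is used on every day of the month.
function monthlyCostCeilingUsd(a: CostAssumptions): number {
  const requestsPerMonth = a.requestsPerDay * a.daysPerMonth;
  const usdPerRequest =
    (a.inputTokensPerRequest * a.inputUsdPerMillionTokens +
      a.outputTokensPerRequest * a.outputUsdPerMillionTokens) /
    1000000;
  return requestsPerMonth * usdPerRequest;
}
```

Output tokens are often priced higher than input tokens, so the size of the returned analysis can dominate the total even when the input is truncated.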

🚢 Deployment

Deploy to Vercel

Deploy with Vercel

  1. Click the "Deploy" button above or go to vercel.com
  2. Import your GitHub repository
  3. Add environment variables in Vercel dashboard:
    • ANTHROPIC_API_KEY = your Anthropic API key
    • DEMO_MODE = true (optional, enables cost control)
  4. Click Deploy

Your app will be live at your-project.vercel.app


🛣 Roadmap

MVP (Current):

  • ✅ PDF upload and text extraction
  • ✅ Claude AI structured analysis
  • ✅ Professional dashboard UI
  • ✅ JSON export
  • ✅ Comprehensive testing

Future Enhancements:

  • πŸ“ RAG (Retrieval-Augmented Generation) β€” Upload multiple documents, ask questions across them
  • πŸ” User authentication (NextAuth.js)
  • πŸ’Ύ Database integration (Prisma + PostgreSQL)
  • πŸ“Š Analysis history and saved documents
  • 🌐 Multi-language support
  • πŸ’³ SaaS monetization (Stripe integration)
  • πŸ“§ Email reports
  • πŸ”„ Batch processing

🤝 Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Commit your changes: git commit -m "feat: add your feature"
  4. Push to the branch: git push origin feature/your-feature
  5. Open a pull request

📄 License

This project is open source and available under the MIT License.


πŸ‘¨β€πŸ’» Built By

Elmar Angao Full Stack Developer | AI Integration Specialist

Specializing in Next.js, TypeScript, and AI-powered applications using Claude API and modern web technologies.

πŸ“§ Contact: elmarcera@gmail.com πŸ’Ό Portfolio: [Coming Soon] πŸ”— LinkedIn: linkedin.com/in/elmar-angao πŸ™ GitHub: github.com/eangao


πŸ™ Acknowledgments

  • Next.js β€” The React Framework for Production
  • Anthropic β€” Claude AI API
  • shadcn/ui β€” Beautifully designed components
  • Vercel β€” Deployment platform

⭐ If you find this project useful, please consider giving it a star on GitHub!
