Upload any PDF document and get instant AI-powered structured analysis: summaries, key entities, dates, obligations, risk flags, and action items.
Live Demo: https://ai-document-analyzer-ea.vercel.app
- Executive Summary: AI-generated 2-3 sentence overview with document type detection
- Entity Extraction: Automatically identifies key parties and their roles
- Date Detection: Extracts all dates with contextual descriptions
- Financial Analysis: Identifies payments, fees, penalties, taxes, and totals with color coding
- Obligation Mapping: Clear table of who owes what, by when
- Risk Flagging: Highlights concerning terms, unfavorable clauses, and missing information (severity: high/medium/low)
- Key Terms Glossary: Legal and technical jargon explained in plain English
- Action Items: Extracted next steps and required actions
- JSON Export: Download full analysis results for integration or record-keeping
- Real-time Processing: No database, no login required; instant results
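The feature list above implies a result object with one field per dashboard section. The sketch below shows one possible shape. Every interface, field, and function name here is an assumption made for illustration and may not match what `types/analysis.ts` actually defines.

```typescript
// Hypothetical result shape; all names are illustrative assumptions.
type Severity = "high" | "medium" | "low";
type FinancialItemType = "payment" | "fee" | "penalty" | "total" | "tax" | "discount";

interface RiskFlag {
  severity: Severity;
  description: string;
}

interface AnalysisResult {
  documentType: string;
  summary: string;
  entities: { name: string; role: string }[];
  dates: { date: string; description: string }[];
  financialItems: { type: FinancialItemType; description: string; amount: string }[];
  obligations: { party: string; obligation: string; deadline?: string }[];
  riskFlags: RiskFlag[];
  keyTerms: { term: string; definition: string }[];
  actionItems: string[];
}

// Order risk flags so the most severe come first, without mutating the input.
function sortRiskFlags(flags: RiskFlag[]): RiskFlag[] {
  const rank: Record<Severity, number> = { high: 0, medium: 1, low: 2 };
  return [...flags].sort((a, b) => rank[a.severity] - rank[b.severity]);
}

// Serialize a result as pretty-printed JSON, as a JSON export might.
function toExportJson(result: AnalysisResult): string {
  return JSON.stringify(result, null, 2);
}
```

Keeping severity as a string union rather than a free-form string lets the compiler catch unhandled levels wherever the flags are rendered.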
| Layer | Technology |
|---|---|
| Framework | Next.js 16 (App Router + Turbopack) |
| Language | TypeScript (strict mode) |
| AI Model | Claude Sonnet 4.5 (Anthropic API) |
| UI Components | shadcn/ui + Tailwind CSS |
| PDF Processing | unpdf |
| Icons | lucide-react |
| Deployment | Vercel |
| Testing | Vitest + React Testing Library (667 tests) |
- Node.js 20.9.0+ (Next.js 16 requirement)
- Anthropic API key from https://console.anthropic.com/settings/keys
# Clone the repository
git clone https://github.com/eangao/ai-document-analyzer.git
cd ai-document-analyzer
# Install dependencies
npm install
# Create environment file
cp .env.example .env.local
Create a `.env.local` file in the project root:
ANTHROPIC_API_KEY=your_anthropic_api_key_here
DEMO_MODE=true # Enable cost control (5K char limit vs 80K)
Environment Variables Explained:
- `ANTHROPIC_API_KEY`: Required. Your Anthropic API key from console.anthropic.com
- `DEMO_MODE`: Optional. Set to `true` to enable cost protection (reduces API token usage by ~80%)
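As a rough illustration of how `DEMO_MODE` could translate into a character limit, here is a minimal sketch. The constant and function names are hypothetical, and the actual logic in `lib/demo-mode.ts` may differ. Only the 5,000 and 80,000 character limits come from this README.

```typescript
// Hypothetical helpers; names are assumptions, limits are taken from the README.
const DEMO_CHAR_LIMIT = 5_000; // used when DEMO_MODE=true
const FULL_CHAR_LIMIT = 80_000; // used otherwise

type EnvLike = Record<string, string | undefined>;

// Resolve the character limit from an environment-like record.
// Only the exact string "true" enables demo mode in this sketch.
function getCharLimit(env: EnvLike): number {
  return env.DEMO_MODE === "true" ? DEMO_CHAR_LIMIT : FULL_CHAR_LIMIT;
}

// Truncate extracted PDF text before it is sent for analysis.
// Note: slice() counts UTF-16 code units, not user-visible characters.
function truncateForAnalysis(text: string, env: EnvLike): string {
  return text.slice(0, getCharLimit(env));
}
```

In a route handler this could be called as `truncateForAnalysis(pdfText, process.env)`. Passing the environment in as an argument keeps the helper easy to unit test.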
npm run dev
Open http://localhost:3000 to see the app.
npm run build
npm start
# Run all tests
npm test
# Run tests in watch mode
npm run test:watch
# Run tests with coverage
npm run test:coverage
The repository includes 7 sample PDFs in `docs/sample documents/` to test the analyzer:
| Document Type | Filename | What It Tests |
|---|---|---|
| Contract | sample-contract.pdf | Multi-party agreements, obligations, dates, risk flags |
| Invoice | sample-invoice.pdf | Financial items, payment terms, line items, totals |
| Report | sample-report.pdf | Executive summaries, data analysis, findings |
| Legal (NDA) | sample-nda-legal.pdf | Legal terminology, confidentiality clauses, penalties |
| Financial Statement | sample-financial-statement.pdf | Complex financial data, accounting terms, balance sheets |
| Business Letter | sample-business-letter.pdf | Simple document structure, minimal entities |
| Employee Handbook | sample-employee-handbook.pdf | Policies, procedures, obligations, multi-section documents |
How to Use:
- Visit the live demo
- Upload any sample document from the `docs/sample documents/` folder in this repository
- Observe how the AI extracts entities, dates, obligations, risk flags, and financial items
- Compare results across different document types
ai-document-analyzer/
├── app/
│   ├── api/
│   │   ├── extract/route.ts          # PDF text extraction endpoint
│   │   └── analyze/route.ts          # Claude API analysis endpoint
│   ├── layout.tsx                    # Root layout with metadata
│   ├── page.tsx                      # Main application page
│   └── globals.css                   # Global styles
├── components/
│   ├── FileUpload.tsx                # Drag & drop PDF upload
│   ├── AnalysisLoader.tsx            # Step-by-step loading animation
│   ├── ExportButton.tsx              # JSON export functionality
│   ├── RateLimitAlert.tsx            # Countdown timer for rate-limited users
│   ├── ui/                           # shadcn/ui components (managed by CLI)
│   └── dashboard/
│       ├── AnalysisDashboard.tsx     # Main dashboard layout
│       ├── SummaryCard.tsx           # Document summary + type badge
│       ├── RiskFlags.tsx             # Risk severity alerts
│       ├── EntitiesAndDates.tsx      # Key parties and dates grid
│       ├── FinancialItems.tsx        # Financial breakdown
│       ├── ObligationsTable.tsx      # Obligations table
│       ├── KeyTerms.tsx              # Terms glossary
│       └── ActionItems.tsx           # Action items checklist
├── types/
│   └── analysis.ts                   # TypeScript interfaces
├── lib/
│   ├── rate-limit.ts                 # Dual-layer rate limiter (3 req/hour per IP + 50 req/day global)
│   ├── error-messages.ts             # User-friendly error message generation
│   ├── error-parser.ts               # Frontend error type discrimination
│   └── demo-mode.ts                  # Cost control configuration
├── docs/
│   └── sample documents/             # 7 sample PDFs for testing (contracts, invoices, reports, etc.)
└── next.config.ts                    # Next.js 16 TypeScript config
- Contracts
- Invoices
- Reports
- Legal documents
- Financial statements
- Letters
- General business documents
Risk Severity Color Coding:
- 🔴 High: Critical issues requiring immediate attention
- 🟡 Medium: Notable concerns to review
- 🔵 Low: Minor observations
Financial Item Types:
- Payment: Regular payments (green)
- Fee: Service fees (blue)
- Penalty: Late fees, penalties (red)
- Total: Summary amounts (bold)
- Tax: Tax amounts (yellow)
- Discount: Discounts applied (purple)
The project includes comprehensive test coverage (667 tests, 93.6% coverage):
- Component Tests: All React components including RateLimitAlert countdown timer
- API Tests: Dual-layer rate limiting, error message generation, DEMO_MODE truncation
- Utility Tests: Error parser (39 tests), demo mode (21 tests), rate limiter (39 tests)
- Integration Tests: Full user workflows, error handling, export functionality
- Accessibility Tests: WCAG 2.1 Level A/AA compliance across all components
Test Distribution:
- Rate limiting: 39 tests (per-IP hourly + global daily limits)
- Error messages: 53 tests (100% coverage on user-facing messages)
- Error parser: 39 tests (type-safe error discrimination)
- RateLimitAlert: 30 tests (countdown timer, styling, accessibility)
- Demo mode: 21 tests (cost control configuration)
Coverage: 93.6% statements | 87% branches | 92.7% functions
This application implements three layers of cost protection to prevent unexpected API charges:
- Per-IP Hourly Limit: 3 requests per hour per IP address
  - Prevents individual users from exhausting API quota
  - Resets on a rolling 1-hour window
  - Returns friendly error with countdown timer
- Global Daily Cap: 50 requests per day across all users
  - Hard limit to prevent runaway costs
  - Resets daily at UTC midnight
  - Displays "high traffic" message when reached
- Demo Mode Toggle: Configurable text truncation
  - `DEMO_MODE=true`: 5,000 characters (~80% cost reduction)
  - `DEMO_MODE=false`: 80,000 characters (full analysis)
  - Applied before Claude API call to save tokens
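The two rate-limiting layers described above can be sketched as a small in-memory class. This is an illustration under stated assumptions, not the code in `lib/rate-limit.ts`: the class name, the result type, and the in-memory storage are all hypothetical. On a serverless platform, in-memory state is not shared between instances, so a real deployment may need a shared store.

```typescript
// Hypothetical in-memory limiter; limits come from the README, names are assumptions.
const PER_IP_LIMIT = 3;
const WINDOW_MS = 60 * 60 * 1000; // rolling 1-hour window
const GLOBAL_DAILY_LIMIT = 50;
const DAY_MS = 24 * 60 * 60 * 1000;

type LimitResult =
  | { allowed: true }
  | { allowed: false; reason: "per-ip" | "global"; retryAfterMs: number };

class DualLayerRateLimiter {
  private hitsByIp = new Map<string, number[]>();
  private globalDay = "";
  private globalCount = 0;

  check(ip: string, now: number = Date.now()): LimitResult {
    // Global layer: a counter keyed by UTC calendar day, reset when the day changes.
    const day = new Date(now).toISOString().slice(0, 10);
    if (day !== this.globalDay) {
      this.globalDay = day;
      this.globalCount = 0;
    }
    if (this.globalCount >= GLOBAL_DAILY_LIMIT) {
      const nextMidnight = Date.parse(`${day}T00:00:00.000Z`) + DAY_MS;
      return { allowed: false, reason: "global", retryAfterMs: nextMidnight - now };
    }

    // Per-IP layer: keep only the timestamps still inside the rolling window.
    const recent = (this.hitsByIp.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
    if (recent.length >= PER_IP_LIMIT) {
      this.hitsByIp.set(ip, recent);
      const oldest = recent[0] ?? now;
      // The next slot opens when the oldest request leaves the window.
      return { allowed: false, reason: "per-ip", retryAfterMs: oldest + WINDOW_MS - now };
    }

    // Request accepted: record it in both layers.
    recent.push(now);
    this.hitsByIp.set(ip, recent);
    this.globalCount += 1;
    return { allowed: true };
  }
}
```

Returning `retryAfterMs` with each rejection gives the UI what it needs to drive a countdown, and the `reason` field lets it pick between per-IP and global messaging.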
When rate limits are reached, users see:
- Clear error messages explaining what happened and why
- Countdown timers showing exact retry time (MM:SS format)
- Portfolio context acknowledging this is a demo limitation
- Different styling for per-IP (amber) vs global (blue) limits
Example Messages:
- Per-IP: "You've reached your personal limit of 3 documents per hour. Retry available in 47 minutes."
- Global: "This demo is experiencing high traffic. Try again tomorrow morning (UTC)."
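The countdown and messages above could be produced by helpers along these lines. This is a minimal sketch: the function names are hypothetical and the message wording is copied from the examples in this README rather than from `lib/error-messages.ts`.

```typescript
// Format the remaining time as MM:SS. Seconds are rounded up so the
// timer does not reach 00:00 before the limit has actually expired.
function formatCountdown(msRemaining: number): string {
  const totalSeconds = Math.max(0, Math.ceil(msRemaining / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${String(minutes).padStart(2, "0")}:${String(seconds).padStart(2, "0")}`;
}

// Build the user-facing message for each limit type (hypothetical helper).
function rateLimitMessage(kind: "per-ip" | "global", msRemaining: number): string {
  if (kind === "global") {
    return "This demo is experiencing high traffic. Try again tomorrow morning (UTC).";
  }
  const minutes = Math.ceil(msRemaining / 60_000);
  const unit = minutes === 1 ? "minute" : "minutes";
  return `You've reached your personal limit of 3 documents per hour. Retry available in ${minutes} ${unit}.`;
}
```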
- File Validation: PDF-only, 10MB maximum file size
- Input Sanitization: All user inputs validated and sanitized
- No Data Persistence: No database, no stored files; privacy by design
- Type-Safe Errors: Discriminated union types for frontend error handling
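To illustrate how file validation and discriminated-union errors can fit together, here is a hedged sketch. The type and function names are hypothetical, and treating "10MB" as 10 × 1024 × 1024 bytes is an assumption. The MIME type and file name are supplied by the client, so a stricter check might also inspect the file's leading bytes.

```typescript
// Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 10 * 1024 * 1024;

// Each error variant carries only the data relevant to it.
type UploadError =
  | { type: "invalid-type"; received: string }
  | { type: "too-large"; sizeBytes: number; maxBytes: number };

type ValidationResult = { ok: true } | { ok: false; error: UploadError };

// Accepts any object with the same fields as a browser File.
function validateUpload(file: { name: string; type: string; size: number }): ValidationResult {
  const isPdf = file.type === "application/pdf" && file.name.toLowerCase().endsWith(".pdf");
  if (!isPdf) {
    return { ok: false, error: { type: "invalid-type", received: file.type } };
  }
  if (file.size > MAX_FILE_BYTES) {
    return { ok: false, error: { type: "too-large", sizeBytes: file.size, maxBytes: MAX_FILE_BYTES } };
  }
  return { ok: true };
}

// Switching on the `type` discriminant narrows the union in each branch.
function describeUploadError(error: UploadError): string {
  switch (error.type) {
    case "invalid-type":
      return `Only PDF files are supported (received "${error.received || "unknown"}").`;
    case "too-large": {
      const toMb = (bytes: number) => bytes / 1024 / 1024;
      return `File is ${toMb(error.sizeBytes).toFixed(1)}MB; the limit is ${toMb(error.maxBytes)}MB.`;
    }
  }
}
```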
With current limits and DEMO_MODE=true:
- Per-IP limit: 3 req/hour = 96% reduction vs unlimited
- Global cap: 50 req/day maximum
- Demo mode: 5K chars = ~80% token reduction per request
- Total: ~$0.45/month maximum (vs potentially unlimited)
- Click the "Deploy" button or go to vercel.com
- Import your GitHub repository
- Add environment variables in Vercel dashboard:
  - `ANTHROPIC_API_KEY` = your Anthropic API key
  - `DEMO_MODE=true` (optional, enables cost control)
- Click Deploy
Your app will be live at your-project.vercel.app
MVP (Current):
- PDF upload and text extraction
- Claude AI structured analysis
- Professional dashboard UI
- JSON export
- Comprehensive testing
Future Enhancements:
- RAG (Retrieval-Augmented Generation): Upload multiple documents, ask questions across them
- User authentication (NextAuth.js)
- Database integration (Prisma + PostgreSQL)
- Analysis history and saved documents
- Multi-language support
- SaaS monetization (Stripe integration)
- Email reports
- Batch processing
Contributions are welcome! Feel free to open issues or submit pull requests.
- Fork the repository
- Create a feature branch: `git checkout -b feature/your-feature`
- Commit your changes: `git commit -m "feat: add your feature"`
- Push to the branch: `git push origin feature/your-feature`
- Open a pull request
This project is open source and available under the MIT License.
Elmar Angao, Full Stack Developer | AI Integration Specialist
Specializing in Next.js, TypeScript, and AI-powered applications using Claude API and modern web technologies.
- Contact: elmarcera@gmail.com
- Portfolio: [Coming Soon]
- LinkedIn: linkedin.com/in/elmar-angao
- GitHub: github.com/eangao
- Next.js: The React Framework for Production
- Anthropic: Claude AI API
- shadcn/ui: Beautifully designed components
- Vercel: Deployment platform
If you find this project useful, please consider giving it a star on GitHub!