Skip to content

cyronis/Document-Manager

Β 
Β 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

4 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ“„ DocumentManager

Python FastAPI Docker License

Transform your document chaos into an AI-powered knowledge powerhouse

Most document management systems feel like they're stuck in 2005. DocumentManager brings AI intelligence to understand your documents' actual content and meaning - not just their titles or tags.

Features β€’ Quick Start β€’ Demo β€’ Documentation β€’ API β€’ Contributing


🌟 Features

πŸ€– AI-Powered Intelligence

  • Semantic Search: Find documents by meaning, not just keywords. Search for "payment terms" and find invoicing documents, contracts with payment clauses, and financial agreements - even if they never use those exact words
  • Smart OCR: Extract text from scanned PDFs, photos of whiteboards, and documents in 50+ languages using Tesseract OCR
  • Auto-Tagging: AI automatically categorizes documents based on content - financial reports get tagged as "finance", contracts as "legal", technical specs as "engineering"
  • Natural Language Queries: Just ask questions like "Show me all contracts expiring this year" or "What were our Q4 marketing expenses?"
  • AI-Generated Summaries: Understand large documents at a glance with automatic summary generation

πŸ”’ Enterprise-Ready Security

  • Role-Based Access Control: Fine-grained permissions for users and groups
  • Complete Audit Trails: Track all document activities
  • Privacy First: Option to use Azure OpenAI to keep models in your own tenant
  • Self-Hosted: All data stays on your infrastructure - no vendor lock-in
  • Session Management: Secure session handling with automatic expiry

πŸš€ Modern Architecture

  • RESTful API: Complete OpenAPI 3.0 documented API built with FastAPI
  • Vector Database: ChromaDB for lightning-fast semantic search using embeddings
  • Flexible AI: Choose between OpenAI or Azure OpenAI (your choice)
  • Simple Frontend: Vanilla JavaScript keeping it simple and fast
  • Docker-Ready: Deploy in minutes with included setup script

πŸ“Έ Demo

Dashboard Overview

Dashboard Clean, intuitive dashboard showing document statistics and recent activities

AI-Powered Search

AI Search Find documents by meaning, not just keywords - ask questions in natural language

AI Chat

AI Chat Interactive AI chat for document analysis and knowledge extraction

Document Upload & Processing

Document Upload Drag-and-drop interface with automatic text extraction and AI tagging

Smart Tags & Organization

Tags Management AI auto-generates correspondents, document types, and tags - fully customizable with color coding

Document Viewer

Document Viewer Built-in document viewer with search highlighting and annotations

User Management

User Management Enterprise-grade user and permission management

Settings & Configuration

Settings Easy configuration of AI providers and system settings

πŸš€ Quick Start

Getting Started in 3 Minutes

The beauty of open source? You can have this running on your machine right now:

Prerequisites

  • Docker installed and running
  • 4GB+ RAM recommended
  • 10GB+ free disk space

🐳 Using Docker (Recommended)

# Clone the repository
git clone https://github.com/your-username/DocumentManager.git
cd DocumentManager

# Run the setup script
./setup.sh prod

# Or manually with Docker
docker build -t documentmanager .
docker run -d \
  --name documentmanager \
  -p 8000:8000 \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/storage:/app/storage \
  documentmanager

The application will be available at http://localhost:8000

πŸ› οΈ Using the Setup Script

The setup.sh script provides an easy way to manage your DocumentManager installation:

# Start development environment with hot reload
./setup.sh dev

# Start production environment
./setup.sh prod

# Build Docker image
./setup.sh build

# View logs
./setup.sh logs

# Check status
./setup.sh status

# Stop all containers
./setup.sh stop

πŸ’» Local Development

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run development server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

πŸ“‹ Initial Setup

  1. Create Admin Account

    • Navigate to http://localhost:8000
    • The first user registration automatically becomes admin
  2. Configure AI Provider

    • Go to Settings β†’ AI Configuration
    • Choose between OpenAI or Azure OpenAI
    • Enter your API credentials
    • Test the connection
  3. Start Using

    • Upload documents via drag-and-drop
    • Watch AI automatically extract text, generate summaries, and categorize
    • AI detects: Title, Summary, Correspondent, Document Type, Document Date, Tags, and Tax Relevance
    • Use semantic search to find information instantly with natural language

πŸ—οΈ Architecture

DocumentManager/
β”œβ”€β”€ app/                    # Backend FastAPI application
β”‚   β”œβ”€β”€ api/               # REST API endpoints
β”‚   β”œβ”€β”€ core/              # Core business logic
β”‚   β”œβ”€β”€ models/            # SQLAlchemy models
β”‚   └── services/          # AI, OCR, and storage services
β”œβ”€β”€ frontend/              # Vanilla JS frontend
β”œβ”€β”€ docker/                # Docker configuration
β”œβ”€β”€ tests/                 # Test suite
└── docs/                  # Documentation

Technology Stack

  • Backend: FastAPI, SQLAlchemy, Pydantic
  • AI/ML: OpenAI GPT-4, Azure OpenAI, ChromaDB
  • OCR: Tesseract (50+ languages)
  • Database: SQLite (default), PostgreSQL (production)
  • Frontend: Vanilla JavaScript, modern CSS
  • Deployment: Docker, Docker Compose

πŸ”§ Configuration

Environment Variables

Create a .env file in the root directory:

# Security - CHANGE IN PRODUCTION!
SECRET_KEY=your-secret-key-here

# Database
DATABASE_URL=sqlite:///./data/documents.db
# For PostgreSQL: postgresql://user:pass@localhost/dbname

# AI Provider
AI_PROVIDER=openai
OPENAI_API_KEY=sk-...
# Or for Azure:
# AI_PROVIDER=azure
# AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
# AZURE_OPENAI_KEY=your-key

# Application Settings
ENVIRONMENT=production
LOG_LEVEL=INFO
MAX_UPLOAD_SIZE=104857600  # 100MB
ALLOWED_EXTENSIONS=pdf,jpg,jpeg,png,txt,doc,docx

# Storage
STORAGE_TYPE=local
STORAGE_PATH=/app/data/storage

πŸ“š API Documentation

Interactive API Docs

Once running, access the interactive API documentation at:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

Quick API Examples

import requests

# Base URL
BASE_URL = "http://localhost:8000"

# 1. Authentication
response = requests.post(f"{BASE_URL}/api/auth/login", json={
    "username": "admin",
    "password": "your-password"
})
session = requests.Session()
session.cookies = response.cookies

# 2. Upload Document
with open("document.pdf", "rb") as f:
    response = session.post(
        f"{BASE_URL}/api/documents/upload",
        files={"file": f},
        data={"title": "Q4 Report", "tags": "finance,quarterly"}
    )
    document_id = response.json()["id"]

# 3. Semantic Search
response = session.get(f"{BASE_URL}/api/search/semantic", params={
    "query": "What were the Q4 revenue numbers?",
    "limit": 5
})
results = response.json()

# 4. Ask Questions
response = session.post(f"{BASE_URL}/api/ai/ask", json={
    "question": "Summarize the key findings from Q4 reports",
    "document_ids": [document_id]
})
answer = response.json()["answer"]

🌟 Why Open Source?

Your document management system shouldn't be a black box. With DocumentManager you can:

  • Audit the code - Know exactly what happens to your documents
  • Customize for your needs - Modify anything to fit your workflow
  • Self-host everything - Your documents, your rules
  • Contribute improvements - Join the community making document management better

No vendor lock-in. Complete transparency. Total control.

πŸš€ Roadmap

The foundation is solid, but we're just getting started:

  • Self-hosted AI models - Run everything locally
  • Mobile apps - For on-the-go access and document scanning
  • Workflow automation - Documents that route themselves
  • Advanced analytics - Insights from your document repository
  • Plugin system - Custom integrations for your needs

🀝 Contributing

We love contributions! Please see our Contributing Guide for details.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Development Setup

# Clone your fork
git clone https://github.com/your-username/DocumentManager.git
cd DocumentManager

# Create branch
git checkout -b feature/your-feature

# Install pre-commit hooks
pip install pre-commit
pre-commit install

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❀️ by Jannik Reinhard and Fabian Peschke

⭐ Star the repo if you find it useful β€” it really helps with motivation!

β˜• If you want to support the project, you can buy us a coffee

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 39.7%
  • JavaScript 29.3%
  • HTML 20.2%
  • CSS 9.8%
  • Other 1.0%