SkimPDF — AI-Powered PDF Summarizer 📝

A personal tool for summarizing large English PDFs with AI. SkimPDF splits a document into token-sized chunks and summarizes each chunk with a transformer model (facebook/bart-large-cnn), which makes it handy for skimming research papers, reports, or any other lengthy document.


✨ Features

  • ✅ Upload any English PDF and get a summarized output
  • ✅ Smart token-aware chunking to handle large files
  • ✅ Abstractive summarization powered by facebook/bart-large-cnn
  • ✅ Markdown-based parsing with pymupdf4llm for structured extraction
  • ✅ React-based sleek frontend (in separate repo)
  • ✅ CORS-enabled API for easy frontend integration

🛠️ Tech Stack

| Layer | Tech |
|---|---|
| Backend | FastAPI, HuggingFace Transformers, PyMuPDF, pymupdf4llm |
| Frontend | React (modern, clean UI) |
| Model | facebook/bart-large-cnn |

🚀 Getting Started

Clone the repo:

git clone https://github.com/Anand-Raut/skimpdf.git
cd skimpdf

Install backend dependencies:

pip install -r requirements.txt

Run the FastAPI server:

uvicorn main:app --reload

📄 API Reference

GET / — Health Check

Returns:

{ "isrunning": true }
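
To check from a script that the server is up, a small client along these lines should work. This is a minimal sketch using only the standard library; the base URL assumes uvicorn's default host and port, and the helper names `parse_health` and `is_running` are made up for this example, not part of SkimPDF.

```python
import json
import urllib.error
import urllib.request


def parse_health(payload: bytes) -> bool:
    """Return True only if the body is a JSON object with "isrunning" set to true."""
    try:
        return json.loads(payload).get("isrunning") is True
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return False


def is_running(base_url: str = "http://127.0.0.1:8000") -> bool:
    """GET / and report whether the server answered with the health payload.

    The default base_url is an assumption (uvicorn's default host and port).
    """
    try:
        with urllib.request.urlopen(f"{base_url}/", timeout=5) as response:
            return parse_health(response.read())
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling `is_running()` before an upload gives a quick way to fail early when the backend has not been started.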

POST /upload — Upload PDF & Get Summary

Body: multipart/form-data with field pdfFile

Response:

{
  "filename": "your_uploaded_file.pdf",
  "status": "received",
  "summary": "Summarized text here..."
}
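
For testing the endpoint without the React frontend, the request can be built by hand. The sketch below uses only the standard library and sends the file in the `pdfFile` field described above; the base URL (uvicorn's default host and port) and the helper names `build_multipart` and `upload_pdf` are assumptions for illustration.

```python
import json
import urllib.request
import uuid
from pathlib import Path

BASE_URL = "http://127.0.0.1:8000"  # assumption: uvicorn's default host and port


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_pdf(path: str, base_url: str = BASE_URL) -> dict:
    """POST a PDF to /upload in the pdfFile field and return the parsed JSON."""
    pdf = Path(path)
    body, content_type = build_multipart("pdfFile", pdf.name, pdf.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/upload",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `upload_pdf("paper.pdf")["summary"]` would return the summary text (`paper.pdf` is a placeholder file name). Large documents may take a while, since the request stays open until every chunk has been summarized.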

📝 How It Works

  1. User uploads a PDF from the React frontend
  2. Backend converts PDF to Markdown via pymupdf4llm
  3. Text is split into chunks by token count, measured with the model's tokenizer, so that each chunk fits within the model's input limit
  4. Each chunk is summarized using facebook/bart-large-cnn
  5. Summaries are merged and sent back as the final response
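
Steps 3 to 5 can be sketched roughly as follows. This is not SkimPDF's actual code: the whitespace word counter stands in for the real tokenizer, the summarizer is passed in as a plain function so the example runs without downloading a model, and the names `count_tokens`, `chunk_by_tokens`, and `summarize_document` are hypothetical. The default limit of 1024 is an assumption based on the input length of facebook/bart-large-cnn.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Stand-in token counter (whitespace words); a real tokenizer would replace this."""
    return len(text.split())


def chunk_by_tokens(text: str, max_tokens: int,
                    counter: Callable[[str], int] = count_tokens) -> List[str]:
    """Greedily pack paragraphs into chunks of at most max_tokens tokens each."""
    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        size = counter(paragraph)
        if size > max_tokens:
            # Oversized paragraph: flush the pending chunk, then split by words.
            # The word-based split is only exact for the whitespace counter.
            if current:
                chunks.append("\n\n".join(current))
                current, used = [], 0
            words = paragraph.split()
            for i in range(0, len(words), max_tokens):
                chunks.append(" ".join(words[i:i + max_tokens]))
            continue
        if current and used + size > max_tokens:
            # Adding this paragraph would overflow, so start a new chunk.
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(paragraph)
        used += size
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def summarize_document(text: str, summarize_chunk: Callable[[str], str],
                       max_tokens: int = 1024) -> str:
    """Summarize each chunk independently and join the partial summaries."""
    return " ".join(summarize_chunk(c) for c in chunk_by_tokens(text, max_tokens))
```

In a real run, `summarize_chunk` would wrap the model call and `counter` would count tokens with the model's own tokenizer. Packing whole paragraphs keeps related sentences in the same chunk, which tends to help an abstractive model more than cutting at arbitrary token offsets.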

💻 Deployment

SkimPDF is designed for local use only. Run it on your machine, connect it with the frontend, and you’re good to go.


🚧 TODO / Roadmap

  • Async background summarization
  • Smarter semantic chunking
  • PDF upload frontend (styled, responsive)
  • Optional Docker support

🧑‍💻 Author

Anand Raut (GitHub profile: https://github.com/Anand-Raut)


📜 License

MIT License — Open source, use it, tweak it, improve it. If it breaks, you get to keep both pieces.
