Transform your PDFs into conversational knowledge with AI-powered chat
ChatPDF is a modern web application that allows users to upload PDF documents and engage in intelligent conversations about their content using advanced AI technologies. Built with .NET 9, Blazor Server, and cutting-edge AI tools, it provides a seamless experience for document analysis and information retrieval.
New to AI and RAG? Check out our comprehensive tutorial: "Building AI-Powered Document Chat with RAG in .NET" - a complete guide designed specifically for .NET developers who want to learn about AI integration, local LLMs, and Retrieval-Augmented Generation.
- Document Library: Modern table view with search, sort, and filter capabilities
- File Operations: Download, view, and delete documents with confirmation dialogs
- Real-time Updates: Document count badges and status notifications
- Intelligent Conversations: Ask questions about your PDF content in natural language
- Contextual Responses: AI provides accurate answers based on document content
- Source Citations: Each response includes citations with exact page references
- Function Calling: AI automatically searches relevant document sections
- Streaming Responses: Real-time response generation with live updates
- Chat History: Persistent conversation history per user
- Glassmorphism Design: Beautiful modern interface with gradient backgrounds
- Responsive Layout: Optimized for desktop and mobile devices
- Theme Ready: Professional color scheme with purple/blue gradients
- Interactive Elements: Smooth animations and hover effects
- File Upload: Drag-and-drop PDF upload with progress indicators
- Document Viewer: Integrated PDF viewer for document references
- Health Monitoring: Built-in diagnostics page to test all system components
- Component Testing: Individual tests for documents, embeddings, search, and chat
- Error Identification: Detailed error reporting for troubleshooting
- OpenTelemetry: Distributed tracing and observability support
- Aspire Dashboard: Integrated monitoring and metrics visualization
ChatPDF follows a modern microservices architecture leveraging the latest AI and database technologies:
```mermaid
graph TB
A[Blazor Server UI] --> B[Document Service]
A --> C[Chat Service]
A --> D[Semantic Search]
B --> E[File System]
C --> F[Ollama LLM]
D --> G[Qdrant Vector DB]
H[Data Ingestor] --> I[PDF Processing]
I --> J[Text Chunking]
J --> K[Embedding Generation]
K --> G
F --> L[llama3.2 Model]
F --> M[nomic-embed-text Model]
```
- Blazor Server: Real-time interactive web UI with server-side rendering
- Modern CSS: Tailwind CSS with custom gradients and glassmorphism effects
- Component Architecture: Reusable Razor components for chat, documents, and navigation
- Microsoft.Extensions.AI: Unified AI framework for .NET applications
- Function Calling: Automatic tool invocation for document search (see the composition sketch after this list)
- Chat Client: Seamless integration with language models
- Streaming Support: Real-time response generation
- OpenTelemetry Integration: Built-in observability and telemetry
- PDF Ingestion: Automatic document processing and text extraction
- Text Chunking: Intelligent document segmentation for optimal retrieval
- Vector Generation: Semantic embeddings for similarity search
- File System: PDF document storage in organized directory structure
- Vector Database: High-performance similarity search with Qdrant
- Chat History: JSON-based conversation persistence
- Authentication: OpenID Connect integration with identity providers
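A rough sketch of how these AI pieces can be composed with Microsoft.Extensions.AI (assumed wiring, not the project's actual `Program.cs`; the Ollama-backed inner client would come from the app's Ollama integration, and builder method names can vary between package versions):

```csharp
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical composition sketch: wrap an Ollama-backed IChatClient so the
// model can invoke the document-search tool automatically and so chat calls
// emit OpenTelemetry traces for the Aspire dashboard.
static void AddChatPipeline(IServiceCollection services, IChatClient ollamaBackedClient)
{
    IChatClient chatClient = new ChatClientBuilder(ollamaBackedClient)
        .UseFunctionInvocation()   // automatic tool calling
        .UseOpenTelemetry()        // distributed tracing for chat calls
        .Build();

    services.AddSingleton(chatClient);
}
```

Decorating the inner client this way is what enables the automatic document searches and the telemetry mentioned above.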
Microsoft.Extensions.AI is the backbone of ChatPDF's AI capabilities, providing:
- Unified API: Consistent interface across different AI providers
- Function Calling: Automatic tool invocation based on user queries
- Streaming Support: Real-time response generation
- Observability: Built-in telemetry and monitoring
```csharp
// Example: AI automatically calls search functions
chatOptions.Tools = [AIFunctionFactory.Create(SearchAsync)];
var response = await ChatClient.GetResponseAsync(messages, chatOptions);
```

Ollama provides local AI model hosting for privacy and performance:
- Purpose: `llama3.2` is the primary chat and conversation model
- Capabilities: Natural language understanding, context awareness, function calling
- Performance: Optimized for conversational AI tasks
- Purpose: `nomic-embed-text` converts text into high-dimensional vectors
- Capabilities: Semantic similarity, multilingual support
- Vector Dimensions: 768-dimensional embeddings for precise matching
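In code, producing one of these vectors via the Microsoft.Extensions.AI abstraction might look like the sketch below; the Ollama-backed generator instance for `nomic-embed-text` is assumed to be supplied by the host:

```csharp
using Microsoft.Extensions.AI;

// Illustrative sketch: embed a single piece of text.
// The generator is assumed to be configured for nomic-embed-text,
// which yields 768-dimensional vectors.
static async Task<float[]> EmbedAsync(
    IEmbeddingGenerator<string, Embedding<float>> generator, string text)
{
    var embeddings = await generator.GenerateAsync(new[] { text });
    return embeddings[0].Vector.ToArray();   // float[768]
}
```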
```bash
# Download required models
ollama pull llama3.2
ollama pull nomic-embed-text
```

Qdrant provides high-performance vector storage and similarity search:
- Semantic Search: Find relevant content based on meaning, not keywords
- Scalability: Efficient handling of large document collections
- Real-time Updates: Immediate availability of newly uploaded documents
- Filtering: Search within specific documents or date ranges
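A hedged sketch of querying those vectors from .NET, assuming the official `Qdrant.Client` gRPC package; the collection name and payload keys are illustrative and mirror the point structure shown below, not the project's actual code:

```csharp
using Qdrant.Client;
using Qdrant.Client.Grpc;

// Hypothetical nearest-neighbour search over stored chunk vectors.
// "chatpdf-chunks" and the payload keys are assumptions.
static async Task<IReadOnlyList<string>> SearchChunksAsync(
    QdrantClient qdrant, float[] queryVector, ulong limit = 5)
{
    IReadOnlyList<ScoredPoint> hits =
        await qdrant.SearchAsync("chatpdf-chunks", queryVector, limit: limit);

    return hits
        .Select(h => $"{h.Payload["document_id"].StringValue} " +
                     $"(page {h.Payload["page_number"].IntegerValue}): " +
                     h.Payload["text"].StringValue)
        .ToList();
}
```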
```json
{
  "id": "uuid",
  "vector": [0.1, -0.5, 0.9, ...],  // 768 dimensions
  "payload": {
    "document_id": "research-paper.pdf",
    "page_number": 5,
    "text": "Machine learning algorithms...",
    "chunk_index": 12
  }
}
```

PDF Upload → Text Extraction → Chunking → Embedding Generation → Vector Storage
- Upload: User uploads PDF through modern drag-drop interface
- Validation: File type and size verification (max 10MB)
- Processing: PDF text extraction and intelligent chunking
- Embedding: Convert text chunks to vectors using `nomic-embed-text`
- Storage: Save vectors to Qdrant with metadata (page numbers, document info)
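As a naive illustration of the chunking inside the Processing step (window size and overlap are placeholder values; the README does not specify the project's actual chunking strategy):

```csharp
// Illustrative only: split extracted page text into overlapping character
// windows so each chunk stays small enough to embed and retrieve precisely.
// Assumes overlap < maxChars.
static IEnumerable<string> ChunkText(string pageText, int maxChars = 1000, int overlap = 200)
{
    for (int start = 0; start < pageText.Length; start += maxChars - overlap)
    {
        int length = Math.Min(maxChars, pageText.Length - start);
        yield return pageText.Substring(start, length);

        if (start + length >= pageText.Length)
            yield break;   // reached the final window
    }
}
```

Each chunk is then embedded and written to Qdrant together with its page number and source document, which is what makes page-level citations possible later.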
User Question → Semantic Search → Context Retrieval → LLM Processing → Response + Citations
- Question: User asks about document content
- Search: AI automatically searches relevant document sections
- Context: Retrieve top matching chunks from vector database
- Generation: LLM processes context and generates response
- Citations: Include exact page references and quotes
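A hedged sketch of steps 4–5, assuming the Microsoft.Extensions.AI streaming API; the system prompt text here is illustrative, not the project's actual prompt:

```csharp
using Microsoft.Extensions.AI;

// Illustrative only: stream a grounded answer and ask the model to cite the
// <result> elements returned by the search tool.
static async Task StreamAnswerAsync(IChatClient chatClient, ChatOptions chatOptions, string question)
{
    List<ChatMessage> messages =
    [
        new(ChatRole.System,
            "Answer only from the search results. Cite the filename and " +
            "page_number of every <result> you rely on."),
        new(ChatRole.User, question),
    ];

    await foreach (var update in chatClient.GetStreamingResponseAsync(messages, chatOptions))
    {
        Console.Write(update.Text);   // render tokens as they arrive
    }
}
```

The search tool that function invocation exposes to the model in step 2 is the `SearchAsync` method shown next: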
[Description("Searches for information using a phrase or keyword")]
private async Task<IEnumerable<string>> SearchAsync(string searchPhrase, string? filenameFilter = null)
{
var results = await Search.SearchAsync(searchPhrase, filenameFilter, maxResults: 5);
return results.Select(result =>
$"<result filename=\"{result.DocumentId}\" page_number=\"{result.PageNumber}\">{result.Text}</result>");
}- .NET 9 SDK: Download here
- Docker Desktop: Download here
- Ollama: Download here
```bash
# Install required Ollama models
ollama pull llama3.2          # Chat model
ollama pull nomic-embed-text  # Embedding model
```

```bash
# Start Qdrant vector database
docker run -p 6333:6333 qdrant/qdrant

# Start Ollama (if not running natively)
docker run -p 11434:11434 ollama/ollama
```

```bash
git clone <repository-url>
cd ChatPDF
```

```bash
# Start Ollama
ollama serve

# Download models
ollama pull llama3.2
ollama pull nomic-embed-text

# Start Qdrant
docker run -p 6333:6333 qdrant/qdrant
```

- Open `ChatPDF-Ollama.sln`
- Set `ChatPDF.AppHost` as startup project
- Press `F5` or click "Start"

```bash
cd ChatPDF.AppHost
dotnet run
```

- Install C# Dev Kit extension
- Open project folder
- Run from Debug view
- Main App: https://localhost:7002
- Aspire Dashboard: https://localhost:15888
```json
{
  "ConnectionStrings": {
    "vectordb": "Endpoint=http://localhost:6333",
    "chat": "Endpoint=http://localhost:11434",
    "embeddings": "Endpoint=http://localhost:11434"
  },
  "Ollama": {
    "Chat": {
      "ModelName": "llama3.2",
      "EnableFunctionInvocation": true
    },
    "Embeddings": {
      "ModelName": "nomic-embed-text"
    }
  },
  "Application": {
    "DataIngestion": {
      "PdfDirectory": "Data",
      "IngestOnStartup": true,
      "MaxFileSizeMB": 10
    }
  }
}
```

Access `/diagnostics` to test system components:
- 📄 Test Documents: Verify PDF detection and file access
- 🧠 Test Embeddings: Check Ollama embedding service
- 🔍 Test Search: Validate Qdrant vector database
- 💬 Test Chat: Confirm language model connectivity
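At their core, these checks reduce to simple connectivity probes; a minimal sketch using the services' documented REST endpoints (ports assume the local defaults above; the diagnostics page's actual implementation may differ):

```csharp
// Illustrative probes: hit Ollama's and Qdrant's well-known REST endpoints
// and report whether each service responds.
static async Task<(bool OllamaUp, bool QdrantUp)> ProbeServicesAsync()
{
    using var http = new HttpClient { Timeout = TimeSpan.FromSeconds(3) };

    bool ollamaUp = await IsOkAsync(http, "http://localhost:11434/api/tags");   // lists pulled models
    bool qdrantUp = await IsOkAsync(http, "http://localhost:6333/collections"); // lists collections

    return (ollamaUp, qdrantUp);

    static async Task<bool> IsOkAsync(HttpClient http, string url)
    {
        try { return (await http.GetAsync(url)).IsSuccessStatusCode; }
        catch (HttpRequestException) { return false; }
        catch (TaskCanceledException) { return false; }   // timeout
    }
}
```

If a probe fails, the troubleshooting table below lists the usual fixes.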
| Issue | Symptoms | Solution |
|---|---|---|
| Ollama Not Running | Embedding/Chat tests fail | `ollama serve` |
| Missing Models | Model-specific errors | `ollama pull llama3.2` |
| Qdrant Down | Search test fails | `docker run -p 6333:6333 qdrant/qdrant` |
| No Documents | Empty document list | Upload PDFs to `wwwroot/Data/` |
```
ChatPDF/
├── ChatPDF.AppHost/              # .NET Aspire host project
├── ChatPDF.ServiceDefaults/      # Shared service configurations
├── ChatPDF.Web/                  # Main Blazor Server application
│   ├── Components/
│   │   ├── Pages/
│   │   │   ├── Chat/             # Chat interface components
│   │   │   ├── Documents.razor   # Document management
│   │   │   └── Diagnostics.razor # System testing
│   │   └── Layout/               # Navigation and layout
│   ├── Services/
│   │   ├── Ingestion/            # PDF processing pipeline
│   │   ├── DocumentService.cs    # Document operations
│   │   └── SemanticSearch.cs     # Vector search
│   └── wwwroot/
│       └── Data/                 # PDF storage directory
└── README.md
```
- Local Processing: All AI processing runs locally (no data sent to cloud)
- File Validation: Strict PDF file type and size checking
- Input Sanitization: Protection against prompt injection attacks
- Access Control: File system permissions and validation
- Authentication: OpenID Connect with configurable identity providers
- HTTPS Enforcement: SSL/TLS encryption for all communications
- Content Security: Trusted content ingestion with validation
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Microsoft .NET Team: For the excellent AI integration framework
- Ollama: For providing local AI model hosting
- Qdrant: For high-performance vector database technology
- Blazor Community: For the modern web framework
For questions, issues, or contributions:
- Issues: Open a GitHub issue
- Discussions: Use GitHub Discussions
- Documentation: Check `/diagnostics` for system health
- Tutorial: See the complete tutorial for detailed guidance
Built with ❤️ using .NET 9, Blazor Server, and cutting-edge AI technologies