# RAG System

A Retrieval-Augmented Generation (RAG) system built with LangChain and Anthropic's Claude.
## Project Structure

```
rag-project/
├── src/
│   └── rag_system/
│       ├── __init__.py
│       ├── llm.py
│       └── document_loader.py
├── tests/
│   ├── __init__.py
│   └── test_llm.py
├── setup.py
├── requirements.txt
├── .env.example
├── .cursorignore
└── README.md
```
## Prerequisites

- Python 3.9+
- Anthropic API key (set as `ANTHROPIC_API_KEY`)
## Installation

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd rag-project
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   ```

3. Activate the virtual environment:

   - On macOS/Linux:

     ```bash
     source venv/bin/activate
     ```

   - On Windows:

     ```bash
     venv\Scripts\activate
     ```

4. Install the package in development mode:

   ```bash
   pip install -e .
   ```

5. Copy `.env.example` to `.env` and add your Anthropic API key:

   ```bash
   cp .env.example .env
   # Edit .env and add your ANTHROPIC_API_KEY
   ```
## Usage

Ingest documents:

```bash
python -m rag_system.cli ingest documents/your_file.pdf
```

Query the system:

```bash
python -m rag_system.cli query "Your question here"
```

Interactive mode:

```bash
python -m rag_system.cli interactive
```

## Testing

Run all tests:

```bash
python -m pytest
```

Run a specific test file:

```bash
python -m pytest tests/test_llm.py
```

## Features

- Document ingestion from various formats (PDF, TXT, HTML)
- Web content ingestion
- Semantic search using embeddings
- RAG implementation with Anthropic's Claude
- Interactive query interface
- Vector database storage with ChromaDB
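The semantic-search step above can be illustrated independently of the libraries involved: each document chunk is embedded as a vector, and retrieval ranks chunks by cosine similarity to the query embedding. A minimal sketch with toy hand-made vectors (in the real system the embeddings would come from sentence-transformers and the ranking from ChromaDB):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-dimensional "embeddings" — hypothetical values for illustration only.
docs = {
    "claude overview": [0.9, 0.1, 0.0],
    "pasta recipe":    [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # embedding of the user's question

# Rank documents by similarity to the query (conceptually what the vector store does);
# the top chunk is what would be passed to Claude as context.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked[0])  # → claude overview
```

Real embedding vectors have hundreds of dimensions, but the ranking logic is the same.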
## Dependencies

- langchain
- langchain-community
- langchain-anthropic
- chromadb
- python-dotenv
- sentence-transformers
- beautifulsoup4
- requests
- pypdf
- pytest
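python-dotenv, listed above, is what makes the `.env` file work: `load_dotenv()` copies its entries into the process environment, and application code reads them back with `os.getenv`. A minimal sketch of that pattern (a placeholder value is set directly so the snippet runs without a real `.env` file):

```python
import os

# In the real application, python-dotenv's load_dotenv() populates os.environ
# from the .env file; a placeholder key is set here to keep the sketch self-contained.
os.environ.setdefault("ANTHROPIC_API_KEY", "sk-ant-placeholder")

api_key = os.getenv("ANTHROPIC_API_KEY")
if not api_key:
    raise RuntimeError("ANTHROPIC_API_KEY is not set; copy .env.example to .env")
print("API key loaded")
```

Failing fast with a clear error when the key is missing saves a confusing failure deep inside the first API call.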
## License

MIT License