An intelligent research agent built with LangGraph that breaks down complex questions, gathers information from Wikipedia, and synthesizes comprehensive, structured answers.
This project implements an agentic workflow that demonstrates:
- Query Decomposition: Breaking complex questions into researchable sub-questions
- Tool Orchestration: Calling external APIs (Wikipedia) to gather information
- Error Recovery: Retrying with reformulated queries when searches fail
- Synthesis: Combining multiple sources into coherent markdown reports
- Conditional Routing: Making intelligent decisions based on intermediate results
```
START
  ↓
analyze_query (LLM decomposes question)
  ↓
[Should we proceed?]
  ↓                  ↓
fetch_info       handle_error
  ↓                  ↓
[Retry logic]       END
  ↓          ↓          ↓
synthesize  retry     error
  ↓          ↓          ↓
 END      analyze     END
```
The graph uses a GraphState TypedDict that flows through all nodes:
```python
class GraphState(TypedDict):
    user_query: str               # Original question
    sub_questions: list[str]      # Decomposed sub-questions
    tool_results: dict[str, str]  # Sub-question → API result
    final_answer: str | None      # Synthesized markdown report
    error: str | None             # Any errors encountered
    retry_count: int              # Number of retry attempts
    current_step: str             # Current node (for debugging)
```

The graph's nodes:

- analyze_query: Uses the LLM to break down the user's question into 2-5 sub-questions
- call_tool_node: Executes Wikipedia API calls for each sub-question
- synthesize_answer: Uses the LLM to combine all findings into a structured report
- handle_error: Creates user-friendly error messages when things fail
- increment_retry: Increments the retry counter and loops back to analyze
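Inside analyze_query, the LLM's raw reply has to be turned into a list of sub-questions. A hypothetical parsing helper (`parse_sub_questions` is illustrative, not the project's actual code), assuming the prompt asks the model for a JSON array of strings:

```python
import json

def parse_sub_questions(llm_output: str, max_questions: int = 5) -> list[str]:
    """Parse the LLM's JSON-array reply, capping at max_questions items."""
    try:
        parsed = json.loads(llm_output)
    except json.JSONDecodeError:
        return []  # caller treats an empty list as a decomposition failure
    if not isinstance(parsed, list):
        return []
    return [q for q in parsed if isinstance(q, str)][:max_questions]
```

An empty return value is what lets the router send the graph to the error path instead of fetching.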
- After analyze: Route to `fetch` if successful, `error` if failed
- After fetch:
  - If all results failed and retries remain → `retry`
  - If all results failed and no retries remain → `error`
  - If any result succeeded → `synthesize`
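These rules condense into a small routing function. A sketch (names are illustrative), assuming failed tool calls are recorded as `[Error: ...]` strings in `tool_results`:

```python
def route_after_fetch(state: dict, max_retries: int = 2) -> str:
    """Decide the next node from the fetch results."""
    results = state.get("tool_results", {})
    all_failed = bool(results) and all(
        r.startswith("[Error:") for r in results.values()
    )
    if not all_failed:
        return "synthesize"  # at least one result succeeded
    if state.get("retry_count", 0) < max_retries:
        return "retry"       # all failed, retries remain
    return "error"           # all failed, retries exhausted
```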
The research_tool uses Wikipedia's REST API:
- Input: A search query string
- Output: Article summary or error message
- Error Handling: Timeouts, 404s, network failures
- Robustness: 10-second timeout, graceful degradation
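One way such a tool might be implemented, sketched here with the standard library's `urllib` (the actual project may use a different HTTP client; `extract_summary` is a hypothetical helper):

```python
import json
import urllib.error
import urllib.parse
import urllib.request

WIKI_SUMMARY = "https://en.wikipedia.org/api/rest_v1/page/summary/{}"

def extract_summary(payload: dict) -> str:
    """Pull the plain-text summary out of a parsed REST response."""
    return payload.get("extract") or "[Error: no summary in response]"

def research_tool(query: str, timeout: int = 10) -> str:
    """Fetch a Wikipedia summary; any failure becomes an '[Error: ...]' string."""
    url = WIKI_SUMMARY.format(urllib.parse.quote(query.replace(" ", "_")))
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return extract_summary(json.load(resp))
    except urllib.error.HTTPError as exc:  # e.g. 404: topic not found
        return f"[Error: HTTP {exc.code} for '{query}']"
    except (urllib.error.URLError, TimeoutError) as exc:
        return f"[Error: network failure: {exc}]"
    except json.JSONDecodeError:
        return "[Error: malformed response]"
```

Because every failure mode degrades to a string rather than an exception, the graph can keep flowing and let the router decide what to do.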
- Self-Decomposition: The agent decides how to break down queries
- Intelligent Retry: Reformulates queries when initial attempts fail
- Adaptive Routing: Different paths based on intermediate results
- Partial Success Handling: Proceeds with synthesis even if some queries fail
- Error Recovery: Multiple fallback strategies at each step
- Python 3.10 or higher
- OpenAI API key
- Clone or extract the project:

```
cd lang-agent
```

- Create a virtual environment and install dependencies:

```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

- Configure environment. Create your `.env` file:

```
touch .env
```

Open the file and add:
```
OPENAI_API_KEY=sk-your-actual-api-key-here
MODEL_NAME=gpt-4o-mini
TEMPERATURE=0.7
TOOL_TIMEOUT=10
MAX_RETRIES=2
MAX_SUB_QUESTIONS=5
```
Save and close the `.env` file, then check its contents:

```
cat .env
```

Your `.env` should look like:

```
OPENAI_API_KEY=sk-your-actual-api-key-here  # Replace with your OpenAI API key
MODEL_NAME=gpt-4o-mini
TEMPERATURE=0.7
TOOL_TIMEOUT=10
MAX_RETRIES=2
MAX_SUB_QUESTIONS=5
```
```
python main.py
```

This starts an interactive session:

```
AI Research Concierge
============================================================
Ask me any research question and I'll break it down,
gather information, and provide a comprehensive answer.
Type 'quit' or 'exit' to stop.

Your question: Compare Python and JavaScript for backend development
```
```python
from main import run_research_agent

# Run a query
result = run_research_agent(
    "What are the advantages of solar energy?",
    verbose=True,
)

# Access the answer
print(result["final_answer"])
```

Run the test suite:

```
pytest tests/ -v
```

Run tests with coverage:

```
pytest tests/ -v --cov=. --cov-report=html
```

Note: Some tests require an API key and make real API calls. These are marked with `@pytest.mark.skip` by default.
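A unit test for the error-string convention might look like this (illustrative only; `tests/test_tools.py` in the repo is the authoritative version):

```python
import pytest

def test_error_strings_are_detectable():
    # The graph relies on the '[Error: ...]' prefix to spot failures.
    result = "[Error: HTTP 404 for 'No Such Topic']"
    assert result.startswith("[Error:")

@pytest.mark.skip(reason="makes a real API call; needs network access")
def test_live_wikipedia_lookup():
    from tools.research_tool import research_tool
    summary = research_tool("Python (programming language)")
    assert not summary.startswith("[Error:")
```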
```
research_agent/
├── README.md               # This file
├── requirements.txt        # Python dependencies
├── .env                    # Environment variables template
├── config.py               # Configuration management
├── main.py                 # Entry point & graph builder
│
├── graph/                  # LangGraph components
│   ├── __init__.py
│   ├── state.py            # GraphState definition
│   ├── nodes.py            # Node implementations
│   └── edges.py            # Conditional routing logic
│
├── tools/                  # External tools
│   ├── __init__.py
│   └── research_tool.py    # Wikipedia API integration
│
├── prompts/                # LLM prompt templates
│   ├── __init__.py
│   └── templates.py        # Prompt engineering
│
└── tests/                  # Test suite
    ├── __init__.py
    ├── test_tools.py       # Tool unit tests
    └── test_graph.py       # Graph integration tests
```
- Endpoint: `https://en.wikipedia.org/api/rest_v1/page/summary/{topic}`
- Why Wikipedia?
- Public API (no authentication required)
- Rich, structured content
- Realistic failure modes (404s, ambiguous topics)
- Good for demonstrating tool integration
The tool handles:
- 404s: Topic not found
- Timeouts: Network delays
- Network errors: Connection failures
- Malformed responses: Invalid JSON
All errors are returned as `[Error: description]` strings, which the graph can detect and handle.
What is machine learning?
Flow: analyze → fetch (1 query) → synthesize
Compare Python and JavaScript for backend development
Flow: analyze → fetch (4-5 queries) → synthesize
What are the advantages and disadvantages of blockchain technology?
Flow: analyze → fetch (multiple queries) → some might fail → retry → synthesize
- Explicit State Management: Clear data flow through typed state
- Conditional Logic: First-class support for routing decisions
- Debuggability: Easy to visualize and trace execution
- Testability: Each node is independently testable
- Modularity: Each component has a single responsibility
- Testability: Can test tools without testing the graph
- Maintainability: Easy to find and modify specific logic
- Scalability: Can add new tools/nodes without touching existing code
- No Auth Required: Easy to run without API keys (beyond OpenAI)
- Rich Content: Provides meaningful data for synthesis
- Realistic Errors: Teaches proper error handling
- Well-Known: Reviewers can verify the tool works correctly
All prompts are in prompts/templates.py and include:
- Clear instructions: What the LLM should do
- Output format: JSON arrays for parsing
- Examples: Few-shot learning for better results
- Constraints: Max sub-questions, quality guidelines
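In that spirit, a decomposition prompt might look like this (hypothetical wording; see `prompts/templates.py` for the real templates):

```python
# Hypothetical decomposition template in the style described above.
DECOMPOSE_PROMPT = """You are a research assistant.
Break the question below into at most {max_sub_questions} focused sub-questions.
Return ONLY a JSON array of strings, e.g. ["What is X?", "How does X work?"].

Question: {user_query}
"""

prompt = DECOMPOSE_PROMPT.format(
    max_sub_questions=5,
    user_query="What are the advantages of solar energy?",
)
```

Pinning the output format to a JSON array is what makes the reply machine-parseable downstream.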
Edit .env to customize behavior:
- `MODEL_NAME`: Which OpenAI model to use (default: `gpt-4o-mini`)
- `TEMPERATURE`: LLM creativity (0.0-1.0, default: 0.7)
- `TOOL_TIMEOUT`: Wikipedia API timeout in seconds (default: 10)
- `MAX_RETRIES`: How many times to retry failed queries (default: 2)
- `MAX_SUB_QUESTIONS`: Limit on query decomposition (default: 5)
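A `config.py`-style loader for these variables might look like the following sketch (it assumes the `.env` values have already been loaded into the process environment, e.g. by `python-dotenv`):

```python
import os

# Defaults mirror the list above; os.getenv falls back when a key is unset.
MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4o-mini")
TEMPERATURE = float(os.getenv("TEMPERATURE", "0.7"))
TOOL_TIMEOUT = int(os.getenv("TOOL_TIMEOUT", "10"))
MAX_RETRIES = int(os.getenv("MAX_RETRIES", "2"))
MAX_SUB_QUESTIONS = int(os.getenv("MAX_SUB_QUESTIONS", "5"))
```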
- Single Tool: Only Wikipedia (could add arXiv, PubMed, Google Scholar)
- No Parallel Execution: Tool calls are sequential (could parallelize)
- Simple Reformulation: Retry logic is basic (could use more sophisticated strategies)
- No Result Ranking: All sources treated equally (could prioritize by relevance)
- English Only: Wikipedia API defaults to English
- Add multiple data sources (insurance legislations and laws, news, etc.)
- Implement parallel tool execution
- Add source quality assessment
- Support multiple languages
- Add caching for repeated queries
- Implement streaming responses
- Add visualization of the graph execution
- Create a web UI (FastAPI + React)
The code is structured to be extended (add new tools), maintained (clear separation of concerns), and debugged (explicit state flow with logging).
- If you encounter any issues installing or running the project, please open an issue in the issue tracker and let me know.