This assistant processes user requests using a small multi-agent graph (LangGraph + LangChain). It classifies intent, routes the request to the appropriate agent (QA / Summarization / Calculation), uses tools (Document Retriever & Calculator), enforces structured outputs using Pydantic schemas, and updates persistent state/memory between turns.
Key goals:
- Reliable intent routing
- Typed, machine-parseable responses (structured outputs)
- Clear state & memory so conversation can be resumed
- Auditability (tools used, sources, timestamps)
User Input
↓
classify_intent (LLM -> returns UserIntent)
↓ next_step -> "qa_agent" | "summarization_agent" | "calculation_agent"
qa_agent / summarization_agent / calculation_agent
↓
update_memory
↓
END
- classify_intent is always the entry point. It returns a UserIntent (see schema below) and sets next_step.
- Each agent (qa/summarization/calculation) uses structured-output schemas for its response and may call tools:
  - Document Retriever (reads source docs; returns doc ids & snippets)
  - Calculator (performs numerical computations; used by calculation agent)
- After an agent completes, update_memory summarizes the turn and updates AgentState.
- Structured outputs: enforce LLM responses using Pydantic schemas via llm.with_structured_output(Schema).
  - Reason: avoids brittle string parsing; outputs are validated and typed.
- Small, focused agents: each agent is responsible for one class of tasks (QA, summarization, calculation).
  - Reason: easier prompt tuning, tool mapping, and testing.
- Tools as first-class functions: @tool-decorated functions (LangChain style) with safe validation.
  - Calculator validates expressions to avoid code injection and only allows basic arithmetic.
- State and persistence: use an AgentState model and InMemorySaver checkpointer (or replaceable persistence layer).
  - Reason: allows resuming conversations and testing state transitions.
- Auditable actions: state keeps tools_used, actions_taken, active_documents, and timestamps for traceability.
from datetime import datetime
from typing import List, Literal

from pydantic import BaseModel, Field

class UserIntent(BaseModel):
    intent_type: Literal["qa", "summarization", "calculation", "unknown"]
    confidence: float = Field(ge=0.0, le=1.0)  # must lie in [0, 1]
    reasoning: str

class AnswerResponse(BaseModel):
    question: str
    answer: str
    sources: List[str]
    confidence: float = Field(ge=0.0, le=1.0)
    timestamp: datetime

(You will also have schemas for SummarizationResponse and CalculationResponse; follow the same pattern: typed fields, sources, confidence, timestamp.)
from typing import Dict, List, Optional

class AgentState(BaseModel):
    user_input: str
    messages: List[MessageAnnotation]
    intent: Optional[UserIntent] = None
    next_step: Optional[str] = None
    conversation_summary: Optional[str] = None
    active_documents: List[str] = []
    current_response: Optional[Dict] = None  # structured result placed here
    tools_used: List[str] = []
    actions_taken: List[str] = []  # reducer: operator.add
    session_id: str
    user_id: Optional[str]

- actions_taken uses an operator.add reducer in the workflow to accumulate the nodes executed each turn (helps auditing).
- current_response stores the validated structured output of the last executed agent.
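The effect of an operator.add reducer on actions_taken can be illustrated without LangGraph. The following is a minimal sketch, not LangGraph's implementation: apply_update and REDUCERS are hypothetical names introduced here, and the state is a plain dict rather than AgentState. Keys that have a reducer are combined with the existing value; all other keys are overwritten.

```python
import operator
from typing import Any, Callable, Dict

# Hypothetical reducer table: only actions_taken accumulates.
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {"actions_taken": operator.add}

def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new state with `update` merged in, using reducers where defined."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)  # list + list concatenates
        else:
            merged[key] = value  # no reducer: last write wins
    return merged

state = {"actions_taken": [], "current_response": None}
state = apply_update(state, {"actions_taken": ["classify_intent"]})
state = apply_update(
    state, {"actions_taken": ["qa_agent"], "current_response": {"answer": "..."}}
)
print(state["actions_taken"])  # ['classify_intent', 'qa_agent']
```

Under this kind of reducer, a node should return only its new entries; returning the full accumulated list would duplicate earlier entries.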
- Use the LLM helper to wrap responses with the Pydantic schema:
  llm_structured = llm.with_structured_output(UserIntent)
  user_intent: UserIntent = llm_structured.invoke(prompt)  # returns a validated UserIntent
- If the LLM returns invalid data, the .with_structured_output() wrapper will raise/return validation errors; handle them by:
  - re-prompting (fallback)
  - using a default: intent_type="unknown"
- For agent nodes, do the same with AnswerResponse / SummarizationResponse / CalculationResponse.
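The two fallbacks listed above (re-prompt, then default to "unknown") could be combined roughly as follows. This is a sketch under stated assumptions: IntentResult is a dataclass stand-in for the Pydantic UserIntent, classify_with_fallback is a hypothetical helper, and validation failures are assumed to surface as ValueError subclasses. The stub functions stand in for the structured LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class IntentResult:
    """Stand-in for UserIntent (the real schema is a Pydantic model)."""
    intent_type: str
    confidence: float
    reasoning: str

def classify_with_fallback(
    call_llm: Callable[[str], IntentResult],
    prompt: str,
    max_attempts: int = 2,
) -> IntentResult:
    """Try the structured call up to max_attempts times, then default to 'unknown'."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return call_llm(prompt)
        except ValueError as exc:  # assumption: validation failures are ValueErrors
            last_error = exc
    return IntentResult("unknown", 0.0, f"validation failed: {last_error}")

# Stub that fails once, then succeeds (simulates a successful re-prompt).
attempts = {"n": 0}

def flaky_llm(prompt: str) -> IntentResult:
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise ValueError("confidence out of range")
    return IntentResult("qa", 0.9, "direct question")

# Stub that never produces valid output (simulates exhausting the retries).
def broken_llm(prompt: str) -> IntentResult:
    raise ValueError("not JSON")

recovered = classify_with_fallback(flaky_llm, "What is X?")
defaulted = classify_with_fallback(broken_llm, "What is X?")
print(recovered.intent_type, defaulted.intent_type)  # qa unknown
```

Because the routing table sends unrecognized intents to the QA agent, a defaulted "unknown" result still produces a response instead of failing the turn.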
- get_intent_classification_prompt(): used by classify_intent. Inputs: user_input, conversation_history. Output schema: UserIntent.
- QA_SYSTEM_PROMPT, SUMMARIZATION_SYSTEM_PROMPT, CALCULATION_SYSTEM_PROMPT: used in get_chat_prompt_template(intent_type, ...), which selects the system context for the agent node.

Important: CALCULATION_SYSTEM_PROMPT must instruct the model to always use the calculator tool for arithmetic and to indicate which document(s) it needs.
Document Retriever:
- API: retrieve_documents(query) -> List[Document]
- Returns document id, title, and local snippet.
- Populates active_documents in state.
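To make the retriever contract concrete, here is a toy keyword-overlap sketch. It is not the project's retriever: the Document dataclass, the CORPUS contents, the document ids, and the snippet_chars parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Document:
    doc_id: str
    title: str
    snippet: str

# Hypothetical in-memory corpus; ids and text are illustrative only.
CORPUS: Dict[str, Dict[str, str]] = {
    "doc_a": {"title": "March financial statement",
              "text": "Operating expense by product line ..."},
    "doc_b": {"title": "Compliance report",
              "text": "Vendor risk and data retention ..."},
}

def retrieve_documents(query: str, snippet_chars: int = 40) -> List[Document]:
    """Return documents whose title or text shares at least one word with the query."""
    terms = set(query.lower().split())
    hits: List[Document] = []
    for doc_id, doc in CORPUS.items():
        words = set((doc["title"] + " " + doc["text"]).lower().split())
        if terms & words:
            hits.append(Document(doc_id, doc["title"], doc["text"][:snippet_chars]))
    return hits

# Populate active_documents without duplicating ids across calls.
active_documents: List[str] = []
for doc in retrieve_documents("march operating expense"):
    if doc.doc_id not in active_documents:
        active_documents.append(doc.doc_id)

print(active_documents)  # ['doc_a']
```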
Calculator:
- Decorated with @tool.
- Validates the input expression using a whitelist regex (digits, spaces, + - * / ( ) .).
- Uses eval() only after validation, inside a restricted namespace.
- Logs tool use (ToolLogger) and returns a formatted string.
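Example (pseudocode):
import re
from langchain_core.tools import tool

# Whitelist: digits, whitespace, parentheses, ".", and + - * /
SAFE_RE = re.compile(r"[0-9.\s+\-*/()]+")

@tool("calculator")
def calculator_tool(expression: str) -> str:
    """Evaluate a basic arithmetic expression."""  # @tool needs a description
    # fullmatch() checks the whole string; match() would accept a safe prefix.
    if not SAFE_RE.fullmatch(expression):
        return "Invalid expression"
    result = eval(expression, {"__builtins__": {}})
    ToolLogger.log("calculator", expression)
    return f"Result: {result}"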
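The whitelist check can be exercised on its own. The sketch below assumes the character set described in the tool notes (digits, whitespace, parentheses, ".", and + - * /); is_whitelisted is a hypothetical helper name. It also shows why fullmatch() is the appropriate call and what the whitelist still lets through.

```python
import re

# Assumed whitelist: digits, whitespace, parentheses, ".", and + - * /
SAFE_RE = re.compile(r"[0-9.\s+\-*/()]+")

def is_whitelisted(expression: str) -> bool:
    """True only if every character of the expression is in the whitelist."""
    return SAFE_RE.fullmatch(expression) is not None

print(is_whitelisted("4200 - (1200 + 0.12*4200)"))  # True
print(is_whitelisted("__import__('os')"))           # False
print(SAFE_RE.match("1+1; import os") is not None)  # True: match() accepts a safe prefix
print(is_whitelisted("1+1; import os"))             # False
print(is_whitelisted("9**9"))                       # True: "**" is still admitted
```

Since two consecutive `*` characters pass the check, the exponent operator is reachable, and a chain of exponents could be very expensive to evaluate. A character whitelist is therefore a first filter, not a complete defense.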
- Start: process_message creates or loads AgentState for session_id.
- classify_intent:
  - Compose prompt with user_input + messages.
  - Call llm.with_structured_output(UserIntent) and parse response.
  - Populate state.intent, choose next_step:
    - qa → "qa_agent"
    - summarization → "summarization_agent"
    - calculation → "calculation_agent"
    - default → "qa_agent"
  - actions_taken += ["classify_intent"]
- Agent Node (qa/summarization/calculation):
  - Construct prompt using get_chat_prompt_template(intent_type, ...).
  - If needed, call retrieve_documents() to get sources; add to state.active_documents and tools_used.
  - If calculation_agent:
    - the LLM must decide the expression and call the calculator tool; we enforce calculator usage in the system prompt.
    - record tools_used += ["calculator"].
  - Receive structured response and set state.current_response.
  - actions_taken += ["qa_agent"] (or the relevant agent name).
- update_memory:
  - Summarize the turn (LLM or local summarizer).
  - Update conversation_summary, messages, active_documents.
  - Persist AgentState via the chosen checkpointer.
- Persist: the workflow compiled with InMemorySaver() or another saver persists the AgentState across invocations.
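The steps above can be sketched end to end in plain Python. This is a stand-in for the compiled LangGraph workflow, not the project's code: the state is a dict rather than AgentState, SESSIONS replaces the checkpointer, classify_intent uses a keyword stub instead of an LLM, and resetting actions_taken at the start of each turn is an assumption.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
SESSIONS: Dict[str, State] = {}  # stand-in for the checkpointer

def classify_intent(state: State) -> State:
    # Keyword stub in place of the LLM call; the real node returns a UserIntent.
    text = state["user_input"].lower()
    if "summarize" in text:
        intent = "summarization"
    elif any(ch.isdigit() for ch in text):
        intent = "calculation"
    else:
        intent = "qa"
    state["intent"] = intent
    state["next_step"] = f"{intent}_agent"
    state["actions_taken"].append("classify_intent")
    return state

def make_agent(name: str) -> Callable[[State], State]:
    def agent(state: State) -> State:
        state["current_response"] = {"handled_by": name}  # placeholder response
        state["actions_taken"].append(name)
        return state
    return agent

AGENTS = {n: make_agent(n)
          for n in ("qa_agent", "summarization_agent", "calculation_agent")}

def update_memory(state: State) -> State:
    state["messages"].append(state["user_input"])
    state["actions_taken"].append("update_memory")
    return state

def process_message(session_id: str, user_input: str) -> State:
    # Create the session on first use, otherwise resume it.
    state = SESSIONS.setdefault(
        session_id, {"session_id": session_id, "messages": [], "actions_taken": []}
    )
    state["user_input"] = user_input
    state["actions_taken"] = []  # assumption: the audit trail is kept per turn
    state = classify_intent(state)
    state = AGENTS.get(state["next_step"], AGENTS["qa_agent"])(state)
    state = update_memory(state)
    SESSIONS[session_id] = state
    return state

first_actions = list(
    process_message("abc123", "Summarize the compliance report")["actions_taken"]
)
second = process_message("abc123", "What is 10 / 2?")
print(first_actions)        # ['classify_intent', 'summarization_agent', 'update_memory']
print(second["next_step"])  # calculation_agent
print(len(second["messages"]))  # 2
```

The second call resumes the same session, so messages carries both turns while actions_taken reflects only the latest one.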
- Persistence:
  - Default: InMemorySaver(), useful for tests and demos.
  - For production, swap the saver for a file/db-backed saver (Redis, DynamoDB, or custom).
  - Sessions are saved under sessions/ (or configured storage) by session_id. Example file: sessions/<session_id>.json (if using file saver).
- Logging:
  - Tools log entries via ToolLogger with fields: timestamp, tool_name, input, output.
  - Application logs (info/debug) are stored in logs/assistant.log.
  - Example log contents:
    2025-12-05T10:12:23Z INFO classify_intent session=abc123 prompt=...
    2025-12-05T10:12:24Z INFO qa_agent session=abc123 retrieved_docs=['doc_42','doc_9']
    2025-12-05T10:12:25Z INFO tool:calculator session=abc123 input='10 / 2' output='5.0'
- How to inspect previous sessions:
  - Read sessions/<session_id>.json or call assistant.get_session(session_id) (helper).
  - Logs are plain text, rotated monthly; look in logs/.
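A file-backed variant of session storage might look like the sketch below. It is an assumption-laden illustration rather than the project's saver: save_session and this two-argument get_session are hypothetical helpers, the state is a plain dict, and a temporary directory stands in for sessions/.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict

def save_session(root: Path, state: Dict[str, Any]) -> Path:
    """Write the state to <root>/<session_id>.json and return the path."""
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{state['session_id']}.json"
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return path

def get_session(root: Path, session_id: str) -> Dict[str, Any]:
    """Load a previously saved session; raises FileNotFoundError if absent."""
    return json.loads((root / f"{session_id}.json").read_text(encoding="utf-8"))

# Round trip inside a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    sessions_dir = Path(tmp) / "sessions"
    saved_path = save_session(sessions_dir, {
        "session_id": "abc123",
        "actions_taken": ["classify_intent", "qa_agent", "update_memory"],
        "active_documents": ["doc_42"],
    })
    restored = get_session(sessions_dir, "abc123")

print(saved_path.name)               # abc123.json
print(restored["active_documents"])  # ['doc_42']
```

A real saver should validate session_id before using it in a file path, since an id containing path separators could otherwise write outside the sessions directory.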
NOTE: These examples include the structured outputs the agents produce. Timestamps and session ids are illustrative.
User: "In the March financial statement, what is the total operating expense for product line A?"
Flow:
- classify_intent → returns:
  { "intent_type":"qa", "confidence":0.92, "reasoning":"User asks a direct fact question about a document (financial statement)." }
- qa_agent:
  - Document retriever called with query "March financial statement product line A operating expense".
  - Tools used: doc_retriever → found ["doc_fin_2025_march"]
  - Prompt enforces AnswerResponse structured output.
  - LLM returns AnswerResponse:
    { "question":"In the March financial statement, what is the total operating expense for product line A?", "answer":"The total operating expense for product line A in March is $1,238,450.", "sources":["doc_fin_2025_march"], "confidence":0.86, "timestamp":"2025-12-05T10:20:32Z" }
- update_memory summarizes the turn and records active_documents=["doc_fin_2025_march"].
User: "Summarize the main risks discussed in the uploaded compliance report."
Flow:
- classify_intent → {"intent_type":"summarization", ...}
- summarization_agent:
  - Retriever finds doc_compliance_v2.
  - The LLM call enforces the SummarizationResponse schema and returns:
    { "summary":"The report highlights three major risks: 1) Third-party vendor compliance gaps; 2) Data retention policy inconsistencies; 3) Insufficient access controls for exporting data.", "key_points":[ "Vendor compliance checks missing in 12% of cases", "Retention policy not unified across subsidiaries", "Export permissions enabled for legacy accounts" ], "sources":["doc_compliance_v2"], "confidence":0.81, "timestamp":"2025-12-05T10:25:01Z" }
- update_memory appends the summary to conversation_summary.
User: "From the spreadsheet, what's the net margin for product B if revenue is 4,200 and costs equal fixed 1200 plus 12% of revenue?"
Flow:
- classify_intent → {"intent_type":"calculation", ...}
- calculation_agent:
  - Document retriever optionally fetches numeric parameters from doc_spreadsheet_productB.
  - System prompt forces: "use the calculator tool for all math".
  - LLM decides expression: net_margin = revenue - (fixed_cost + 0.12 * revenue) → expression 4200 - (1200 + 0.12*4200).
  - Calls the calculator tool with expression 4200 - (1200 + 0.12*4200).
  - Calculator returns Result: 2496.0, that is, 4200 - (1200 + 504).
  - CalculationResponse:
    { "formula":"net_margin = revenue - (fixed_cost + 0.12 * revenue)", "expression":"4200 - (1200 + 0.12*4200)", "result":2496, "sources":["doc_spreadsheet_productB"], "confidence":0.9, "timestamp":"2025-12-05T10:30:10Z" }
- update_memory stores the numeric result and the fact that calculator was used.
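The arithmetic in this example can be checked by hand or with a few lines of Python. The variable names below are illustrative; the numbers come from the user's question.

```python
revenue = 4200
fixed_cost = 1200
variable_cost = 0.12 * revenue           # 12% of revenue
total_cost = fixed_cost + variable_cost  # fixed plus variable costs
net = revenue - total_cost

print(round(variable_cost, 2), round(total_cost, 2), round(net, 2))
# 504.0 1704.0 2496.0
```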
def classify_intent(state: AgentState, config) -> AgentState:
    llm = config.llm
    # Wrap the model so its output is parsed and validated as a UserIntent.
    llm_struct = llm.with_structured_output(UserIntent)
    prompt = get_intent_classification_prompt().format(
        user_input=state.user_input,
        conversation_history="\n".join(m.text for m in state.messages),
    )
    # invoke() on the structured runnable returns a validated UserIntent.
    intent: UserIntent = llm_struct.invoke(prompt)
    state.intent = intent
    state.actions_taken = state.actions_taken + ["classify_intent"]
    # Unknown or unexpected intents fall back to the QA agent.
    state.next_step = {
        "qa": "qa_agent",
        "summarization": "summarization_agent",
        "calculation": "calculation_agent",
    }.get(intent.intent_type, "qa_agent")
    return state

import re

@tool("calculator")
def calculator_tool(expression: str) -> str:
    """Evaluate a basic arithmetic expression (+ - * / and parentheses)."""
    # Allow only digits, whitespace, parentheses, ".", and + - * /
    if not re.fullmatch(r"[0-9\.\s\+\-\*\/\(\)]+", expression):
        raise ValueError("Unsafe expression")
    # An empty __builtins__ keeps eval() away from built-in functions.
    result = eval(expression, {"__builtins__": {}})
    ToolLogger.log(tool="calculator", input=expression, output=str(result))
    return f"Result: {result}"
- Integration:
  - Run python main.py and try the three examples above. Inspect sessions/<session_id>.json to verify actions_taken, current_response, conversation_summary, and active_documents.
- Manual checks:
  - Inspect logs/assistant.log and logs/tools.log for full traceability.
- .env must contain OPENAI_API_KEY (or the appropriate LLM provider credentials).
- Replace InMemorySaver with a persistent saver for production (Redis or DB).
- Consider rate-limiting or batching prompts for cost control.
- For production, add stricter input sanitation (especially for any user input passed into eval()-like constructs).
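One way to tighten the calculator beyond a character whitelist is to avoid eval() entirely and walk the parsed expression tree. The sketch below is an alternative technique, not the project's implementation: safe_eval is a hypothetical name, and it deliberately supports only numeric literals, the four basic operators, unary plus/minus, and parentheses.

```python
import ast
import operator

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_eval(expression: str) -> float:
    """Evaluate + - * / arithmetic by walking the AST; anything else is rejected."""
    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Accept plain int/float literals only (bool, str, etc. are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError(f"Unsupported syntax: {type(node).__name__}")

    try:
        tree = ast.parse(expression.strip(), mode="eval")
    except SyntaxError as exc:
        raise ValueError("Not a valid expression") from exc
    return walk(tree)

print(round(safe_eval("4200 - (1200 + 0.12*4200)"), 2))  # 2496.0

for bad in ("2**10", "__import__('os')"):
    try:
        safe_eval(bad)
    except ValueError as exc:
        print("rejected:", bad, "->", exc)
```

Because only whitelisted node types are evaluated, exponentiation, names, attribute access, and function calls are rejected by construction. Division by zero still raises ZeroDivisionError, which the caller would need to handle.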