Document Assistant — Design, Implementation and Examples

Project summary

This assistant processes user requests using a small multi-agent graph (LangGraph + LangChain). It classifies intent, routes the request to the appropriate agent (QA / Summarization / Calculation), uses tools (Document Retriever & Calculator), enforces structured outputs using Pydantic schemas, and updates persistent state/memory between turns.

Key goals:

- Reliable intent routing
- Typed, machine-parseable responses (structured outputs)
- Clear state & memory so a conversation can be resumed
- Auditability (tools used, sources, timestamps)

Design overview (agent graph)

User Input
   ↓
classify_intent  (LLM -> returns UserIntent)
   ↓ next_step -> "qa_agent" | "summarization_agent" | "calculation_agent"
qa_agent / summarization_agent / calculation_agent
   ↓
update_memory
   ↓
END

classify_intent is always the entry point. It returns a UserIntent (see schema below) and sets next_step.
Each agent (qa/summarization/calculation) uses structured-output schemas for its response and may call tools:
- Document Retriever (reads source docs; returns doc ids & snippets)
- Calculator (performs numerical computations; used by calculation agent)
After an agent completes, update_memory summarizes the turn and updates AgentState.
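
The flow above can be sketched without any framework. The following is a plain-Python stand-in for the graph, not the LangGraph API: the node bodies are stubs, the keyword check replaces the LLM call, and NODES, ROUTES, make_agent and run_turn are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]

# Same routing table as the diagram: intent -> agent node name.
ROUTES = {
    "qa": "qa_agent",
    "summarization": "summarization_agent",
    "calculation": "calculation_agent",
}


def classify_intent(state: State) -> State:
    # Stub: a real node would ask the LLM for a UserIntent.
    text = str(state["user_input"]).lower()
    if "summarize" in text:
        intent = "summarization"
    elif any(ch.isdigit() for ch in text):
        intent = "calculation"
    else:
        intent = "qa"
    return {**state, "intent": intent, "next_step": ROUTES.get(intent, "qa_agent")}


def make_agent(name: str) -> Callable[[State], State]:
    def agent(state: State) -> State:
        # Stub: a real agent would call tools and return a structured response.
        return {**state, "current_response": {"handled_by": name}, "next_step": "update_memory"}
    return agent


def update_memory(state: State) -> State:
    summary = f"handled a {state['intent']} request"
    return {**state, "conversation_summary": summary, "next_step": "END"}


NODES: Dict[str, Callable[[State], State]] = {
    "classify_intent": classify_intent,
    "qa_agent": make_agent("qa_agent"),
    "summarization_agent": make_agent("summarization_agent"),
    "calculation_agent": make_agent("calculation_agent"),
    "update_memory": update_memory,
}


def run_turn(user_input: str) -> State:
    # classify_intent is always the entry point; follow next_step until END.
    state: State = {"user_input": user_input, "actions_taken": [], "next_step": "classify_intent"}
    while state["next_step"] != "END":
        name = state["next_step"]
        state = NODES[name](state)
        state["actions_taken"] = state["actions_taken"] + [name]
    return state
```

Calling run_turn("Summarize the compliance report") walks classify_intent, summarization_agent and update_memory in that order, which is the audit trail that actions_taken is meant to capture.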

Implementation decisions & rationale

Structured outputs — enforce LLM responses using Pydantic schemas via llm.with_structured_output(Schema).
- Reason: avoids brittle string parsing; outputs are validated and typed.
Small, focused agents — each agent is responsible for one class of tasks (QA, summarization, calculation).
- Reason: easier prompt tuning, tools mapping and testing.
Tools as first-class functions — @tool decorated functions (LangChain style) with safe validation.
- Calculator validates expressions to avoid code injection and only allows basic arithmetic.
State and persistence — use an AgentState model and InMemorySaver checkpointer (or replaceable persistence layer).
- Reason: Allows resuming conversations and testing state transitions.
Auditable actions — state keeps tools_used, actions_taken, active_documents, timestamps for traceability.

Key schemas

`UserIntent` (intent classification)

from typing import Literal
from pydantic import BaseModel, Field

class UserIntent(BaseModel):
    intent_type: Literal["qa", "summarization", "calculation", "unknown"]
    confidence: float = Field(ge=0.0, le=1.0)
    reasoning: str

`AnswerResponse` (Q&A response)

from datetime import datetime
from typing import List
from pydantic import BaseModel, Field

class AnswerResponse(BaseModel):
    question: str
    answer: str
    sources: List[str]
    confidence: float = Field(ge=0.0, le=1.0)
    timestamp: datetime

(You will also have schemas for SummarizationResponse and CalculationResponse — follow the same pattern: typed fields, sources, confidence, timestamp.)
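
As a sketch of that pattern, the two remaining schemas might look like the following. The field names are taken from the JSON shown in the example conversations in this document; the field types (for instance result as a float) are assumptions, and the project's actual definitions may differ.

```python
from datetime import datetime
from typing import List

from pydantic import BaseModel, Field


class SummarizationResponse(BaseModel):
    summary: str
    key_points: List[str]
    sources: List[str]
    confidence: float = Field(ge=0.0, le=1.0)
    timestamp: datetime


class CalculationResponse(BaseModel):
    formula: str
    expression: str
    result: float  # assumption: numeric results are stored as floats
    sources: List[str]
    confidence: float = Field(ge=0.0, le=1.0)
    timestamp: datetime


# Build a response from the same values as the calculation example.
calc = CalculationResponse(
    formula="net_margin = revenue - (fixed_cost + 0.12 * revenue)",
    expression="4200 - (1200 + 0.12*4200)",
    result=2496,
    sources=["doc_spreadsheet_productB"],
    confidence=0.9,
    timestamp="2025-12-05T10:30:10Z",  # ISO 8601 strings are parsed into datetime
)
```

A confidence outside the 0.0 to 1.0 range is rejected at construction time with a validation error, which is what lets downstream code trust the value.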

`AgentState` (important fields)

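import operator
from typing import Annotated, Dict, List, Optional
from pydantic import BaseModel, Field

class AgentState(BaseModel):
    user_input: str
    messages: List[MessageAnnotation]           # MessageAnnotation is defined by the project
    intent: Optional[UserIntent] = None
    next_step: Optional[str] = None
    conversation_summary: Optional[str] = None
    active_documents: List[str] = Field(default_factory=list)
    current_response: Optional[Dict] = None   # structured result placed here
    tools_used: List[str] = Field(default_factory=list)
    # reducer: operator.add (declared through Annotated so the graph can apply it)
    actions_taken: Annotated[List[str], operator.add] = Field(default_factory=list)
    session_id: str
    user_id: Optional[str] = None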

actions_taken uses an operator.add reducer in the workflow to accumulate nodes executed each turn (helps auditing).
current_response stores the validated structured output of the last executed agent.
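
To illustrate what an operator.add reducer does to node updates, here is a minimal sketch in plain Python. apply_update and REDUCERS are hypothetical helpers that mimic the merge step; they are not LangGraph functions.

```python
import operator
from typing import Any, Callable, Dict

# Fields listed here are merged with their reducer; every other field is overwritten.
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {"actions_taken": operator.add}


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)  # list + list -> concatenation
        else:
            merged[key] = value
    return merged


state: Dict[str, Any] = {"actions_taken": [], "current_response": None}
state = apply_update(state, {"actions_taken": ["classify_intent"]})
state = apply_update(state, {"actions_taken": ["qa_agent"], "current_response": {"answer": "..."}})
```

With a reducer like this, a node should return only its new entries (for example ["qa_agent"]); returning the full accumulated list would append it to itself and duplicate earlier entries.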

How structured outputs are enforced (practical notes)

Use the LLM helper to wrap responses with the Pydantic schema:

llm_structured = llm.with_structured_output(UserIntent)
# invoke() returns a validated UserIntent instance (LangChain Runnable API)
user_intent: UserIntent = llm_structured.invoke(prompt)

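If the LLM returns data that does not match the schema, the .with_structured_output() wrapper raises (or, depending on configuration, returns) a validation error. Handle it by:
- re-prompting the model (first fallback)
- falling back to a default intent with intent_type="unknown" if the retry also fails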
For agent nodes, do the same with AnswerResponse / SummarizationResponse / CalculationResponse.
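
The retry-then-default handling described above could be sketched as follows. This assumes validation failures surface as ValueError subclasses (pydantic's ValidationError is one); classify_with_fallback, call_llm and UNKNOWN_INTENT are hypothetical names, and a plain dict stands in for UserIntent to keep the sketch dependency-free.

```python
from typing import Any, Callable, Dict

# Default used when every attempt fails validation.
UNKNOWN_INTENT: Dict[str, Any] = {
    "intent_type": "unknown",
    "confidence": 0.0,
    "reasoning": "Structured output failed validation.",
}


def classify_with_fallback(
    call_llm: Callable[[str], Any], prompt: str, max_attempts: int = 2
) -> Any:
    current_prompt = prompt
    for _ in range(max_attempts):
        try:
            return call_llm(current_prompt)
        except ValueError as exc:
            # Re-prompt with the error appended so the model can correct itself.
            current_prompt = (
                f"{prompt}\n\nThe previous reply was invalid: {exc}. "
                "Reply again using the required schema."
            )
    # All attempts failed: return a copy of the safe default.
    return dict(UNKNOWN_INTENT)
```

In the real node, call_llm would be something like llm.with_structured_output(UserIntent).invoke, and the default would be a UserIntent instance rather than a dict.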

Prompts & how they map to agents

get_intent_classification_prompt() — used by classify_intent. Inputs: user_input, conversation_history. Output schema: UserIntent.
QA_SYSTEM_PROMPT, SUMMARIZATION_SYSTEM_PROMPT, CALCULATION_SYSTEM_PROMPT — used in get_chat_prompt_template(intent_type, ...) which selects the system context for the agent node.

Important: CALCULATION_SYSTEM_PROMPT must instruct the model to always use the calculator tool for arithmetic and to indicate which document(s) it needs.

Tools

Document Retriever

API: retrieve_documents(query) -> List[Document]
Returns document id, title, and local snippet.
Populates active_documents in state.

Calculator tool (safe)

Decorated with @tool.
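Validates the input expression against a whitelist regex (digits, spaces, + - * / ( ) .); the whole string must match.
Uses eval() only after validation, with builtins removed from the namespace. The character whitelist is what actually blocks names and function calls; it still admits ** (exponentiation), so very large powers should be rejected separately.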
Logs tool use (ToolLogger) and returns a formatted string.

Example (pseudocode):

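SAFE_RE = re.compile(r"[0-9.\s+\-*/()]+")

@tool("calculator")
def calculator_tool(expression: str) -> str:
    """Evaluate a basic arithmetic expression."""  # @tool needs a docstring or description
    # fullmatch (not match) so the entire string must pass the whitelist
    if not SAFE_RE.fullmatch(expression) or "**" in expression:
        raise ValueError("Invalid expression")
    result = eval(expression, {"__builtins__": {}})
    ToolLogger.log(tool="calculator", input=expression, output=str(result))
    return f"Result: {result}"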
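
An alternative to eval() that avoids executing arbitrary code at all is to parse the expression with Python's ast module and evaluate only whitelisted node types. This is a different technique from the regex-plus-eval approach described above, shown here as a minimal sketch; safe_eval is a hypothetical name.

```python
import ast
import operator

# Only these operators are evaluated; anything else (e.g. **, calls, names) is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str) -> float:
    """Evaluate basic arithmetic by walking the parsed syntax tree."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept plain int/float literals only (not bool, str or complex).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("Invalid expression") from exc
    return _eval(tree)
```

Because unsupported nodes raise ValueError, inputs such as "2**10" or "__import__('os')" are refused without ever being executed, while "4200 - (1200 + 0.12*4200)" evaluates normally.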

State & memory flow (detailed)

Start: process_message creates or loads AgentState for session_id.
classify_intent:
- Compose prompt with user_input + messages.
- Call llm.with_structured_output(UserIntent) and parse response.
- Populate state.intent, choose next_step:
  - qa → "qa_agent"
  - summarization → "summarization_agent"
  - calculation → "calculation_agent"
  - default → "qa_agent"
- actions_taken += ["classify_intent"]
Agent Node (qa/summarization/calculation):
- Construct prompt using get_chat_prompt_template(intent_type, ...).
- If needed, call retrieve_documents() to get sources; add to state.active_documents and tools_used.
- If calculation_agent:
  - the LLM must decide the expression and call calculator tool; we enforce calculator usage in the system prompt.
  - record tools_used += ["calculator"].
- Receive structured response and set state.current_response.
- actions_taken += ["qa_agent"] (or the relevant agent name).
update_memory:
- Summarize the turn (LLM or local summarizer).
- Update conversation_summary, messages, active_documents.
- Persist AgentState via the chosen checkpointer.
Persist: the workflow compiled with InMemorySaver() keeps the AgentState across invocations within the same process; a file- or database-backed saver is needed for state to survive a restart.

Session persistence & logging

Persistence:
- Default: InMemorySaver() — useful for tests and demos.
- For production, swap the saver for a file/db-backed saver (Redis, DynamoDB, or custom).
- Sessions saved under sessions/ (or configured storage) by session_id. Example file: sessions/<session_id>.json (if using file saver).
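
For illustration, a file-based layout like sessions/<session_id>.json could be read and written as in the sketch below. save_session and load_session are hypothetical helpers, not the project's saver or a LangGraph checkpointer, and the session-id check is an added assumption to keep ids from escaping the sessions directory.

```python
import json
import re
from pathlib import Path
from typing import Any, Dict, Optional


def _session_path(session_id: str, root: Path) -> Path:
    # Restrict ids to safe characters so they cannot contain path separators.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", session_id):
        raise ValueError("Invalid session id")
    return root / f"{session_id}.json"


def save_session(session_id: str, state: Dict[str, Any], root: Path = Path("sessions")) -> Path:
    root.mkdir(parents=True, exist_ok=True)
    path = _session_path(session_id, root)
    # default=str serializes datetimes and other non-JSON values as strings.
    path.write_text(json.dumps(state, indent=2, default=str), encoding="utf-8")
    return path


def load_session(session_id: str, root: Path = Path("sessions")) -> Optional[Dict[str, Any]]:
    path = _session_path(session_id, root)
    if not path.exists():
        return None  # unknown session: the caller starts a fresh state
    return json.loads(path.read_text(encoding="utf-8"))
```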

Logging:

Tools log entries via ToolLogger with fields: timestamp, tool_name, input, output.
Application logs (info/debug) stored in logs/assistant.log.

Example log contents:

2025-12-05T10:12:23Z INFO classify_intent session=abc123 prompt=... 
2025-12-05T10:12:24Z INFO qa_agent session=abc123 retrieved_docs=['doc_42','doc_9']
2025-12-05T10:12:25Z INFO tool:calculator session=abc123 input='10 / 2' output='5.0'
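
The project's ToolLogger is not shown in this document. As a rough sketch under that caveat, a logger that records the four fields listed above (timestamp, tool_name, input, output) and accepts the keyword arguments used by the calculator tool might look like this; the JSON-per-line format and the "tools" logger name are assumptions.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Dict


class ToolLogger:
    """Hypothetical stand-in that writes one JSON entry per tool call."""

    _logger = logging.getLogger("tools")

    @classmethod
    def log(cls, tool: str, input: str, output: str) -> Dict[str, str]:
        entry = {
            # UTC timestamp in the same style as the example log lines
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "tool_name": tool,
            "input": input,
            "output": output,
        }
        cls._logger.info(json.dumps(entry))
        return entry


entry = ToolLogger.log(tool="calculator", input="10 / 2", output="5.0")
```

Attaching a logging.FileHandler for logs/tools.log to the "tools" logger would send these entries to a file.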

How to inspect previous sessions:
- Read sessions/<session_id>.json or call assistant.get_session(session_id) (helper).
- Logs are plain text rotated monthly; look in logs/.

Example conversations

NOTE: These examples include the structured outputs the agents produce. Timestamps and session ids are illustrative.

1) QA flow (intent: `qa`)

User: "In the March financial statement, what is the total operating expense for product line A?"

Flow:

classify_intent → returns:

{
  "intent_type":"qa",
  "confidence":0.92,
  "reasoning":"User asks a direct fact question about a document (financial statement)."
}

qa_agent:

Document retriever called with query "March financial statement product line A operating expense".
Tools used: doc_retriever → found ["doc_fin_2025_march"]
Prompt enforces AnswerResponse structured output.

LLM returns AnswerResponse:

{
  "question":"In the March financial statement, what is the total operating expense for product line A?",
  "answer":"The total operating expense for product line A in March is $1,238,450.",
  "sources":["doc_fin_2025_march"],
  "confidence":0.86,
  "timestamp":"2025-12-05T10:20:32Z"
}

update_memory summarizes the turn and records active_documents=["doc_fin_2025_march"].

2) Summarization flow (intent: `summarization`)

User: "Summarize the main risks discussed in the uploaded compliance report."

Flow:

classify_intent → {"intent_type":"summarization", ...}

summarization_agent:

Document retriever finds doc_compliance_v2.

The structured-output wrapper enforces the SummarizationResponse schema, and the LLM returns:

{
  "summary":"The report highlights three major risks: 1) Third-party vendor compliance gaps; 2) Data retention policy inconsistencies; 3) Insufficient access controls for exporting data.",
  "key_points":[
    "Vendor compliance checks missing in 12% of cases",
    "Retention policy not unified across subsidiaries",
    "Export permissions enabled for legacy accounts"
  ],
  "sources":["doc_compliance_v2"],
  "confidence":0.81,
  "timestamp":"2025-12-05T10:25:01Z"
}

update_memory appends the summary to conversation_summary.

3) Calculation flow (intent: `calculation`)

User: "From the spreadsheet, what's the net margin for product B if revenue is 4,200 and costs equal fixed 1200 plus 12% of revenue?"

Flow:

classify_intent → {"intent_type":"calculation", ...}
calculation_agent:
- Document retriever optionally fetches numeric parameters from doc_spreadsheet_productB.
- System prompt forces: "use the calculator tool for all math".
- LLM decides expression: net_margin = revenue - (fixed_cost + 0.12 * revenue) → expression 4200 - (1200 + 0.12*4200).
- Calls calculator tool with expression 4200 - (1200 + 0.12*4200).
- Calculator returns "Result: 2496.0" (0.12 * 4200 = 504, so 4200 - (1200 + 504) = 2496; the value is a float because the expression contains 0.12).
- CalculationResponse:
```
{
  "formula":"net_margin = revenue - (fixed_cost + 0.12 * revenue)",
  "expression":"4200 - (1200 + 0.12*4200)",
  "result":2496,
  "sources":["doc_spreadsheet_productB"],
  "confidence":0.9,
  "timestamp":"2025-12-05T10:30:10Z"
}
```
update_memory stores the numeric result and the fact that calculator was used.

Example code snippets

classify_intent (illustrative)

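def classify_intent(state: AgentState, config: RunnableConfig) -> dict:
    # Assumption: the LLM is passed in the run config's "configurable" section
    # (RunnableConfig comes from langchain_core.runnables).
    llm = config["configurable"]["llm"]
    # wrap with schema enforcement
    llm_struct = llm.with_structured_output(UserIntent)

    prompt = get_intent_classification_prompt().format(
        user_input=state.user_input,
        conversation_history="\n".join(m.text for m in state.messages),
    )

    # invoke() returns a validated UserIntent instance
    intent: UserIntent = llm_struct.invoke(prompt)

    next_step = {
        "qa": "qa_agent",
        "summarization": "summarization_agent",
        "calculation": "calculation_agent",
    }.get(intent.intent_type, "qa_agent")  # "unknown" falls back to qa_agent

    # Return only the changes: the operator.add reducer appends to actions_taken,
    # so returning the full list would duplicate earlier entries.
    return {"intent": intent, "next_step": next_step, "actions_taken": ["classify_intent"]}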

Using the calculator tool (illustrative)

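import re
from langchain_core.tools import tool

@tool("calculator")
def calculator_tool(expression: str) -> str:
    """Evaluate a basic arithmetic expression and return the result as text."""
    # allow only digits, whitespace, parentheses, ., and + - * /
    if not re.fullmatch(r"[0-9.\s+\-*/()]+", expression):
        raise ValueError("Unsafe expression")
    # the whitelist still admits "**"; reject it to avoid huge exponentiations
    if "**" in expression:
        raise ValueError("Exponentiation is not supported")
    result = eval(expression, {"__builtins__": {}})
    ToolLogger.log(tool="calculator", input=expression, output=str(result))
    return f"Result: {result}"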

Testing & example scripts

Integration:
- Run python main.py and try the three examples above. Inspect sessions/<session_id>.json to verify actions_taken, current_response, conversation_summary, and active_documents.
Manual checks:
- Inspect logs/assistant.log and logs/tools.log for full traceability.

Deployment & config notes

.env must contain OPENAI_API_KEY (or the appropriate LLM provider credentials).
Replace InMemorySaver with a persistent saver for production (Redis or DB).
Consider rate-limiting or batching prompts for cost control.
For production, add stricter input sanitization (especially for any user input passed into eval()-like constructs).
