MatHud pairs an interactive drawing canvas with an AI assistant to help visualize, analyze, and solve real-world arithmetic, geometry, algebra, calculus, and statistics problems in real time.
- Primary interaction is conversational: users express intent in chat and the AI executes tool workflows.
- The HUD canvas is the visual output surface for AI actions, not the primary control surface.
- Direct UI gestures are optional support tools (inspection, quick anchoring) and should not be required for core workflows.
- Features should optimize for intent resolution, deterministic execution, and explainable AI responses tied to canvas state.
- Draw and manipulate geometric objects (points, segments, vectors, polygons, circles, ellipses, angles) directly on the canvas.
- Ask the assistant to solve algebra, calculus, trigonometry, statistics, and linear algebra problems with LaTeX-formatted explanations.
- Plot functions, compare intersections, shade bounded regions, and translate/rotate objects to explore relationships visually.
- Plot statistics visualizations (probability distributions and bar charts).
- Fit regression models to data (linear, polynomial, exponential, logarithmic, power, logistic, sinusoidal) and visualize fitted curves with R² statistics.
- Compute descriptive statistics (mean, median, mode, standard deviation, variance, min, max, quartiles, IQR) for any dataset.
- Create and analyze graph-theory structures (general graphs, trees, DAGs).
- Save, list, load, and delete named workspaces so projects can be resumed or shared later.
- Share the current canvas with the assistant using Vision mode to get feedback grounded in your drawing.
- Attach images directly to chat messages for the AI to analyze alongside your prompts.
- Use slash commands (`/help`, `/vision`, `/model`, `/image`, etc.) for quick local operations without waiting for an AI response.
- Choose from multiple AI providers — OpenAI, Anthropic (Claude), and OpenRouter — with the model dropdown automatically filtered by which API keys you have configured.
- Trigger client-side tests from the UI or chat to verify canvas behavior without leaving the app.
- Frontend (Brython) – `static/client/` hosts the Brython application (`main.py`) that wires a `Canvas`, `AIInterface`, `CanvasEventHandler`, and numerous managers. Canvas objects stay math-only; renderers translate them to screen primitives via shared plan builders.
- Backend (Flask) – `app.py` boots a Flask app assembled by `static/app_manager.py`, registers routes (`static/routes.py`), and injects OpenAI, workspace, webdriver, and logging services.
- AI integration – `static/providers/` implements a multi-provider architecture supporting OpenAI, Anthropic (Claude), and OpenRouter. `static/ai_model.py` stores model configs with per-model vision and reasoning flags. The model dropdown is populated dynamically from `GET /api/available_models`, which filters by which API keys are present in the environment.
- Rendering – `static/client/rendering/factory.py` prefers Canvas2D, then SVG, and finally the still-incomplete WebGL path if earlier options fail. Canvas and SVG renderers include opt-in offscreen staging toggled by `window.MatHudCanvas2DOffscreen` / `window.MatHudSvgOffscreen` or matching `localStorage` flags.
- Vision pipeline – When the chat payload signals vision, the server either stores a data URL snapshot or drives Selenium (`static/webdriver_manager.py`) to replay SVG state in headless Firefox and capture `canvas_snapshots/canvas.png` for the model.
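As a small illustration of the snapshot half of the vision pipeline, the sketch below decodes a client-supplied data-URL screenshot into the `canvas_snapshots/canvas.png` location mentioned above. The helper name and payload handling are assumptions for illustration, not the actual server code.

```python
# Illustrative sketch: persist a client-supplied data-URL snapshot for vision.
# The helper name and exact payload handling are hypothetical.
import base64
from pathlib import Path

def save_canvas_snapshot(data_url: str, out_path: str = "canvas_snapshots/canvas.png") -> Path:
    # A data URL looks like "data:image/png;base64,<payload>".
    header, _, payload = data_url.partition(",")
    if "base64" not in header:
        raise ValueError("expected a base64-encoded data URL")
    target = Path(out_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(base64.b64decode(payload))
    return target
```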
- Python 3.10+ (tested with Python 3.11).
- Firefox installed locally for the vision workflow (the `geckodriver-autoinstaller` package handles the driver).
- At least one AI provider API key (see Configuration below).
- Clone the repository and create a virtual environment:

  ```
  python -m venv venv
  ```

- Activate the environment:
  - macOS/Linux: `source venv/bin/activate`
  - Windows (PowerShell): `.\venv\Scripts\Activate.ps1`
- Install dependencies:

  ```
  pip install -r requirements.txt
  ```
- Provide at least one AI provider API key by setting environment variables or creating `.env` in the project root:

  ```
  OPENAI_API_KEY=sk-...         # OpenAI models (GPT-4o, GPT-5, o3, etc.)
  ANTHROPIC_API_KEY=sk-ant-...  # Anthropic models (Claude Opus/Sonnet/Haiku 4.5)
  OPENROUTER_API_KEY=sk-or-...  # OpenRouter models (Gemini, DeepSeek, Llama, etc.)
  ```

  Only models for configured providers will appear in the model dropdown.
- Launch the Flask server from the project root:

  ```
  python app.py
  ```

- Open `http://127.0.0.1:5000/` in a desktop browser (Chrome, Firefox, or Edge confirmed). The Brython client loads automatically.
- Stop the server with `Ctrl+C`. The shutdown handler closes any active Selenium session before exiting.
- The server reads configuration from environment variables or `.env` (loaded via `python-dotenv`). Common options:

  ```
  OPENAI_API_KEY=sk-...        # OpenAI provider
  ANTHROPIC_API_KEY=sk-ant-... # Anthropic provider
  OPENROUTER_API_KEY=sk-or-... # OpenRouter provider
  AUTH_PIN=123456              # Optional: access code required when auth is enabled
  REQUIRE_AUTH=true            # Force authentication in local development
  PORT=5000                    # Set by hosting platforms to indicate deployed mode
  SECRET_KEY=override-me       # Optional: otherwise a random key is generated per launch
  ```

- Authentication rules (`static/app_manager.py`):
  - When `PORT` is set (typical in hosted deployments), authentication is enforced automatically.
  - Locally, you can opt in by setting `REQUIRE_AUTH=true`. The login page accepts the `AUTH_PIN` value.
  - Sessions use `flask-session` with a CacheLib-backed store; cookies are upgraded to secure/HTTP-only in deployed mode.
- Vision capture requires Firefox. The first request that needs Selenium will call `/init_webdriver`, which in turn relies on `geckodriver-autoinstaller` to download the driver if necessary.
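As a rough illustration of the authentication rules above, the snippet below shows one way the deployed-mode and opt-in checks could be combined. The helper name and exact logic are a sketch, not a copy of `static/app_manager.py`.

```python
# Illustrative only: one way to express the auth rules described above.
import os

def auth_required() -> bool:
    # Hosted platforms set PORT, which MatHud treats as deployed mode.
    deployed = bool(os.environ.get("PORT"))
    # Locally, REQUIRE_AUTH=true opts in to the same login flow.
    opted_in = os.environ.get("REQUIRE_AUTH", "").lower() == "true"
    return deployed or opted_in

if __name__ == "__main__":
    print("Authentication enforced:", auth_required())
```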
MatHud now supports adaptive canvas-state prompt normalization to reduce AI context noise for large scenes while preserving full detail for small scenes.
```
AI_CANVAS_SUMMARY_MODE=hybrid           # off | hybrid | summary_only
AI_CANVAS_HYBRID_FULL_MAX_BYTES=6000    # hybrid threshold for sending full canvas_state
AI_CANVAS_SUMMARY_TELEMETRY=0           # 1/true/on to emit canvas_prompt_telemetry logs
```

- `off`: send the original payload unchanged.
- `hybrid` (default): keep full `canvas_state` for small scenes; attach `canvas_state_summary` and remove the full state for large scenes.
- `summary_only`: always remove the full `canvas_state` and send the summary envelope.
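To make the hybrid rule concrete, here is a minimal sketch of the size check it implies. The payload shape, field names, and summary builder are hypothetical stand-ins, not the actual server code.

```python
# Hypothetical sketch of the hybrid decision: keep the full canvas_state for
# small scenes, swap in a summary for large ones. Field names are illustrative.
import json

FULL_MAX_BYTES = 6000  # mirrors AI_CANVAS_HYBRID_FULL_MAX_BYTES

def summarize(canvas_state: dict) -> dict:
    # Stand-in summary: object counts per type instead of full geometry.
    counts: dict[str, int] = {}
    for obj in canvas_state.get("objects", []):
        kind = obj.get("type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
    return {"object_counts": counts}

def normalize_payload(payload: dict, mode: str = "hybrid") -> dict:
    state = payload.get("canvas_state", {})
    size = len(json.dumps(state).encode("utf-8"))
    if mode == "off" or (mode == "hybrid" and size <= FULL_MAX_BYTES):
        return payload  # small scene: send the full state unchanged
    trimmed = dict(payload)
    trimmed.pop("canvas_state", None)
    trimmed["canvas_state_summary"] = summarize(state)
    return trimmed
```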
Developer utilities:
- Browser console helper: `window.compareCanvasState()` (development mode) prints full vs summary structures with byte/token metrics.
- Log report script: `python scripts/canvas_prompt_telemetry_report.py --mode hybrid --json-out /tmp/canvas_summary_report.json`
- Deep-dive rollout notes: `documentation/development/canvas_prompt_summary_rollout.md`
- Use chat as the default control channel: describe what you want and let the AI perform the steps.
- Gesture support remains available for quick inspection:
- Double-click the canvas to log precise math coordinates into the chat box.
- Pan by click-dragging; zoom with the mouse wheel (anchored around the cursor).
- The canvas tracks undo/redo, dependencies, and name generation automatically through managers in `static/client/managers/`.
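For the curious, the cursor-anchored wheel zoom mentioned above keeps the math point under the cursor fixed while the scale changes. The small sketch below shows that arithmetic in isolation; the variable names and transform convention are illustrative, not MatHud's internals.

```python
# Illustrative arithmetic for wheel zoom anchored at the cursor: the math-space
# point under the cursor stays under the cursor after zooming.
# Assumed convention: screen = math * scale + offset.
def zoom_about_cursor(offset_x, offset_y, scale, cursor_px, cursor_py, factor):
    # Math coordinates currently under the cursor.
    math_x = (cursor_px - offset_x) / scale
    math_y = (cursor_py - offset_y) / scale
    new_scale = scale * factor
    # Recompute the offset so (math_x, math_y) maps back to the same pixel.
    new_offset_x = cursor_px - math_x * new_scale
    new_offset_y = cursor_py - math_y * new_scale
    return new_offset_x, new_offset_y, new_scale
```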
- Type a request in the chat input and press Enter or click Send. The assistant inspects the current canvas state and can call functions on your behalf.
- Responses support Markdown and LaTeX; MathJax renders inline (`\( ... \)`) and block (`$$ ... $$`) math.
- Sample prompts that map directly to available tools:
Note: In this section, "plot" refers to function plots. "graph" refers to graph theory vertices/edges (not dependency graphs).
- `create point at (2, 3) named A`
- `draw a segment from (0,0) to (3,4) called s1`
- `plot y = sin(x) from -pi to pi`
- `evaluate expression 2*sin(pi/4)`
- `derive x^3 + 2x - 1`
- `solve system of equations: x + y = 5, x - y = 1`
- `evaluate linear algebra expression with matrices A=[[1,2],[3,4]]; compute inv(A)`
- `plot a normal distribution with mean 0 and sigma 1, continuous, shade from -1 to 1`
- `plot a bar chart with values [10,20,5] and labels ["A","B","C"]`
- `fit a linear regression to x_data=[1,2,3,4,5] and y_data=[2,4,6,8,10], show points and report R²`
- `compute descriptive statistics for [10, 20, 30, 40, 50]`
- `create an undirected weighted graph named G1 with vertices A,B,C,D and edges A-B (1), B-C (2), A-C (4), C-D (1)`
- `on graph G1, find the shortest path from A to D and highlight the edges`
- `create a DAG named D1 with vertices A,B,C,D and edges A->B, A->C, B->D, C->D; then topologically sort it`
- `save workspace as "demo"` / `load workspace "demo"`
- `run tests`
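To give a sense of how a prompt becomes a deterministic canvas action, the snippet below shows the kind of function call an assistant turn might resolve to for the first prompt above. The tool name and argument names are hypothetical illustrations, not MatHud's actual tool schema.

```python
# Hypothetical illustration of a prompt resolving to a tool call.
# "create point at (2, 3) named A" might become something like:
tool_call = {
    "name": "create_point",                      # hypothetical tool name
    "arguments": {"x": 2, "y": 3, "name": "A"},  # hypothetical argument names
}

# The server would dispatch this to the matching canvas operation and return
# the updated canvas state for the model's follow-up explanation.
print(tool_call)
```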
Type / in the chat input to access local commands that execute instantly without contacting the AI:
| Command | Description |
|---|---|
| `/help [command]` | Show available commands or detailed help for a specific command |
| `/undo` / `/redo` | Undo or redo the last canvas action |
| `/clear` / `/reset` | Clear all objects or reset view to default |
| `/save [name]` / `/load [name]` | Save or load a named workspace |
| `/workspaces` | List all saved workspaces |
| `/fit` | Fit the view to show all objects |
| `/zoom <in\|out\|factor>` | Zoom the canvas |
| `/grid` / `/axes` | Toggle grid or axes visibility |
| `/polar` / `/cartesian` | Switch coordinate system |
| `/status` | Show canvas info (object count, bounds) |
| `/vision` | Toggle vision mode (vision-capable models only) |
| `/image` | Attach an image to your next message (vision-capable models only) |
| `/model [name]` | Show or switch the current AI model |
| `/test` | Run the client test suite |
| `/export` / `/import <json>` | Export or import canvas state as JSON |
| `/list` | List all objects on the canvas |
| `/new` | Start fresh (clear canvas + new conversation) |
Autocomplete suggestions appear as you type. Unknown commands trigger fuzzy-match suggestions.
- Click the paperclip button next to the chat input (or use `/image`) to attach images to your message.
- Multiple images can be attached per message (up to the configured limit).
- Image previews appear below the chat input; click the X on a preview to remove it.
- Images are sent alongside your text message for the AI to analyze.
- The attach button and `/image` command are only available when the selected model supports vision. Non-vision models show "(text only)" in the dropdown.
- Use the Enable Vision checkbox in the chat header to include screenshots of the current canvas.
- The vision toggle and attach button are hidden for models without vision support. Models marked "(text only)" in the dropdown do not support image input.
- The server stores the latest snapshot under `canvas_snapshots/canvas.png` for troubleshooting.
MatHud supports three AI providers. The model dropdown dynamically shows only models for providers with configured API keys:
| Provider | Environment Variable | Models |
|---|---|---|
| OpenAI | `OPENAI_API_KEY` | GPT-5.2, GPT-5, GPT-4.1, GPT-4o, o3, o4-mini, GPT-3.5 Turbo, etc. |
| Anthropic | `ANTHROPIC_API_KEY` | Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5 |
| OpenRouter | `OPENROUTER_API_KEY` | Gemini 3 Pro/Flash, Gemini 2.5 Pro, DeepSeek V3.2, Grok, Llama, Gemma, and more (paid and free tiers) |
Models without vision support are labeled "(text only)" in the dropdown. When no API keys are configured, the dropdown shows "No API keys configured".
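To illustrate the key-based filtering behind the dropdown, here is a minimal Flask sketch of what an endpoint like `GET /api/available_models` could do. The model table, provider-to-variable mapping, and response shape are assumptions for illustration, not the implementation in `static/routes.py`.

```python
# Hypothetical sketch: expose only models whose provider has a configured key.
import os
from flask import Flask, jsonify

app = Flask(__name__)

PROVIDER_ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}

MODELS = [  # illustrative entries, not MatHud's real model registry
    {"name": "gpt-4o", "provider": "openai", "vision": True},
    {"name": "claude-sonnet-4.5", "provider": "anthropic", "vision": True},
    {"name": "deepseek-chat", "provider": "openrouter", "vision": False},
]

@app.route("/api/available_models")
def available_models():
    # Providers count as configured when their API key is present in the environment.
    configured = {p for p, var in PROVIDER_ENV_KEYS.items() if os.environ.get(var)}
    return jsonify({"models": [m for m in MODELS if m["provider"] in configured]})
```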
- Workspaces are persisted as JSON under `workspaces/`.
- The chat tools `save_workspace`, `load_workspace`, `list_workspaces`, and `delete_workspace` are exposed to the assistant and UI.
- Client-side restores rebuild the Brython objects through `static/client/workspace_manager.py`.
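As a rough sketch of the persistence pattern (not the real workspace manager API or JSON schema), saving and loading a named workspace could look like this:

```python
# Illustrative workspace persistence: JSON files under workspaces/<name>.json.
# The document structure below is a made-up example, not MatHud's real schema.
import json
from pathlib import Path

WORKSPACE_DIR = Path("workspaces")

def save_workspace(name: str, canvas_state: dict) -> Path:
    WORKSPACE_DIR.mkdir(exist_ok=True)
    path = WORKSPACE_DIR / f"{name}.json"
    path.write_text(json.dumps(canvas_state, indent=2))
    return path

def load_workspace(name: str) -> dict:
    return json.loads((WORKSPACE_DIR / f"{name}.json").read_text())

if __name__ == "__main__":
    save_workspace("demo", {"objects": [{"type": "point", "x": 2, "y": 3, "name": "A"}]})
    print(load_workspace("demo"))
```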
- Server tests: run `python run_server_tests.py` (add `--with-auth` to exercise authenticated flows).
- Client tests: click Run Tests in the UI or ask the assistant to "run tests". Results stream back into the chat after execution (`static/client/test_runner.py`).
- `static/client/rendering/factory.py` instantiates renderers in preference order `canvas2d → svg → webgl`. If a constructor raises (for example, WebGL unavailable), the factory continues down the chain.
- Canvas2D rendering (`canvas2d_renderer.py`) supports optional offscreen compositing. Toggle it with `window.MatHudCanvas2DOffscreen = true` or `localStorage["mathud.canvas2d.offscreen"] = "1"`.
- SVG rendering (`svg_renderer.py`) mirrors the same offscreen staging controls through `window.MatHudSvgOffscreen` or `localStorage["mathud.svg.offscreen"]`.
- The WebGL renderer (`webgl_renderer.py`) is experimental, not feature complete, and only instantiates when the browser exposes a WebGL context.
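The fallback behaviour described above amounts to trying constructors in order and moving past failures. Here is a minimal sketch of that pattern; the class names are placeholders, not the actual renderer classes.

```python
# Illustrative preference-order fallback: try each renderer constructor and
# continue down the chain if it raises (for example, when WebGL is unavailable).
def create_renderer(canvas_element, renderer_classes):
    errors = []
    for cls in renderer_classes:
        try:
            return cls(canvas_element)
        except Exception as exc:  # constructor failed; try the next backend
            errors.append((cls.__name__, exc))
    raise RuntimeError(f"no renderer could be created: {errors}")

# Usage sketch (placeholder classes standing in for the Canvas2D/SVG/WebGL renderers):
# renderer = create_renderer(canvas_el, [Canvas2DRenderer, SvgRenderer, WebGLRenderer])
```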
- Generate the full suite of diagrams from the project root:

  ```
  python generate_diagrams_launcher.py
  ```
- Output directories:
  - `diagrams/generated_png/` – raster versions for quick sharing.
  - `diagrams/generated_svg/` – scalable diagrams for documentation.
- Additional guidance lives in `diagrams/README.md` and `diagrams/WORKFLOW_SUMMARY.md`.
- `app.py` – entry point with graceful shutdown and threaded dev server.
- `static/`
  - a. `app_manager.py`, `routes.py`, `openai_api.py`, `ai_model.py`, `tool_call_processor.py`, `workspace_manager.py`, `log_manager.py`, `webdriver_manager.py`.
  - b. `providers/` – Multi-provider AI backend (OpenAI, Anthropic, OpenRouter) with `ProviderRegistry` for API key detection.
  - c. `client/` – Brython modules (canvas, managers, rendering, slash commands, tests, utilities, workspace manager).
- `templates/index.html` – main HTML shell that loads Brython, MathJax, styles, and UI controls.
- `workspaces/` – saved canvas states.
- `canvas_snapshots/` – latest Selenium captures used for vision.
- `server_tests/` – pytest suites, including renderer plan tests under `server_tests/client_renderer/`.
- `documentation/` – extended reference material.
- `logs/` – session-specific server logs.
- `documentation/Project Architecture.txt` – deep dive into system design.
- `documentation/Reference Manual.txt` – comprehensive API and module reference.
- `documentation/Example Prompts.txt` – curated prompts for common workflows.
