A research framework for autonomous agents with incremental planning, persistent memory, and tool use.
Cognitive Workbench is experimental research software for studying LLM-based cognitive architectures. It prioritizes inspectable agent behavior and fast iteration over stability.
The core idea: an incremental planner that interleaves reasoning with tool execution. Rather than generating a complete plan and then executing it, the planner generates one step at a time, runs it, observes the result, and decides what to do next. This tight feedback loop — combined with persistent memory, reflective quality control, and autonomous goal scheduling — produces agents that can pursue complex goals over extended periods.
User: "goal: Find recent papers on multi-agent coordination"
           │
┌──────────▼──────────┐
│   Executive Node    │  OODA loop: Observe → Orient → Decide → Act
│   (goal queue,      │
│    scheduling)      │
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Incremental Planner │  Stage 0: Retrieve context (FAISS)
│                     │  Stage 1: Analyze + select tools
│  ┌───────────────┐  │  Stage 2: Generate code → Execute → Evaluate
│  │ Reason → Act  │──│──────► repeat until done
│  │   ← Observe   │  │
│  └───────────────┘  │  Reflect: learn from execution trace
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Infospace Executor  │  Primitives + Tools
│                     │  Notes + Collections + Relations
│  search-web, say,   │  FAISS semantic search
│  create-note, ...   │  Persistent memory
└─────────────────────┘
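The Reason → Act → Observe loop in the diagram can be illustrated with a small sketch. This is not the code in `incremental_planner.py`: `Step`, `Trace`, `run_incremental`, and `scripted_planner` are hypothetical names, the LLM is replaced by a scripted function, and the tools are stubs. It only shows the control flow of planning one step at a time against real results.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    tool: str            # name of the tool to call
    arg: str             # a single string argument, to keep the sketch small
    done: bool = False   # True when the planner judges the goal complete


@dataclass
class Trace:
    goal: str
    observations: list[str] = field(default_factory=list)


def run_incremental(goal: str,
                    propose_step: Callable[[Trace], Step],
                    tools: dict[str, Callable[[str], str]],
                    max_steps: int = 10) -> Trace:
    """Generate one step, run it, record the result, then decide again."""
    trace = Trace(goal)
    for _ in range(max_steps):
        step = propose_step(trace)                 # Reason: sees all results so far
        if step.done:
            break
        try:
            result = tools[step.tool](step.arg)    # Act
        except Exception as exc:                   # errors become observations too
            result = f"error: {exc}"
        trace.observations.append(f"{step.tool}({step.arg!r}) -> {result}")  # Observe
    return trace


def scripted_planner(trace: Trace) -> Step:
    """Hypothetical stand-in for the LLM: search first, then save a note, then stop."""
    if not trace.observations:
        return Step("search-web", trace.goal)
    if len(trace.observations) == 1:
        return Step("create-note", trace.observations[-1])
    return Step("", "", done=True)


# Stub tools; the real tools call external services.
TOOLS = {
    "search-web": lambda query: f"3 results for {query}",
    "create-note": lambda text: "Note_1",
}

trace = run_incremental("multi-agent coordination papers", scripted_planner, TOOLS)
for line in trace.observations:
    print(line)
```

Because each call to `propose_step` receives the full trace, a failed step shows up as an observation the next decision can react to, which is the point of interleaving planning with execution.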
- Incremental Planning — the planner interleaves LLM reasoning with tool execution, adapting its approach based on real results
- Goal Scheduling — submit goals with `goal:` prefix; schedule them for manual, automatic, recurring, or daily-at-time execution
- Envisioning & Quality Control — lightweight LLM framing for coherent dialog; post-execution reflection for failure recovery and learning
- Infospace Memory — Notes, Collections, and Relations as structured working memory with FAISS semantic search
- Extensible Tools — 20+ built-in tools (web search, email, academic papers, shell scripts) plus world-specific integrations
- Missing Affordance Monitoring — automatic detection and logging of capability gaps for future tool development
- World Integrations — optional worlds (Minecraft, file system, desktop automation, ScienceWorld) with specialized tools
- Web UI — real-time dashboard with action log, goal scheduling, and resource browser
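To make the Infospace Memory item more concrete, here is a minimal sketch of Notes, Collections, and Relations held in one structure with a similarity search over note text. Everything here is an assumption for illustration: the `Infospace` class and its methods are hypothetical, and a bag-of-words cosine similarity stands in for the FAISS embedding search the project uses.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def _bag(text: str) -> Counter:
    """Lowercased word counts; a toy replacement for an embedding vector."""
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


@dataclass
class Infospace:
    notes: dict = field(default_factory=dict)        # note name -> text
    collections: dict = field(default_factory=dict)  # collection name -> note names
    relations: list = field(default_factory=list)    # (source, label, target) triples

    def create_note(self, name: str, content: str) -> str:
        self.notes[name] = content
        return name

    def add_to_collection(self, collection: str, note_name: str) -> None:
        if note_name not in self.notes:
            raise KeyError(f"unknown note: {note_name}")
        self.collections.setdefault(collection, []).append(note_name)

    def relate(self, source: str, label: str, target: str) -> None:
        self.relations.append((source, label, target))

    def search(self, query: str, k: int = 3) -> list:
        """Return up to k note names, most similar first, skipping zero-score notes."""
        q = _bag(query)
        scored = [(_cosine(q, _bag(text)), name) for name, text in self.notes.items()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [name for score, name in scored[:k] if score > 0]


space = Infospace()
space.create_note("n1", "transformer architectures survey")
space.create_note("n2", "minecraft crafting recipes")
space.create_note("n3", "attention in transformer models")
space.add_to_collection("papers", "n1")
space.relate("n3", "extends", "n1")
print(space.search("transformer survey"))
```

Keeping relations as plain triples makes them easy to inspect and log, which fits the project's stated preference for inspectable agent behavior.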
git clone https://github.com/bdambrosio/Cognitive_workbench.git
cd Cognitive_workbench
python3 -m venv zenoh_venv
source zenoh_venv/bin/activate
pip install -r requirements.txt
Option A — Local GPU (SGLang): Edit scenarios/jill-infospace.yaml and set sgl_model_path to your preferred model.
Option B — Cloud API (no GPU needed):
export OPENROUTER_API_KEY="sk-or-v1-..." # from openrouter.ai
cd src
python3 launcher.py ../scenarios/jill-infospace.yaml --ui --resource-browser
# Or for OpenRouter:
python3 launcher.py ../scenarios/jill-infospace-openrouter.yaml --ui --resource-browser
Open http://localhost:3000 and type:
goal: Find and summarize recent papers on transformer architectures
See Getting Started for full setup details, environment variables, and troubleshooting.
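If you switch between the two backends often, the choice of scenario file can be scripted. The helper below is a hypothetical convenience, not part of the repository: it assumes that a set `OPENROUTER_API_KEY` means you want the OpenRouter scenario and that the local SGLang scenario is the fallback. The paths and flags are the ones from the launch commands above.

```python
import os


def launcher_command(env=None):
    """Build the launcher argv, picking the scenario from the environment."""
    env = os.environ if env is None else env
    if env.get("OPENROUTER_API_KEY"):
        # Assumption: a cloud key present means the cloud scenario is wanted.
        scenario = "../scenarios/jill-infospace-openrouter.yaml"
    else:
        scenario = "../scenarios/jill-infospace.yaml"
    return ["python3", "launcher.py", scenario, "--ui", "--resource-browser"]


# Run from the src directory; pass the list to subprocess.run to actually launch.
print(" ".join(launcher_command()))
```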
| Document | Description |
|---|---|
| Getting Started | Installation, credentials, LLM backend setup, first run |
| Architecture | Core cognitive architecture — incremental planner, OODA loop, infospace memory |
| Goals & Scheduling | Goal submission (goal: prefix), scheduled goals, daily-at-time, autonomous execution |
| Envisioning & QC | Conversational envisioning, reflection, failure recovery, missing affordance monitoring |
| Tools & Primitives | Infospace primitives, tool catalog, run-script, plan tools |
| Configuration | Scenario YAML reference, available scenarios, directory structure |
| UI Guide | Web dashboard, resource browser, API endpoints |
| Tool Development | Creating new tools (Skill.md + tool.py) |
| Background | Research motivation and philosophy |
| Contributor Guidelines | Code style, testing, commit conventions |
- You submit a goal: `goal: Monitor stock prices and alert on changes > 5%`
- The Executive Node queues it and invokes the Incremental Planner
- The Planner retrieves relevant context (FAISS), selects tools, then enters a generate-execute-evaluate loop:
  - LLM writes a code block calling tools (`search-web`, `stock-price`, `create-note`, etc.)
  - Executor runs it, returns structured results
  - LLM evaluates: done? next step? error recovery?
- Reflection analyzes the full execution trace — updates task state, world model, tool insights
- If it failed with a missing capability, the gap is logged for future tool development
- Scheduled goals can repeat daily at a set time, or auto-proceed through multi-step workflows
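The first and last steps above (recognizing a `goal:` message and repeating it daily at a set time) can be sketched as follows. This is an illustration under stated assumptions, not the logic in `task_scheduler.py`: `parse_goal` and `next_daily_run` are hypothetical names, and treating the prefix as case-insensitive is a guess.

```python
from datetime import datetime, time, timedelta
from typing import Optional

GOAL_PREFIX = "goal:"


def parse_goal(message: str) -> Optional[str]:
    """Return the goal text if the message carries the goal: prefix, else None."""
    text = message.strip()
    if not text.lower().startswith(GOAL_PREFIX):  # assumption: prefix is case-insensitive
        return None
    goal = text[len(GOAL_PREFIX):].strip()
    return goal or None


def next_daily_run(now: datetime, at: time) -> datetime:
    """Next occurrence of the daily time `at` that is strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return candidate


goal = parse_goal("goal: Monitor stock prices and alert on changes > 5%")
now = datetime(2025, 1, 6, 9, 30)
print(goal)
print(next_daily_run(now, time(9, 0)))    # 09:00 already passed today
print(next_daily_run(now, time(17, 0)))   # 17:00 is still ahead today
```

The sketch uses naive datetimes; a real scheduler would also need to settle on a time zone.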
| Scenario | World | Backend |
|---|---|---|
| `jill-infospace.yaml` | Core infospace | SGLang (local GPU) |
| `jill-infospace-openrouter.yaml` | Core infospace | OpenRouter (cloud) |
| `jill-fs.yaml` | File system | SGLang |
| `jill-fs-anthropic.yaml` | File system | Anthropic Claude |
| `jill-minecraft.yaml` | Minecraft 3D world | SGLang |
| `jill-osworld.yaml` | Desktop automation | SGLang |
| `jill-scienceworld.yaml` | Science simulation | SGLang |
| `jack-and-jill.yaml` | Multi-agent | SGLang |
See Configuration for details on each.
Cognitive_workbench/
├── README.md # This file
├── BACKGROUND.md # Research philosophy
├── requirements.txt # Python dependencies
├── docs/ # Detailed documentation
├── scenarios/ # Scenario YAML files + runtime data
└── src/
    ├── launcher.py                    # Entry point
    ├── executive_node.py              # OODA loop coordinator
    ├── incremental_planner.py         # Core planner (the heart of the system)
    ├── infospace_executor.py          # Primitives + tool execution
    ├── infospace_resource_manager.py  # Notes/Collections/Relations + FAISS
    ├── fastapi_action_display.py      # Web UI
    ├── task_scheduler.py              # Autonomous goal scheduling
    ├── tools/                         # Core tools (search-web, run-script, etc.)
    ├── world-tools/                   # World-specific tools (minecraft, fs, etc.)
    ├── scripts/                       # Shell scripts for run-script tool
    └── utils/                         # Shared utilities
See src/AGENTS.md for repository guidelines, code style, and commit conventions.
MIT License — see LICENSE.