Cognitive Workbench

A research framework for autonomous agents with incremental planning, persistent memory, and tool use.

What This Is

Cognitive Workbench is experimental research software for studying LLM-based cognitive architectures. It prioritizes inspectable agent behavior and fast iteration over stability.

The core idea: an incremental planner that interleaves reasoning with tool execution. Rather than generating a complete plan and then executing it, the planner generates one step at a time, runs it, observes the result, and decides what to do next. This tight feedback loop — combined with persistent memory, reflective quality control, and autonomous goal scheduling — produces agents that can pursue complex goals over extended periods.
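The one-step-at-a-time loop described above can be sketched in a few lines. This is a minimal illustration, not the repository's planner: `Step`, `Trace`, `run_incremental`, and the toy policy and tools are all hypothetical names, and a plain Python function stands in for the LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    tool: str       # name of the tool to call
    argument: str   # single string argument, kept simple for the sketch


@dataclass
class Trace:
    observations: List[str] = field(default_factory=list)


# The policy stands in for the LLM (an assumption of this sketch): given the
# goal and everything observed so far, return the next step, or None when done.
Policy = Callable[[str, List[str]], Optional[Step]]


def run_incremental(goal: str, policy: Policy,
                    tools: Dict[str, Callable[[str], str]],
                    max_steps: int = 10) -> Trace:
    trace = Trace()
    for _ in range(max_steps):
        step = policy(goal, trace.observations)      # reason: choose ONE next step
        if step is None:                             # the policy decides it is done
            break
        try:
            result = tools[step.tool](step.argument)  # act
        except Exception as exc:                      # failures become observations
            result = f"error: {exc}"
        trace.observations.append(result)             # observe
    return trace


def toy_policy(goal: str, observations: List[str]) -> Optional[Step]:
    if not observations:
        return Step("search", goal)
    if len(observations) == 1:
        return Step("summarize", observations[0])
    return None


toy_tools = {
    "search": lambda q: f"3 results for '{q}'",
    "summarize": lambda text: f"summary of: {text}",
}

trace = run_incremental("multi-agent coordination", toy_policy, toy_tools)
```

The point of the design is that the second step is chosen only after the first result exists, so a failed or surprising observation can change what happens next.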

User: "goal: Find recent papers on multi-agent coordination"
                    │
         ┌──────────▼──────────┐
         │   Executive Node    │  OODA loop: Observe → Orient → Decide → Act
         │   (goal queue,      │
         │    scheduling)      │
         └──────────┬──────────┘
                    │
         ┌──────────▼──────────┐
         │ Incremental Planner │  Stage 0: Retrieve context (FAISS)
         │                     │  Stage 1: Analyze + select tools
         │  ┌───────────────┐  │  Stage 2: Generate code → Execute → Evaluate
         │  │ Reason → Act  │──│──────► repeat until done
         │  │ ← Observe     │  │
         │  └───────────────┘  │  Reflect: learn from execution trace
         └──────────┬──────────┘
                    │
         ┌──────────▼──────────┐
         │ Infospace Executor  │  Primitives + Tools
         │                     │  Notes + Collections + Relations
         │  search-web, say,   │  FAISS semantic search
         │  create-note, ...   │  Persistent memory
         └─────────────────────┘
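Stage 0 in the diagram retrieves context by semantic similarity. The sketch below illustrates the idea with a brute-force cosine search over toy word-count vectors. It deliberately does not use FAISS or a real embedding model; `embed`, `cosine`, and `retrieve` are hypothetical helpers swapped in so the example runs with the standard library alone.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts. A real setup would use a learned
    # embedding model and a FAISS index instead (assumption of this sketch).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, notes: Dict[str, str], k: int = 2) -> List[Tuple[str, float]]:
    # Score every note against the query and keep the k best matches.
    q = embed(query)
    scored = [(name, cosine(q, embed(body))) for name, body in notes.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


notes = {
    "paper-1": "multi-agent coordination with shared plans",
    "recipe": "bread flour water salt yeast",
    "paper-2": "planning for a single agent",
}
top = retrieve("multi-agent coordination", notes, k=2)
```

Here `top[0]` is the `paper-1` note, since it is the only one sharing words with the query. An approximate-nearest-neighbor index plays the same role at larger scale.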

Key Features

- Incremental Planning: the planner interleaves LLM reasoning with tool execution, adapting its approach based on real results
- Goal Scheduling: submit goals with the `goal:` prefix; schedule them for manual, automatic, recurring, or daily-at-time execution
- Envisioning & Quality Control: lightweight LLM framing for coherent dialog; post-execution reflection for failure recovery and learning
- Infospace Memory: Notes, Collections, and Relations as structured working memory with FAISS semantic search
- Extensible Tools: 20+ built-in tools (web search, email, academic papers, shell scripts) plus world-specific integrations
- Missing Affordance Monitoring: automatic detection and logging of capability gaps for future tool development
- World Integrations: optional worlds (Minecraft, file system, desktop automation, ScienceWorld) with specialized tools
- Web UI: real-time dashboard with action log, goal scheduling, and resource browser
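To make the Infospace Memory item concrete, here is one way Notes, Collections, and Relations could fit together. It is a sketch under stated assumptions: the class, field, and method names (`Infospace`, `create_note`, `relate`, `related`) are hypothetical and are not taken from `infospace_resource_manager.py`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Note:
    name: str
    content: str


@dataclass
class Infospace:
    notes: Dict[str, Note] = field(default_factory=dict)
    # collection name -> ordered list of note names
    collections: Dict[str, List[str]] = field(default_factory=dict)
    # (source note, relation label, target note)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def create_note(self, name: str, content: str,
                    collection: Optional[str] = None) -> Note:
        self.notes[name] = Note(name, content)
        if collection is not None:
            self.collections.setdefault(collection, []).append(name)
        return self.notes[name]

    def relate(self, source: str, label: str, target: str) -> None:
        # Relations may only connect notes that already exist.
        if source not in self.notes or target not in self.notes:
            raise KeyError("both endpoints must be existing notes")
        self.relations.append((source, label, target))

    def related(self, source: str, label: str) -> List[str]:
        return [t for s, lbl, t in self.relations if s == source and lbl == label]


space = Infospace()
space.create_note("paper-1", "Survey of multi-agent coordination", collection="papers")
space.create_note("summary-1", "Key points from paper-1", collection="summaries")
space.relate("summary-1", "summarizes", "paper-1")
```

Keeping relations as plain labeled triples makes them easy to inspect, which matches the project's stated preference for inspectable agent behavior.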

Quick Start

1. Install

git clone https://github.com/bdambrosio/Cognitive_workbench.git
cd Cognitive_workbench
python3 -m venv zenoh_venv
source zenoh_venv/bin/activate
pip install -r requirements.txt

2. Configure an LLM backend

Option A — Local GPU (SGLang): Edit scenarios/jill-infospace.yaml and set sgl_model_path to your preferred model.

Option B — Cloud API (no GPU needed):

export OPENROUTER_API_KEY="sk-or-v1-..."   # from openrouter.ai
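
Before launching with the cloud backend, it can help to confirm that the key is actually visible to Python. A minimal sketch; `missing_env` is a hypothetical helper and not part of the repository.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env(required: Iterable[str],
                environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name, "").strip()]


if __name__ == "__main__":
    gaps = missing_env(["OPENROUTER_API_KEY"])
    if gaps:
        print("Missing environment variables:", ", ".join(gaps))
```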

3. Run

cd src
python3 launcher.py ../scenarios/jill-infospace.yaml --ui --resource-browser
# Or for OpenRouter:
python3 launcher.py ../scenarios/jill-infospace-openrouter.yaml --ui --resource-browser

Open http://localhost:3000 and type:

goal: Find and summarize recent papers on transformer architectures

See Getting Started for full setup details, environment variables, and troubleshooting.
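
The `goal:` prefix is what separates a goal submission from ordinary chat input. A sketch of how such a prefix might be recognized; `parse_goal` is a hypothetical function, and the case-insensitive match is an assumption rather than documented behavior.

```python
from typing import Optional

GOAL_PREFIX = "goal:"


def parse_goal(message: str) -> Optional[str]:
    """Return the goal text if the message starts with 'goal:', else None."""
    text = message.strip()
    # Case-insensitive prefix match (an assumption of this sketch).
    if not text.lower().startswith(GOAL_PREFIX):
        return None
    goal = text[len(GOAL_PREFIX):].strip()
    return goal or None   # an empty goal is treated as no goal
```

For example, `parse_goal("goal: Find and summarize recent papers on transformer architectures")` returns the text after the prefix, while a plain chat message returns `None`.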

Documentation

| Document | Description |
|---|---|
| Getting Started | Installation, credentials, LLM backend setup, first run |
| Architecture | Core cognitive architecture: incremental planner, OODA loop, infospace memory |
| Goals & Scheduling | Goal submission (`goal:` prefix), scheduled goals, daily-at-time, autonomous execution |
| Envisioning & QC | Conversational envisioning, reflection, failure recovery, missing affordance monitoring |
| Tools & Primitives | Infospace primitives, tool catalog, run-script, plan tools |
| Configuration | Scenario YAML reference, available scenarios, directory structure |
| UI Guide | Web dashboard, resource browser, API endpoints |
| Tool Development | Creating new tools (`Skill.md` + `tool.py`) |
| Background | Research motivation and philosophy |
| Contributor Guidelines | Code style, testing, commit conventions |

How It Works (In Brief)

1. You submit a goal: `goal: Monitor stock prices and alert on changes > 5%`
2. The Executive Node queues it and invokes the Incremental Planner
3. The Planner retrieves relevant context (FAISS), selects tools, then enters a generate-execute-evaluate loop:
   - LLM writes a code block calling tools (search-web, stock-price, create-note, etc.)
   - Executor runs it, returns structured results
   - LLM evaluates: done? next step? error recovery?
4. Reflection analyzes the full execution trace and updates task state, world model, and tool insights
5. If the goal failed because of a missing capability, the gap is logged for future tool development
6. Scheduled goals can repeat daily at a set time, or auto-proceed through multi-step workflows
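
For the daily-at-time case, the scheduler has to work out when a goal should next fire. The sketch below shows the arithmetic only. `next_daily_run` is a hypothetical function, not the API of `task_scheduler.py`, and it assumes naive local datetimes with no time zone handling.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time) -> datetime:
    """Return the next occurrence of the daily time `at` strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:              # today's slot has already passed
        candidate += timedelta(days=1)
    return candidate
```

With `now` at 08:00 and a 09:00 schedule, the result is 09:00 the same day. At or after 09:00 it rolls over to the following day.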

Available Scenarios

| Scenario | World | Backend |
|---|---|---|
| `jill-infospace.yaml` | Core infospace | SGLang (local GPU) |
| `jill-infospace-openrouter.yaml` | Core infospace | OpenRouter (cloud) |
| `jill-fs.yaml` | File system | SGLang |
| `jill-fs-anthropic.yaml` | File system | Anthropic Claude |
| `jill-minecraft.yaml` | Minecraft 3D world | SGLang |
| `jill-osworld.yaml` | Desktop automation | SGLang |
| `jill-scienceworld.yaml` | Science simulation | SGLang |
| `jack-and-jill.yaml` | Multi-agent | SGLang |

See Configuration for details on each.
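
If you switch scenarios often, a small helper can assemble the launch command from the Quick Start. This is a sketch: `launcher_command` is hypothetical, it assumes you run it from `src/`, and it assumes every scenario accepts the same `--ui` and `--resource-browser` flags shown for the infospace scenarios.

```python
from pathlib import PurePosixPath
from typing import List


def launcher_command(scenario_file: str, ui: bool = True,
                     resource_browser: bool = True) -> List[str]:
    """Build the argv list for launching a scenario from the src/ directory."""
    scenario_path = PurePosixPath("..") / "scenarios" / scenario_file
    cmd = ["python3", "launcher.py", str(scenario_path)]
    if ui:
        cmd.append("--ui")
    if resource_browser:
        cmd.append("--resource-browser")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, cwd="src")` from the repository root.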

Repository Structure

Cognitive_workbench/
├── README.md                          # This file
├── BACKGROUND.md                      # Research philosophy
├── requirements.txt                   # Python dependencies
├── docs/                              # Detailed documentation
├── scenarios/                         # Scenario YAML files + runtime data
└── src/
    ├── launcher.py                    # Entry point
    ├── executive_node.py              # OODA loop coordinator
    ├── incremental_planner.py         # Core planner (the heart of the system)
    ├── infospace_executor.py          # Primitives + tool execution
    ├── infospace_resource_manager.py  # Notes/Collections/Relations + FAISS
    ├── fastapi_action_display.py      # Web UI
    ├── task_scheduler.py              # Autonomous goal scheduling
    ├── tools/                         # Core tools (search-web, run-script, etc.)
    ├── world-tools/                   # World-specific tools (minecraft, fs, etc.)
    ├── scripts/                       # Shell scripts for run-script tool
    └── utils/                         # Shared utilities

Contributing

See src/AGENTS.md for repository guidelines, code style, and commit conventions.

License

MIT License — see LICENSE.
