Opus Nx

Open-source platform for persistent AI reasoning research

Opus Nx turns model reasoning traces into persistent, inspectable artifacts you can explore, verify, rerun, and improve. Every thinking step becomes a node in a navigable graph — not a black box.

This repository is intended for research and open-source collaboration. Run it locally with your own credentials.

A live reasoning session: 17 thinking nodes, typed edges (influence, support, contradiction, refinement), compaction boundaries, and fork branches — all persisted and queryable.

Why Opus Nx

Most AI workflows keep only final answers. Opus Nx keeps the entire reasoning path and supports policy improvement over time.

Standard AI UX	Opus Nx
Final answer only	Persistent reasoning graph artifacts
Single perspective	6-agent swarm + 4-style branching workflows
Limited traceability	Decision points, typed edges, step verification, lifecycle state
Prompt-only iteration	Promote → rerun → compare → retain loops
Ephemeral context	3-tier memory hierarchy (working → recall → archival)

See It In Action

Full visual walkthrough with all screenshots: docs/features.md

Persistent Reasoning Graphs (ThinkGraph)

Every extended thinking session becomes a graph of discrete reasoning steps — nodes scored for confidence, connected by typed edges showing how ideas flow, branch, and build on each other.

Reasoning nodes colored by type (thinking, compaction, fork branch) with edges showing influence, support, contradiction, and refinement relationships. Minimap in corner for navigation.
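To make the graph model concrete, here is a minimal in-memory sketch of nodes with confidence scores and typed edges. The edge type names come from this README; the class names, fields, and methods are illustrative assumptions and not the project's actual schema.

```python
from dataclasses import dataclass, field

# Edge types named in this README; everything else is an illustrative assumption.
EDGE_TYPES = {"influence", "support", "contradiction", "refinement"}


@dataclass
class ThinkingNode:
    node_id: str
    text: str
    confidence: float  # 0.0 to 1.0
    node_type: str = "thinking"  # hypothetical labels: "compaction", "fork_branch"


@dataclass
class ReasoningGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source_id, target_id, edge_type)

    def add_node(self, node: ThinkingNode) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, source: str, target: str, edge_type: str) -> None:
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source, target, edge_type))

    def neighbors(self, node_id: str, edge_type: str) -> list:
        """Return ids of nodes reached from node_id via edges of the given type."""
        return [t for s, t, e in self.edges if s == node_id and e == edge_type]


graph = ReasoningGraph()
graph.add_node(ThinkingNode("n1", "Caching reduces latency", 0.82))
graph.add_node(ThinkingNode("n2", "Cache invalidation adds risk", 0.64))
graph.add_edge("n2", "n1", "contradiction")
contradicted = graph.neighbors("n2", "contradiction")
```

Storing edges as typed triples keeps queries such as "which nodes does this step contradict" a simple filter. A persisted version would map naturally onto two database tables.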

6-Agent Swarm Orchestration

Deploy a swarm of 6 specialized AI agents that collaborate in real time via WebSocket streaming. Maestro decomposes the problem, DeepThinker analyzes, Contrarian challenges, Verifier validates, Synthesizer merges, and Metacognition audits.

The Synthesizer agent merging perspectives from all agents into a coherent framework, with live session stats (17 nodes, 62K tokens) and human-in-the-loop checkpoints.
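The hand-off between roles can be pictured as a pipeline over shared state. The sketch below uses the six role names from this README, but the strictly sequential ordering and the stub agents are assumptions for illustration. The real service calls an LLM and streams events over WebSocket, and may run agents concurrently.

```python
from typing import Callable

# Role names come from this README; the sequential order is an assumption.
ROLES = ["Maestro", "DeepThinker", "Contrarian", "Verifier", "Synthesizer", "Metacognition"]


def make_stub_agent(role: str) -> Callable[[dict], dict]:
    """Build a placeholder agent that records its turn instead of calling a model."""
    def agent(state: dict) -> dict:
        state["trace"].append(f"{role} processed: {state['problem']}")
        return state
    return agent


def run_swarm(problem: str) -> dict:
    """Pass one shared state dict through every role in order."""
    state = {"problem": problem, "trace": []}
    for role in ROLES:
        state = make_stub_agent(role)(state)
    return state


result = run_swarm("Should we shard the database?")
```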

Graph of Thoughts (GoT)

Explore problems using arbitrary reasoning graphs with BFS, DFS, or best-first search. Thoughts branch, aggregate, and get verified at each depth level — a visual implementation of Besta et al. (2023).

A 4-depth GoT tree with 8 branches. Each node shows its thought, confidence score (40%-94%), verification status (Verified/Aggregated), and reasoning path. Color-coded by depth level.
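As an illustration of the best-first option, the following sketch expands the highest-scoring thought first using a priority queue. It covers only the search order, not aggregation or verification. The function name, the `expand` and `score` callbacks, and the toy string "thoughts" are all hypothetical.

```python
import heapq


def best_first_search(root, expand, score, max_depth=4, max_expansions=8):
    """Visit the highest-scoring thoughts first and return (best_score, best_thought).

    expand(thought) -> list of child thoughts
    score(thought)  -> float, higher is better
    """
    counter = 0  # tie-breaker so the heap never compares thoughts directly
    frontier = [(-score(root), counter, 0, root)]  # heapq is a min-heap, so negate
    best = (score(root), root)
    visited = 0
    while frontier and visited < max_expansions:
        neg_score, _, depth, thought = heapq.heappop(frontier)
        visited += 1
        if -neg_score > best[0]:
            best = (-neg_score, thought)
        if depth >= max_depth:
            continue  # do not expand past the depth limit
        for child in expand(thought):
            counter += 1
            heapq.heappush(frontier, (-score(child), counter, depth + 1, child))
    return best


# Toy problem: thoughts are strings, and more "a" characters means a better thought.
best = best_first_search(
    "",
    expand=lambda t: [t + "a", t + "b"],
    score=lambda t: min(1.0, t.count("a") / 4),
)
```

Swapping the priority queue for a FIFO queue or a stack would give BFS or DFS over the same `expand` callback.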

Step-by-Step Verification (PRM)

A Process Reward Model (PRM) verifies each reasoning step independently. See structured steps (CONSIDERATION → HYPOTHESIS → EVALUATION → CONCLUSION) with confidence scores, decision counts, and edge relationships.

13 structured reasoning steps extracted from a single thinking pass. Each step typed (Consideration, Hypothesis, Evaluation, Conclusion) with 1.6K thinking tokens and 13 decision points persisted.

Final steps: EVALUATION → MAIN CONCLUSION → MODEL OUTPUT. The reasoning chain ends with a persisted artifact showing both the internal deliberation and the final structured response.
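Step-level verification can be sketched as scoring each step on its own and reporting the first one that falls below a threshold. The step types are taken from this README. The `toy_scorer` heuristic is a hypothetical stand-in for a learned reward model, and the product of step scores is one common way to aggregate a chain, not necessarily what this project does.

```python
from dataclasses import dataclass
import math


@dataclass
class Step:
    step_type: str  # CONSIDERATION, HYPOTHESIS, EVALUATION, or CONCLUSION
    text: str


def verify_chain(steps, score_step, threshold=0.5):
    """Score every step independently and locate the first step below threshold."""
    scores = [score_step(step) for step in steps]
    first_failure = next((i for i, s in enumerate(scores) if s < threshold), None)
    return {
        "scores": scores,
        "first_failure": first_failure,
        "chain_score": math.prod(scores),  # assumed aggregation
        "valid": first_failure is None,
    }


def toy_scorer(step: Step) -> float:
    # Hypothetical heuristic: distrust steps that assert something as obvious.
    return 0.2 if "obviously" in step.text.lower() else 0.9


chain = [
    Step("CONSIDERATION", "Traffic doubles every quarter."),
    Step("HYPOTHESIS", "Obviously the database is the bottleneck."),
    Step("EVALUATION", "Profiling shows most latency is in queries."),
    Step("CONCLUSION", "Add a read replica."),
]
report = verify_chain(chain, toy_scorer)
```

Reporting the index of the first weak step, rather than a single pass or fail flag, is what lets a reviewer jump straight to the point where the reasoning went wrong.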

ThinkFork — 4-Style Divergent Analysis

Fork any question into 4 concurrent reasoning styles: conservative, aggressive, balanced, and contrarian. Each branch reasons independently, then results are compared with confidence scores and key points.

Left: Fork analysis showing 4 perspectives with confidence scores (45%–82%) and synthesis. Right: Metacognitive Insights panel with 3 biases, 3 patterns, and 1 improvement idea detected.
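The concurrent fan-out described above can be approximated with `asyncio.gather`. The four style names come from this README. The `reason` coroutine is a placeholder for a model call, and the result shape is an assumption.

```python
import asyncio

STYLES = ["conservative", "aggressive", "balanced", "contrarian"]


async def reason(style: str, question: str) -> dict:
    # Placeholder for a model call; a real branch would send a style-specific prompt.
    await asyncio.sleep(0)
    return {"style": style, "answer": f"[{style}] view on: {question}"}


async def fork(question: str) -> list:
    # Run all four branches concurrently; gather returns results in input order.
    return await asyncio.gather(*(reason(style, question) for style in STYLES))


branches = asyncio.run(fork("Adopt a monorepo?"))
```

Because `gather` preserves input order, the comparison step can line results up against `STYLES` without extra bookkeeping.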

Memory Hierarchy (MemGPT-inspired)

A 3-tier memory system: working context (active reasoning), recall buffer (recent history), and archival storage (long-term knowledge). Entries persist across sessions with semantic search and importance scoring.

Left: Memory hierarchy showing 4 entries in Main Context, 4 in Recall, with importance scores and source types. Right: Session stats with confidence breakdown and token usage visualization.
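A rough sketch of how entries might move down the three tiers: when a tier exceeds its capacity, its least important entry is demoted to the tier below. The capacities, the importance-based eviction rule, and the class name are assumptions. Semantic search and cross-session persistence are omitted.

```python
class TieredMemory:
    """Toy three-tier store: working -> recall -> archival."""

    def __init__(self, working_cap: int = 2, recall_cap: int = 2):
        self.working, self.recall, self.archival = [], [], []
        self.working_cap, self.recall_cap = working_cap, recall_cap

    def add(self, text: str, importance: float) -> None:
        self.working.append((importance, text))
        self._demote(self.working, self.working_cap, self.recall)
        self._demote(self.recall, self.recall_cap, self.archival)

    @staticmethod
    def _demote(tier: list, cap: int, lower: list) -> None:
        # Move the least important entries down until the tier fits its cap.
        while len(tier) > cap:
            tier.sort()
            lower.append(tier.pop(0))


mem = TieredMemory()
for text, importance in [("a", 0.9), ("b", 0.1), ("c", 0.5), ("d", 0.7), ("e", 0.3)]:
    mem.add(text, importance)

working = sorted(text for _, text in mem.working)
recall = sorted(text for _, text in mem.recall)
archival = [text for _, text in mem.archival]
```

In this run the two most important entries stay in working memory, the next two sit in recall, and the least important one ends up in archival storage.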

Current Capabilities

ThinkGraph — Persistent reasoning graphs with queryable nodes and typed edges
ThinkFork — 4-style branching, steering, and debate mode
PRM Verification — Step-level verification with structured reasoning extraction
Agent Swarm — 6-agent orchestration with WebSocket streaming
Graph of Thoughts — BFS/DFS/best-first search over thought trees
Metacognitive Insights — Bias detection, pattern recognition, improvement hypotheses
Memory Hierarchy — 3-tier MemGPT-style memory with semantic retrieval
Hypothesis Lifecycle — Promote → rerun → compare → retain loops
Session Sharing — Persistent sessions with replay and sharing
Evaluation Harnesses — Retrieval benchmarks and quality metrics

Architecture

Two-service runtime with shared persistence:

Browser
  -> Next.js web app (apps/web)
      -> packages/core reasoning modules
      -> packages/db Supabase access
      -> swarm proxy routes
  -> Python FastAPI swarm service (agents)
  -> Supabase Postgres + pgvector

See full architecture details: docs/architecture.md

Quick Start

One-Command Setup

git clone https://github.com/omerakben/opus-nx.git
cd opus-nx
./scripts/dev-start.sh

The setup script handles everything: it checks prerequisites, installs dependencies, bootstraps the env files, verifies connections, builds the project, and launches the dev servers. It prompts you for API credentials on first run.

Docker Quick Start (Local Database)

Run everything locally with just an Anthropic API key — no Supabase cloud account needed. Data stays on your machine.

Prerequisites: Docker, Node.js 22+, pnpm. Optional: Python 3.12+ and uv for the agent swarm.

git clone https://github.com/omerakben/opus-nx.git
cd opus-nx
./scripts/docker-start.sh

The script handles everything: checks prerequisites, copies .env.docker to .env, prompts for your Anthropic API key, starts a local PostgreSQL + pgvector database in Docker, installs all dependencies, builds the project, and launches the dev servers.

When it's done, open http://localhost:3000 in your browser.

Or step by step:

cp .env.docker .env
# Edit .env → add your ANTHROPIC_API_KEY

docker compose -f docker-compose.local.yml up -d    # Start local DB
pnpm install && pnpm build && pnpm dev              # Install, build, run

Service	URL	Purpose
Dashboard	`http://localhost:3000`	Next.js web app — open this in your browser
Agent Swarm	`http://localhost:8000`	Python FastAPI backend (auto-starts if uv is installed)
REST API	`http://localhost:54321`	Supabase-compatible DB API (used internally)
PostgreSQL	`localhost:54322`	Direct DB access (psql, pgAdmin)
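If a service does not respond, a quick way to see which ports are accepting connections is a small TCP probe. The ports below are the ones listed in the table. The helper name and timeout are assumptions, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Ports taken from the service table above.
SERVICES = {"Dashboard": 3000, "Agent Swarm": 8000, "REST API": 54321, "PostgreSQL": 54322}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


status = {name: is_listening(port) for name, port in SERVICES.items()}
for name, up in status.items():
    print(f"{name:12s} {'up' if up else 'down'}")
```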

# Lifecycle
./scripts/docker-start.sh --stop       # Stop everything (dev servers + database)
./scripts/docker-start.sh --reset      # Wipe database and start fresh
./scripts/docker-start.sh --db-only    # Start only the database (no dev servers)

# Database access
docker exec -it opus-nx-postgres psql -U postgres -d opus_nx  # Direct SQL access
docker compose -f docker-compose.local.yml logs -f postgres   # Stream DB logs

Manual Setup

1) Prerequisites

Node.js >= 22
pnpm 9.x
Python 3.12+
uv

2) Install

git clone https://github.com/omerakben/opus-nx.git
cd opus-nx
pnpm install

3) Bootstrap local env

pnpm setup

This creates .env and agents/.env if missing and aligns AUTH_SECRET across both files.

4) Add your own credentials

Required values:

ANTHROPIC_API_KEY
AUTH_SECRET
SUPABASE_URL
SUPABASE_SERVICE_ROLE_KEY
SUPABASE_ANON_KEY

5) Verify setup

pnpm setup:verify

6) Run

pnpm dev

Optional local swarm backend:

cd agents
uv run uvicorn src.main:app --reload --port 8000

Credential Ownership Model

Use your own provider accounts and keys.

Do not rely on maintainer personal credentials
Keep AUTH_SECRET consistent across web and agents
Treat demo mode as optional (DEMO_MODE=true only when intentionally enabled)
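To check that AUTH_SECRET matches across the two env files, a small script along these lines may help. It assumes plain KEY=VALUE lines and is not the project's own verification logic. The demo writes temporary files so it runs anywhere. In a checkout you would pass `Path(".env")` and `Path("agents/.env")` instead.

```python
from pathlib import Path
import tempfile


def read_env_value(path: Path, key: str):
    """Return the value for key in a simple KEY=VALUE env file, or None."""
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip('"').strip("'")
    return None


def secrets_match(web_env: Path, agents_env: Path, key: str = "AUTH_SECRET") -> bool:
    """True only if the key is present in both files with the same value."""
    a, b = read_env_value(web_env, key), read_env_value(agents_env, key)
    return a is not None and a == b


# Demo with throwaway files and a made-up secret.
with tempfile.TemporaryDirectory() as tmp:
    web = Path(tmp) / "web.env"
    agents = Path(tmp) / "agents.env"
    web.write_text("AUTH_SECRET=example-secret\n")
    agents.write_text('# agents config\nAUTH_SECRET="example-secret"\n')
    aligned = secrets_match(web, agents)
```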

Key Commands

pnpm dev                        # Start all dev servers
pnpm lint                       # Lint all packages
pnpm typecheck                  # Type-check all packages
pnpm test                       # Run tests
pnpm db:migrate                 # Run Supabase migrations
pnpm setup                      # Bootstrap env files
pnpm setup:verify               # Verify API connections
./scripts/dev-start.sh          # Full setup + launch (recommended)
./scripts/docker-start.sh       # Docker local DB + dev servers
./scripts/docker-start.sh --db-only  # Docker DB only (no dev servers)

Agent tests:

cd agents
uv run pytest

API Groups

Reasoning

POST /api/thinking — Extended thinking request
POST /api/thinking/stream — SSE streaming for thinking deltas
POST /api/fork — ThinkFork parallel reasoning
POST /api/verify — PRM step-by-step verification
POST /api/got — Graph of Thoughts reasoning
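A client consuming the streaming endpoint needs to split the response into Server-Sent Events. The parser below follows the standard SSE line format (`event:` and `data:` fields, blank line as separator). The event name `thinking_delta` and the payload text are made up for illustration, since this README does not document the stream's actual event names or payload shape.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format strips a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


sample = [
    "event: thinking_delta",  # hypothetical event name
    "data: step one",
    "",
    ": keep-alive",
    "data: step two",
    "",
]
events = list(parse_sse(sample))
```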

Sessions and Artifacts

GET/POST /api/sessions — List/create sessions
GET /api/sessions/[sessionId]/nodes — Get thinking nodes
GET /api/reasoning/[id] — Get reasoning node details

Swarm

POST /api/swarm — Initiate multi-agent swarm
GET /api/swarm/token — WebSocket auth token
POST /api/swarm/[sessionId]/checkpoint — Human-in-the-loop checkpoint

Memory

GET/POST /api/memory — Hierarchical memory operations
GET/POST /api/insights — Metacognitive insights

Tech Stack

Layer	Technology
LLM	Claude Opus 4.6 (50K extended thinking budget)
Dashboard	Next.js 16, React 19, Tailwind CSS 4, shadcn/ui
Agent Swarm	Python 3.12, FastAPI, Anthropic SDK, NetworkX
Database	Supabase (PostgreSQL + pgvector with HNSW indexes)
Visualization	@xyflow/react (react-flow)
Deployment	Vercel (dashboard) + Fly.io (agents)
Runtime	Node.js 22+, TypeScript 5.7+
Testing	Vitest 4, Playwright, pytest

Research Foundation

Implemented concepts are grounded in:

Paper	Module	Key Contribution
Tree of Thoughts (Yao et al., 2023)	ThinkFork	BFS/DFS search over reasoning trees with state evaluation
Let's Verify Step by Step (Lightman et al., 2023)	PRM Verifier	Process supervision — verify each reasoning step independently
Graph of Thoughts (Besta et al., 2023)	GoT Engine	Arbitrary thought graph topology with aggregation and refinement
MemGPT (Packer et al., 2023)	Memory Hierarchy	3-tier memory hierarchy with paging and auto-eviction

Documentation Map

Visual feature guide: docs/features.md
Canonical docs index: docs/README.md
PRD: docs/prd.md
Architecture: docs/architecture.md
Runbooks: docs/runbooks/
Historical docs archive: docs/archive/build-history/

Contributing

Contributions are welcome.

Contribution guide: CONTRIBUTING.md
Code of conduct: CODE_OF_CONDUCT.md

Priority areas:

Reasoning quality and evaluation rigor
Setup ergonomics and onboarding
Lifecycle and experiment UX
Reliability and observability

Built By

Ozzy — AI Engineer & Full-Stack Developer

TUEL AI — AI Research Platform

Claude — AI Research Partner (Anthropic)

A human + AI collaboration exploring persistent reasoning artifacts.

License

MIT
