Important
🚀 Help Wanted!
I'm looking for contributors to help with Vera.
Please:
- Clone the repo
- Make improvements or fixes
- Push them back by opening a Pull Request (PR)
Any help is appreciated — thank you!
Vera is still very much in development. If you have any issues running or using the code, please post an issue and I will get back to you as soon as possible. Please don't be surprised if something doesn't work or is unfinished.
Core Mechanisms: Self-Modification Engine, Multi-Agent Cognition, Proactive Reflection, and Execution Engine (SMMAC-PBR-XE)
📺 Follow the above link for an 8-minute video overview of Vera.
🎧 Listen to the Podcast - A 40-minute deep-dive podcast discussing the architecture of Vera.
Warning
Vera has high system requirements
At least 16GB of idle system RAM (or a GPU with 12GB+ VRAM), and 12 physical cores (24 hyperthreaded) running at 3GHz+.
Please check the System Requirements section for more info
Note
Vera utilises the Agentic-Stack-POC repository
To bootstrap the various services required for Vera, we have built an AI development framework called Agentic-Stack-POC. It's not required, but it is recommended.
- Core Components
- Component Deep Dive
- Agents
Vera is an advanced, model-agnostic, multi-agent AI architecture inspired by principles from cognitive science and agent-based systems. It integrates a framework combining short-term, long-term, and archival memory, token prediction, task triage, reasoning, proactive background cognition, self-modification, and modular tool execution to deliver flexible, intelligent automation.
While many AI tools exist online, Vera was created to address a specific gap: a self-hosted, locally-running AI system that doesn't require cloud infrastructure or external API dependencies. The motivation has multiple dimensions:
Self-Sovereignty: Run everything locally on hardware you control. No data leaves your machine unless you explicitly configure external integrations. This provides privacy and independence from third-party service availability.
Local Compute Exploration: Vera is an exploration of how far modern LLMs can be pushed when given proper context, memory systems, and tool integration—all running on local hardware. The architecture demonstrates that sophisticated autonomous behavior doesn't require massive cloud resources; it requires smart architecture.
Cost Efficiency: After initial hardware investment, there are no ongoing API costs or subscription fees. For users running intensive workloads, this can represent significant savings compared to cloud-based solutions.
Customization & Control: Full source code control allows deep customization for your specific needs. Add custom tools, agents, and memory structures without vendor restrictions. Self-modification capabilities enable the system to evolve autonomously within your environment.
Research & Experimentation: Vera serves as a testbed for exploring multi-agent architectures, memory systems, and reasoning patterns. The modular design makes it suitable for academic research and experimental AI development.
The Art of the Possible: Ultimately, Vera exists to answer the question: given unrestricted context, persistent memory, proper tool integration, and architectural sophistication—how far can we push local AI systems? The answer is further than many assume.
Vera orchestrates multiple large language models (LLMs), specialized AI sub-agents and tools synchronously to tackle complex, high-level user requests. It decomposes broad tasks into discrete, manageable steps, then dynamically plans and executes these steps through various external and internal tools to achieve comprehensive outcomes.
This distributed agent design enables parallel specialization—some agents focus on rapid query response, others on strategic forward planning—while sharing a unified memory and goal system to maintain coherence across operations.
A hallmark of Vera's architecture is its capacity for proactive background processing. Autonomous sub-agents continuously monitor context and system state, coordinating via dynamic focus prioritization. This allows Vera to handle perceptual inputs, data processing, and environmental interactions adaptively, even without direct user prompts, enabling it to enrich its own memories and progress toward long-term goals.
Vera grounds its intelligence in a highly structured, multi-layered memory system (Layers 1-4) that mirrors human cognition by separating volatile context from persistent knowledge. This memory uses a hybrid storage model: the Neo4j Knowledge Graph stores entities and rich, typed relationships, while ChromaDB serves as a vector database for the full text content of documents, notes, and code, binding the textual information to its contextual network. All the while, Postgres keeps an immutable, versioned record of everything.
Vera is fundamentally designed for extensibility and seamless interaction with external systems, achieved through the Integration API Shim (IAS) and the Babelfish Translator (BFT), complemented by the ToolChain Executor (TCE). The IAS serves as a compatibility layer and API endpoint that allows Vera to mimic other LLM APIs (such as those provided by OpenAI's ChatGPT or Anthropic's Claude), enabling Vera to effectively take their place in existing workflows. Crucially, the IAS also allows those external LLM APIs to interface with Vera’s systems, which means Vera can effectively share and pull in context from the external models while simultaneously allowing them access to Vera’s own structured memory and context.
All task execution, whether internal or external, is orchestrated by the ToolChain Executor (TCE), which dynamically plans and executes sequences using available tools. Complementing this execution framework is Babelfish, a universal communication toolkit that is protocol agnostic, enabling the agent to speak any digital protocol—from HTTP and WebSockets, to IRC, MQTT, and more—and to combine multiple carriers into hybrid tunnels. This robust tooling architecture ensures that Vera can operate within virtually any digital environment, providing flexibility for external service integration while preserving comprehensive context and system coherence.
Complementing Vera's cognitive capabilities is a comprehensive suite for autonomous evolution, which ensures the system transcends static programming and continuously improves itself. This suite is anchored by the Self Modification Engine (SME), which acts as a full CI/CD pipeline enabling program synthesis, empowering Vera to autonomously review, generate, and iteratively improve its own codebase, thereby extending its functionality without requiring manual reprogramming.
Further augmenting self-improvement are advanced tools like the Perceptron Forge (PF), which allows Vera to build new models from the fundamental building blocks of all AI models (perceptrons), alongside Model Overlays (currently in development) that provide the capability to overlay additional training onto existing models. By integrating these tools for code evolution and model synthesis, Vera achieves self-reflection and continuous evolution, maintaining adaptability and resilience across rapidly changing task demands and environments.
You can delegate multi-step, complex goals with a single command, confident that Vera will handle the underlying workflow:
- Single Command Delegation: Instead of manually running a sequence of tools or scripts, you can ask Vera to "Research the new compliance requirements, compare them against our current project code, and draft an executive summary of necessary changes."
- ToolChain Executor (TCE): Breaks vague goals into discrete steps (research, analysis, comparison, writing) and runs them across various tools dynamically
- Unified Output: Provides a single, comprehensive final output from multi-step processes
- Automatic Failure Detection: Initiate complex tasks knowing that if an external service or tool fails halfway through, Vera will automatically detect the failure
- Intelligent Replanning: Triggers replanning to recover and retry execution without requiring user intervention
- Resilient Execution: Ensures complex tasks continue despite temporary service disruptions
- Instant Toolkit Expansion: Expand Vera's operational toolkit instantly by defining new data ingestion methods or specialized analysis tools
- Immediate Integration: New tools are immediately incorporated into Vera's toolset for use in any future complex plan
- Flexible System Growth: Adapt Vera's capabilities to evolving project requirements without system overhaul
You can query Vera for knowledge that requires bridging information across months or years of separate interactions, mimicking comprehensive associative memory:
- Cross-Temporal Queries: Pose questions that require connecting information from disparate sources, such as, "Review our Q4 debugging logs and connect any memory consumption issues to the initial architectural discussions we had back in Q1."
- Graph-Accelerated Search: Leverages search across the Knowledge Graph Memory (KGM) to retrieve full relational context
- Contextual Understanding: Goes beyond simple text matches to provide comprehensive relational insights
- Memory Explorer (MX): Visually traverse Vera's entire knowledge graph to audit or explore relationships between entities, concepts, and projects. Visualise entire systems as dynamic relational graphs: from IP ranges to codebases, you can break down the architecture and properties of any digital system Vera has interacted with.
- Relationship Mapping: Enables broad or targeted traversal of the knowledge graph for comprehensive understanding
- Visual Knowledge Navigation: Provides intuitive exploration of complex information relationships
- Micro Buffer Filtering: Ensures Vera's attention is filtered and focused only on the current working set of information (e.g., current function, recent variables)
- Real-Time Cognitive Processing: Enables efficient, real-time processing during complex reasoning tasks
- Context-Aware Focus: Maintains relevant context while filtering out unrelated information
You can rely on Vera to monitor long-term goals and offer guidance or intervention during system downtime:
- Long-Term Goal Tracking: Set long-term goals and trust Vera to track them in the background
- Proactive Background Cognition (PBC): Generates proactive thoughts (reminders, hypotheses, or plans) when critical deadlines approach or inconsistencies are detected
- Timely Intervention: Delivers alerts without requiring user prompting, ensuring proactive support
- Idle Time Optimization: Benefits from Vera using idle time to enrich its own memories, detect inconsistencies, or prepare for future complex operations
- Continuous Improvement: Leads to more contextually aware and timely responses through ongoing system learning
- Autonomous Knowledge Enrichment: Observes Vera's self-directed learning and preparation activities
You can initiate or observe continuous, autonomous improvement within the AI itself:
- Self Modification Engine (SME): Creates a non-static system that evolves over time
- Autonomous Optimization: Detects performance bottlenecks (e.g., vector search latency) and autonomously generates optimized code
- Validated Deployment: Tests and validates improvements before deployment, resulting in an AI that fixes its own bugs and improves its own speed over time
- Meta Buffer Utilization: Observe Vera using the Meta Buffer to recognize its own knowledge gaps when faced with novel problems
- Strategic Learning Plans: Generates targeted learning strategies like creating research roadmaps or identifying necessary research papers before problem-solving
- Informed Problem-Solving: Ensures Vera approaches novel challenges with appropriate preparation and research
You can integrate Vera into virtually any digital environment:
- Babelfish (BFT) Integration: Integrate data streams from any digital protocol, allowing Vera to manage complex environments
- Protocol Versatility: Works with both modern APIs (HTTP, WebSockets) and older protocols (IRC, MQTT)
- Comprehensive Environment Management: Handles mixed-technology environments seamlessly
- Multi-Modal Tunnels: Leverage Babelfish to create hybrid tunnels combining different communication protocols
- Resilient Data Paths: Build robust networked data paths that can adapt to various protocol requirements
- Novel Network Architectures: Create custom communication solutions tailored to specific operational needs
CPU Build (Linux)
- CPU: 12+ cores (24 hyperthreaded) @ 3GHz+
- RAM: 16GB–32GB (or up to 150GB for large deployments)
- HDD: 100GB
- GPU: None
GPU Build (Linux)
- CPU: Varies by workload
- RAM: 8GB system + 14–150GB VRAM
- HDD: 100GB
- GPU: 14–150GB VRAM (NVIDIA recommended)
- RAM: Determines how many models can run simultaneously. 16GB minimum runs single models; 32GB+ enables parallel agent execution.
- CPU cores: Each agent requires ~1–2 cores. More cores allow higher parallelism. Hyperthreading counts as 0.5 cores each for planning purposes (see the sizing sketch after this list).
- VRAM: GPU builds allow running larger models (20B–70B parameters). CPU-only builds use quantized models (3B–13B).
- Storage: Accommodates Neo4j database, ChromaDB vector store, and model weights.
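To make the sizing guidance concrete, here is a minimal sketch of the parallelism arithmetic; the 0.5-core weighting for hyperthreads and the ~1–2 cores per agent come from the notes above, while the helper function itself is hypothetical:

```python
# Hypothetical sizing helper based on the planning rules above:
# hyperthreads count as 0.5 cores each, and each agent needs ~1-2 cores.
def estimate_max_agents(physical_cores: int, hyperthreads: int,
                        cores_per_agent: float = 1.5) -> int:
    effective_cores = physical_cores + 0.5 * hyperthreads
    return int(effective_cores // cores_per_agent)

# Example: 12 physical cores + 12 hyperthreads = 18 effective cores,
# which supports roughly 12 agents at 1.5 cores each.
print(estimate_max_agents(12, 12))  # -> 12
```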
| Tier | CPU | RAM | Storage | VRAM | Use Case |
|---|---|---|---|---|---|
| Basic | 8 cores | 16GB | 100GB | — | Development & testing |
| Standard | 12+ cores | 32GB | 100GB | — | Production (CPU-only) |
| Advanced | 16+ cores | 64GB | 200GB | 14GB+ | GPU-accelerated |
| Enterprise | 24+ cores | 150GB+ | 500GB+ | 80GB+ | Large-scale deployment |
If you have fewer resources, Vera runs with reduced capability:
- 16GB RAM + CPU only: Single fast model, no parallelism
- 8+ physical cores: Suitable for background processing, not real-time queries
- Smaller SSD: Start with one small model (3–7B parameters)
Note
Vera is compatible with Windows; however, detailed configuration instructions are provided only for Linux, WSL, and macOS. Windows users may need to adapt the setup process accordingly.
Vera requires several external services. You have two options:
Option A: Automated Setup (Recommended)
Use the companion Agentic Stack POC to bootstrap all services via Docker:
```bash
git clone https://github.com/BoeJaker/AgenticStack-POC
cd AgenticStack-POC
docker compose up
```

This starts:
- Neo4j server (port 7687)
- Ollama with pre-configured models (port 11434)
- ChromaDB (port 8000)
- Supporting UIs
Option B: Manual Setup
Install required services individually:
```bash
# Install Ollama
curl https://ollama.ai/install.sh | sh

# Start Ollama and pull models
ollama serve &
ollama pull gemma2
ollama pull mistral:7b
ollama pull gpt-oss:20b

# Install Neo4j (see: https://neo4j.com/download)
# Installation varies by OS

# ChromaDB installs via pip (see below)
```
```bash
# Clone the repository
git clone https://github.com/BoeJaker/Vera-AI
cd Vera-AI

# Use Makefile for automated installation
make install-system      # Install system dependencies
make install-python      # Create virtual environment
make install-deps        # Install Python dependencies
make install-browsers    # Install browser drivers
make setup-env           # Create environment configuration
make verify-install      # Verify installation
```

```bash
make full-install        # Complete installation process
```

1. Clone the repository
```bash
git clone https://github.com/BoeJaker/Vera-AI
cd Vera-AI
```

2. Create virtual environment
```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. Install dependencies

```bash
pip install -r requirements.txt
```

Key dependencies:

- `chromadb` – Vector database for semantic memory
- `playwright` – Browser automation and web scraping
- `requests` – HTTP client
- `tqdm`, `rich` – Terminal UI enhancements
- `python-dotenv` – Environment configuration
- `llama-index` – LLM framework
4. Install browser drivers

```bash
playwright install
```

5. Configure environment variables

```bash
cp .env.example .env
# Edit .env with your settings:
# - Neo4j connection URL and credentials
# - Ollama API endpoint
# - API keys for external services (optional)
```

Complete Installation:
```bash
make full-install          # One-command full installation
```

Step-by-Step Installation:

```bash
make install-system        # Install system dependencies
make install-python        # Setup Python virtual environment
make install-deps          # Install Python packages
make install-browsers      # Install Playwright browsers
make setup-env             # Create environment file
make verify-install        # Validate installation
```

Development Installation:

```bash
make dev-install           # Includes development dependencies
```

Verification & Troubleshooting:

```bash
make verify-install            # Check all components
make check-services            # Verify required services
make install-fix-permissions   # Fix file permissions if needed
```

After installation, configure your environment:
```bash
# Copy and edit environment template
make setup-env

# Edit the generated .env file
nano .env
```

Required environment variables:
```bash
# Neo4j Database
NEO4J_URL=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password

# Ollama LLM Service
OLLAMA_BASE_URL=http://localhost:11434

# ChromaDB Vector Database
CHROMADB_HOST=localhost
CHROMADB_PORT=8000
```

Verify installation:

```bash
make verify-install
```

Check individual services:

```bash
make check-services
```

Test imports:

```bash
python3 -c "from vera import Vera; print('✓ Vera imported successfully')"
```

If you see the checkmark, installation succeeded. If you get an error, check that all system services are running:
```bash
# Check Neo4j
curl http://localhost:7474

# Check Ollama
curl http://localhost:11434/api/tags

# Check ChromaDB
curl http://localhost:8000/api/v1/heartbeat
```

Common issues:
```bash
# Permission errors
make install-fix-permissions

# Missing dependencies
make install-system

# Browser installation issues
make install-browsers-force

# Environment setup
make setup-env-reset
```

Service health checks:
```bash
make check-health      # Comprehensive health check
make check-neo4j       # Check Neo4j connection
make check-ollama      # Check Ollama service
make check-chromadb    # Check ChromaDB status
```

Clone into a working directory:

```bash
mkdir ./VeraAI
cd ./VeraAI
git clone https://github.com/BoeJaker/Vera-AI.git
cp -r ./Vera-AI ./Vera
```

Run Vera from the command line:

```bash
cd <your/path/to/VeraAI>
python -m Vera.vera
```

Run the Chat UI API:

```bash
cd <your/path/to/VeraAI>
python -m Vera.ChatUI.api.vera_api.py
# Opens on localhost:8000
```

Open a browser and visit http://localhost:8000
```python
from vera import Vera

# Initialize Vera agent system
vera = Vera(chroma_path="./vera_agent_memory")

# Query Vera with a simple prompt
for chunk in vera.stream_llm(vera.fast_llm, "What is the capital of France?"):
    print(chunk, end="")

# Use toolchain engine for complex queries
complex_query = "Schedule a meeting tomorrow and send me the list of projects."
result = vera.execute_tool_chain(complex_query)
print(result)
```

Or via Docker Compose:

```bash
docker compose up
```

Use these flags at the command line when starting Vera:
| Flag | Description |
|---|---|
| `--triage-memory` | Enable triage agent memory of past interactions |
| `--forgetful` | No memories will be saved or recalled this session |
| `--dumbledore` | Won't respond to questions |
| `--replay` | Replays the last plan |
Use these commands with / prefix in chat:
| Command | Purpose |
|---|---|
| `/help` | Display available commands |
| `/status` | Show system status and resource usage |
| `/memory-stats` | Display memory layer statistics |
| `/agents-list` | List active agents and their status |
| `/tools-list` | Show available tools |
| `/config` | Display current configuration |
Vera's configuration is controlled via environment variables in `.env`:

LLM Configuration

```bash
FAST_LLM_MODEL=mistral:7b
INTERMEDIATE_LLM_MODEL=gemma2
DEEP_LLM_MODEL=gpt-oss:20b
OLLAMA_API_BASE=http://localhost:11434
```

Memory Configuration

```bash
NEO4J_URL=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password
CHROMA_PATH=./vera_agent_memory
```

Performance Configuration

```bash
MAX_PARALLEL_TASKS=4
CPU_PINNING=false
NUMA_ENABLED=false
```
A Multi-Level Hierarchy
Vera’s architecture distinguishes between LLMs and Agents, each operating at multiple levels of complexity and capability to handle diverse tasks efficiently.
- Encoders: Extremely light models specialized to encode text. They parse all data sent to the vector store.
LLMs are the foundational language engines performing natural language understanding and generation. Vera uses several LLMs, each specialized by size, speed, and reasoning ability:
- Fast LLMs: Smaller, generalized text models, optimized for quick, straightforward responses.
- Intermediate LLMs: Larger generalized text models that balance speed and reasoning capacity.
- Deep LLMs: Large, resource-intensive text models suited for complex reasoning and extended dialogues.
- Specialized Reasoning LLMs: Models fine-tuned or architected specifically for heavy logical textual processing and multi-step deduction.
Each LLM level provides different trade-offs between speed, resource use, and depth of reasoning. Models can be upgraded in place: when a new model is released it is plug-and-play, so to speak, and memories carry over as if nothing changed.
- Lower-level LLMs handle quick, direct responses and routine tasks.
- Higher-level LLMs monitor overall goals, manage focus, and coordinate lower-level LLMs' activities.
- LLMs at different levels are selected dynamically depending on task complexity and required depth of reasoning (a routing sketch follows below).
This multi-level, hierarchical approach allows Vera to balance responsiveness with deep cognitive abilities, making it a flexible and powerful autonomous AI system.
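To illustrate what this dynamic selection could look like, here is a minimal sketch of a tier router; the thresholds, the `complexity_score` heuristic, and the mapping to model names are illustrative assumptions, not Vera's actual routing logic (the model names themselves come from the configuration section above):

```python
# Illustrative tier router: score a query, then pick the cheapest tier
# whose threshold covers it. Thresholds and heuristic are assumptions.
TIERS = [
    (0.3, "mistral:7b"),   # fast: quick, direct responses
    (0.7, "gemma2"),       # intermediate: balanced reasoning
    (1.0, "gpt-oss:20b"),  # deep: complex, multi-step reasoning
]

def complexity_score(query: str) -> float:
    """Crude proxy: longer, multi-clause queries score higher."""
    clauses = query.count(",") + query.count(" and ") + 1
    return min(1.0, len(query) / 500 + 0.1 * clauses)

def select_model(query: str) -> str:
    score = complexity_score(query)
    for threshold, model in TIERS:
        if score <= threshold:
            return model
    return TIERS[-1][1]

print(select_model("What time is it?"))  # routes to the fast tier
```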
Agents are LLM instances configured with augmented capabilities, including memory management, tool integration, task triage, and autonomous goal setting. Vera’s agents also exist at multiple levels:
- Triage Agents: Lightweight agents responsible for prioritizing tasks and delegating work among other agents or tools.
- Tool Agents: Lightweight agents using fast LLMs to handle immediate simple tool invocations.
- Strategic Agents: Deep-level agents running large LLMs tasked with long-term planning, proactive reflection, and orchestrating complex tool chains.
- Specialized Agents: Agents with domain-specific expertise or enhanced reasoning modules, capable of focused tasks like code generation, calendar management, or data analysis.
These LLMs & Agents can communicate via shared memory and coordinate through a dynamic focus-prioritization system.
- Micro Models: Tiny models, specialized to complete one task or read a particular dataset. Can be built and trained on a case-by-case basis. Capable of massive parallel reasoning.
Model Overlays: Allows you to overlay additional training onto existing models.
For users running models locally, Vera is built for seamless integration with Ollama, which is required as a system dependency.
- Plug-and-Play Architecture: Vera features a plug-and-play design, meaning that models can be upgraded in-place. When a new LLM is released (such as a newer version of `gemma2` or `gpt-oss`), it can be swapped in without system downtime.
- Memory Continuity: Crucially, when an underlying LLM is swapped or upgraded, the system's deep knowledge base remains intact. The framework ensures that memories will carry over as if nothing changed. This is managed by Vera's highly structured, multi-layered memory system (Layers 1-4) that uses Neo4j for contextual relationships and ChromaDB as a vector database for full text content.
Vera provides comprehensive compatibility with external LLM services, enabling users to employ their existing API keys or favourite third-party models, such as OpenAI's ChatGPT or Anthropic's Claude.
This capability is facilitated by the Integration API Shim (IAS). The IAS is a compatibility layer and API endpoint that serves two critical functions for external integration:
- Working In-Place of External APIs: The IAS allows Vera to mimic other LLM APIs. This means Vera can effectively take the place of services like ChatGPT or Claude in existing workflows, routing requests through the Vera framework while providing the expected API response format.
- Handover and Chat History Retention: The IAS also permits these external LLM APIs to interface with Vera's systems. This allows you to hand over all LLM tasks to an external service while ensuring that the external model retains access to the vast, structured knowledge base and persistent chat history managed by the Vera framework (Knowledge Graph Memory and vector stores). This prevents the loss of complex, cross-sessional context, enabling deep reasoning regardless of the model or API currently in use.
All top-level components are designed to run standalone or together as a complete framework.
CEO - Central Executive Orchestrator
#in-development #poc-working
Responsible for routing requests to the correct agent and creating, destroying & allocating system resources via workers.
PBC - Proactive Background Cognition
#in-development #poc-working
Responsible for coordinating long-term goals and short-term focus, and delivering actionables during downtime.
TCE - Toolchain Engine
#in-development #poc-working
Breaks down complex tasks into achievable steps then executes the plan using tools, built in or called via an MCP server.
CKG - Composite Knowledge Graph
#in-development #poc-working
Stores memories and their relationships in vector and graph stores.
Systematically enriches information stored within the graph
BFT - Babelfish Translator
#production #poc-working
A protocol-agnostic communication tool with encryption. Facilitates arbitrary webserver creation, ad-hoc network protocol comms, and VPN construction.
IAS - Integration API Shim
#production #poc-working
Allows Vera to mimic other LLM APIs. Also allows those same APIs to interface with Vera's systems.
SME - Self Modification Engine
#in-development
A full CI/CD pipeline for Vera to review and edit its own code.
PF - Perceptron Forge
#in-development
Allows Vera to build new models from the fundamental building blocks of all AI models - perceptrons.
EP - Edit Pipeline
#in-development
Version control for edits the AI makes to files, settings, etc.
CUI - Chat UI
A web UI to chat with the triage agent, with full-duplex speech synthesis and chat logs.
SUI - Schedule UI
A web UI for the scheduling agent. View and manage your calendar and Vera's calendar, and chat with the scheduling agent.
OUI - Orchestrator UI
A web UI for management of the orchestrator
TCEUI - ToolChain Engine UI
A standalone UI for managing the ToolChain Engine
MX - Memory Explorer
A web UI enabling broad or targeted traversal of the knowledge graph. The graph contains more than just memories; it's a network of relationships. For example, if Vera has interacted with a network, a map of that network will be navigable in the explorer. If Vera has navigated a website, it and all its resources will be mapped into the graph. This allows you to navigate these systems in a visually appealing and data-rich form.
GUI - Graph UI
A web component for monitoring graph events of any scale.
Task scheduler & worker orchestrator
Purpose: Heart of Vera. Collects performance data, queues user input, and allocates or creates resources locally or in remote worker pools.
Capabilities:
- Identifies tasks and steps that can execute in parallel
- Schedules execution when resources are available
- Manages local and remote worker pools
- Queues requests when resources are exhausted
- Provides real-time performance metrics and resource utilization
Example workflow:
1. Query received: "Scan network and analyze results"
2. Query triaged to Toolchain Engine
3. TCE requests: network scanner + analysis LLM
4. CEO: network scanner available → allocate
5. CEO: all analysis LLMs busy → queue request
6. Network scan runs
7. Analysis LLM becomes free
8. CEO dequeues and provides resources
9. TCE receives scan results for analysis
Configuration:
```python
ceo = CEOOrchestrator(
    max_parallel_tasks=4,
    max_queue_depth=20,
    resource_polling_interval=0.5  # seconds
)
```

Proactive Background Cognition Documentation
Vera maintains a Proactive Focus Manager that continuously evaluates system priorities, context, and pending goals. During idle moments, it generates proactive thoughts—such as reminders, hypotheses, or plans—that enhance its understanding and readiness for future interactions.
Purpose: Autonomous background thinking engine generating actionable tasks during idle moments.
Capabilities:
- Continuously monitors project context and pending goals
- Generates proactive thoughts (reminders, hypotheses, plans)
- Validates proposed actions using fast LLM
- Executes validated actions through toolchain
- Maintains focus board tracking progress, ideas, actions, issues
- Non-blocking scheduling with configurable intervals
- Detect inconsistencies or gaps in knowledge
- Anticipate user needs
- Prepare for complex multi-step operations
- Improve self-awareness and performance over time
Features:
- Context-aware task generation: Pulls context from multiple providers (conversation history, focus board, custom sources)
- LLM-driven reasoning: Uses deep LLM to generate actionable next steps
- Action validation: Fast LLM validates executability before acting
- Distributed execution: Integrates with local pools, remote HTTP workers, and Proxmox nodes
- Focus tracking: Maintains board showing progress, next steps, ideas, actions, issues
- Non-blocking scheduling: Periodic autonomous ticks with configurable intervals
It is designed to integrate seamlessly with local, remote, and Proxmox-based worker nodes, providing a distributed, scalable, and high-throughput execution environment.
Configuration:
```python
pbc = ProactiveBackgroundCognition(
    tick_interval=60,  # Check every 60 seconds
    context_providers=[ConversationProvider(), FocusBoardProvider()],
    max_parallel_thoughts=3,
    action_validation_threshold=0.8
)
```

Memory Documentation ⚠
Memory Schema
The Vera agent is powered by a sophisticated, multi-layered memory system known as the Composite Knowledge Graph. Designed to mirror human cognition, this architecture separates volatile context from persistent knowledge, enabling both coherent real-time dialogue and deep, relational reasoning over a vast, self-curated knowledge base. The system is built on a core principle: ChromaDB vectorstores hold the raw textual content, the Neo4j graph maps the relationships and context between them, while the Postgres database stores an immutable ledger of changes over time, system logs, and telemetry records.
Vera's memory is structured into four distinct storage layers plus an external fifth. Excluding Layer 5, each layer contains or is derived from data in the previous layer, and each serves a specific purpose in the cognitive process:
- Layer 1: Short-Term Buffer - The agent's immediate conversational context.
- Layer 2: Working Memory - Vera's private scratchpad for a single task, session, or memory. Gives Vera a place to think, make notes, and plan.
- Layer 3: Long-Term Knowledge - A persistent snapshot of Vera's entire mind: an interconnected library of interactions, facts, and insights. This is how Vera can quickly derive insights from large datasets.
- Layer 4: Temporal Archive - A complete, immutable record of activity logs, metrics, codebase changes, graph changes. Allowing you to 'scroll' back through the entire history of Vera.
- Layer 5: External Knowledge Bases - Dynamic networked data stores. Web documentation, APIs, Git repos. Allows Vera to extend its graph beyond its own boundaries.
A key advanced capability, the Memory Buffer, can dynamically bridge Layers to enable unified, cross-sessional, highly enriched reasoning.
- Purpose: To maintain the immediate context of the active conversation, ensuring smooth and coherent multi-turn dialogue. This is a volatile, rolling window of recent events. It will contain system prompts, user input, the last n chat history entries, vector store matches & NLP data.
- Implementation: A simple in-memory buffer (e.g., a list of the last 10-20 message exchanges). This data is transient and is not persisted to any database. A minimal sketch follows this list.
- Content: Raw chat history between the user and the agent.
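As a minimal sketch of what such a rolling buffer can look like (class and method names are illustrative, not Vera's actual implementation):

```python
from collections import deque

class ShortTermBuffer:
    """Volatile rolling window of recent exchanges (Layer 1).
    Nothing is persisted; old entries simply fall off the end."""

    def __init__(self, max_exchanges: int = 20):
        self.window = deque(maxlen=max_exchanges)

    def add(self, role: str, content: str) -> None:
        self.window.append({"role": role, "content": content})

    def context(self) -> list:
        # Merged upstream with system prompts, vector-store matches,
        # and NLP data when assembling the final prompt.
        return list(self.window)

buf = ShortTermBuffer(max_exchanges=10)
buf.add("user", "What is the capital of France?")
buf.add("assistant", "Paris.")
print(buf.context())
```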
- Purpose: To provide an isolated "scratchpad" for the agent's internal monologue, observations, and findings during a specific task, problem, session, or recollection. This allows for exploratory thinking.
- Implementation:
  - Neo4j (Structure): A `Session` node is created and linked to relevant entities in the main graph (e.g., `(Session)-[:FOCUSED_ON]->(ProjectX)`).
  - ChromaDB (Content): A dedicated Chroma collection (`session_<id>`) is created to store the full text of the agent's thoughts, notes, and relevant snippets generated during this session.
- Content: Agent's "thoughts," observed facts, code snippets, and summarizations. All data is scoped to the session's task.
- Purpose: To serve as the agent's persistent, semantically searchable library of validated knowledge. This is the core of its "intelligence," built over time through a careful process of promotion and curation.
- Implementation: Layers 1 and 2 are continually promoted into Layer 3 before session end.
  - Vector Database - ChromaDB (Content & Semantic Search): The primary `long_term_docs` collection stores the full text of all important information: documents, code examples, notes, and promoted "thoughts." Each entry contains metadata that points back to the Neo4j graph.
  - Knowledge Graph - Neo4j (Context & Relationships): The graph stores all memories, entities & insights (e.g., `Project`, `Document`, `Person`, `Feature`, `Memory`) and the rich, typed relationships between them (e.g., `USES`, `AUTHORED_BY`, `CONTAINS`). It does not store large text bodies, only pointers to them in Chroma. See Memory Schema for more information on types.
- How It Works (Basic Retrieval), sketched in code after this list:
  1. A semantic query is performed on the `long_term_docs` Chroma collection.
  2. The search returns the most relevant text passages and their metadata, including a `neo4j_id`.
  3. This ID is used to fetch the corresponding node and its entire network of relationships from Neo4j.
  4. The agent receives both the retrieved text and its full relational context, enabling deep, multi-hop reasoning.
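A minimal sketch of this retrieval flow using the `chromadb` and `neo4j` Python clients; the `long_term_docs` collection and `neo4j_id` metadata key follow the description above, while the connection details, graph schema, and ID convention are assumptions:

```python
import chromadb
from neo4j import GraphDatabase

chroma = chromadb.PersistentClient(path="./vera_agent_memory")
docs = chroma.get_or_create_collection("long_term_docs")
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "your_password"))

def retrieve_with_context(query: str, k: int = 3) -> list:
    # Steps 1-2: semantic search returns passages plus metadata (incl. neo4j_id)
    hits = docs.query(query_texts=[query], n_results=k)
    results = []
    with driver.session() as session:
        for text, meta in zip(hits["documents"][0], hits["metadatas"][0]):
            # Step 3: use the stored neo4j_id to pull the node's relationships
            record = session.run(
                "MATCH (n) WHERE elementId(n) = $id "
                "OPTIONAL MATCH (n)-[r]-(m) "
                "RETURN n, collect([type(r), m]) AS rels",
                id=meta["neo4j_id"],
            ).single()
            # Step 4: the agent gets both the text and its graph neighbourhood
            results.append({"text": text, "graph": record})
    return results
```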
- Purpose: To provide an immutable, historical record of all agent interactions for auditing, debugging, and future model training. It also allows the system to 'scroll back in time' for the entire graph, just a particular subgraph, section or node.
- Implementation: Postgres captures and archives all data and changes flowing through the system: sessions, queries, memory creations, links, unlinks, deletions, promotion events, and more. An optional JSONL stream can act as a backup log. A minimal ledger sketch follows below.
- Content: Raw, timestamped logs of all system activity.
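A minimal sketch of an append-only ledger write with `psycopg2`; the table name and columns are our assumptions (rows are only ever inserted, preserving immutability):

```python
import json
import psycopg2

conn = psycopg2.connect("dbname=vera user=vera password=your_password host=localhost")

def archive_event(kind: str, payload: dict) -> None:
    """Append-only: archive rows are inserted, never updated or deleted."""
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO archive_events (ts, kind, payload) VALUES (now(), %s, %s)",
            (kind, json.dumps(payload)),
        )

archive_event("memory_promotion", {"node": "Insight-42", "session": "session-123"})
```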
- Purpose: External source of truth
- Implementation: HTTP / API calls to external services, via requests, to resolve data from archives like Wikipedia, DNS records, OHLCV data, OWASP, etc.
- Content: Typically JSON blobs
Promotion is the key mechanism for learning. It transforms ephemeral session data into permanent, connected knowledge.
- Identification: At the moment all content is promoted to Layer 3; selective promotion is on the roadmap.
- Curation: The agent creates a new `Memory`, `Entity` or `Insight` node in the Neo4j graph.
- Linking: This new node is parsed with NLP & linked via relationships to all relevant entities (e.g., `(Insight)-[:ABOUT]->(Project)`, `(Insight)-[:DERIVED_FROM]->(Document)`).
- Storage: The full text of the "thought" is inserted into the session's Chroma collection. The metadata for this entry includes the ID of the new Neo4j node (`neo4j_id: <memory_node_id>`), permanently binding the text to its contextual graph. A code sketch of this flow follows below.
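Putting the four steps together, here is a minimal sketch of a promotion, reusing the `driver` and `chroma` clients from the retrieval sketch above; the node labels, relationship types, and collection naming follow the description, but the specifics are assumptions:

```python
def promote_thought(session_id: str, text: str, project: str) -> None:
    # Curation + Linking: create the Insight node and attach it to context
    with driver.session() as session:
        node_id = session.run(
            "MERGE (p:Project {name: $project}) "
            "CREATE (i:Insight {created: datetime()})-[:ABOUT]->(p) "
            "RETURN elementId(i) AS id",
            project=project,
        ).single()["id"]
    # Storage: full text goes to Chroma, bound to the graph via neo4j_id
    collection = chroma.get_or_create_collection(f"session_{session_id}")
    collection.add(
        documents=[text],
        ids=[f"{session_id}-{node_id}"],
        metadatas=[{"neo4j_id": node_id}],
    )
```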
- Conversation happens -> Stored in Layer 1 (Short-Term Buffer).
- Agent thinks/acts -> Thoughts stored in Layer 2 (Working Memory Chroma + Graph links).
- Valuable insight is made -> Promoted to Layer 3 (LTM Chroma + Graph context).
- Cross-sessional query asked -> Macro Buffer orchestrates a search across LTM and relevant Session stores via Graph-Accelerated Search.
- Everything is recorded -> Logged to Layer 4 (Archive).
This architecture ensures Vera can fluidly operate in the moment while continuously building a structured, retrievable, and intelligent knowledge base, capable of learning from its entire lived experience.
Vera employs a sophisticated three-tier memory buffer system that operates at different scales of retrieval and reasoning, enabling seamless cognitive processing across temporal and conceptual dimensions.
Think of them as three zoom lenses focusing memory retrieval and processing to the required scale
#in-development
The Micro Buffer is always active and serves as the real-time cognitive workspace—managing the immediate context and attention span during active reasoning and task execution.
- Purpose: To maintain optimal cognitive load by dynamically managing the active working set of information. It filters, prioritizes, and sequences relevant memories for the current task moment-by-moment.
- How it Works:
  - Attention Scoring: Continuously scores available memories based on recency, relevance to current task, and relationship strength
  - Cognitive Load Management: Limits active context to 7±2 chunks to prevent overload (Miller's Law implementation)
  - Real-time Pruning: Drops low-relevance information and promotes high-value context as tasks evolve
  - Focus Tracking: Maintains attention on the most salient entities and relationships during complex reasoning
  - NLP Processing: Extracts key information and meaning from text and stores them as relationships in the knowledge graph, i.e. triplets, URLs, filepaths, references, and entities like person or technology. It can also parse code into relational trees.
- Technical Implementation:
```cypher
// Micro Buffer maintains focus stack during reasoning
MATCH (current:Task {id: $task_id})
MATCH (current)-[:HAS_FOCUS]->(focus_entity)
WITH focus_entity
MATCH (focus_entity)-[r*1..2]-(related)
WHERE r.relevance_score > 0.7
RETURN related
ORDER BY r.relevance_score DESC
LIMIT 15  // Working memory constraint
```

- Example Usage: When debugging code, the Micro Buffer automatically maintains focus on the current function, related variables, and recent stack traces while filtering out unrelated project documentation.
#in-development
The Macro Buffer serves as the connective tissue between cognitive sessions—enabling holistic reasoning across time and context boundaries.
- Purpose: To break down the isolation between sessions, allowing Vera to connect ideas, hypotheses, and information that were originally recorded in different contexts. This is the foundation for associative reasoning and holistic problem-solving.
- How it Works:
  - Graph-Accelerated Search: Uses Neo4j to efficiently find relevant sessions and entities across time
  - Multi-Collection Vector Search: Performs targeted semantic search across relevant session collections
  - Temporal Pattern Recognition: Identifies sequences and evolution of ideas across sessions
  - Context Bridging: Creates conceptual bridges between seemingly disconnected sessions
- Technical Implementation:
```cypher
// Macro Buffer: Cross-sessional associative retrieval
MATCH (s:Session)-[:HAS_TOPIC|FOCUSED_ON]->(topic)
WHERE topic.name =~ "(?i).*authentication.*"
WITH collect(DISTINCT s.session_id) as relevant_sessions
MATCH (idea:Concept)-[r:EVOLVED_FROM|RELATED_TO*1..3]-(connected)
WHERE idea.session_id IN relevant_sessions
RETURN idea, connected, r
ORDER BY r.temporal_weight DESC
```

- Benefit: It allows Vera to answer complex, cross-sessional questions like, "What were all the challenges we faced when integrating service X?" by pulling together notes from initial research, debugging logs, and the final summary document.
#in-development
The Meta Buffer operates as the executive control system—managing higher-order reasoning about reasoning itself, strategic planning, and self-modeling.
- Purpose: To enable Vera to reason about its own cognitive processes, identify knowledge gaps, and strategically plan learning and problem-solving approaches.
- How it Works:
  - Cognitive Pattern Recognition: Identifies recurring reasoning patterns, successful strategies, and common failure modes
  - Knowledge Gap Analysis: Detects missing information, contradictory knowledge, and underspecified concepts
  - Strategic Planning: Generates learning agendas, research plans, and problem-solving roadmaps
  - Self-Modeling: Maintains and updates Vera's understanding of its own capabilities and limitations
- Technical Implementation:
```cypher
// Meta Buffer: Strategic reasoning and gap analysis
MATCH (capability:Capability {name: $current_task})
MATCH (capability)-[r:REQUIRES|BENEFITS_FROM]->(required_knowledge)
OPTIONAL MATCH (vera:SelfModel)-[has:HAS_KNOWLEDGE]->(required_knowledge)
WITH required_knowledge,
     CASE WHEN has IS NULL THEN 1 ELSE 0 END as knowledge_gap,
     r.importance as importance
WHERE knowledge_gap = 1
RETURN required_knowledge.name as gap,
       importance,
       "Learning priority: " + toString(importance) as recommendation
ORDER BY importance DESC
```

- Example Usage: When faced with a novel problem, the Meta Buffer might identify that Vera lacks understanding of quantum computing concepts, then generate and execute a learning plan that includes reading research papers, running simulations, and seeking expert knowledge.
The three buffers work in concert to create a balanced & comprehensive cognitive experience:
```
Micro Buffer (Tactical)    → Manages immediate working context
        ↑ ↓
Macro Buffer (Operational) → Connects cross-sessional knowledge
        ↑ ↓
Meta Buffer (Strategic)    → Guides long-term learning and reasoning
```
Real-world Example: Complex Problem-Solving
- Meta Buffer identifies Vera needs to learn about blockchain for a new project
- Macro Buffer retrieves all past sessions mentioning cryptography, distributed systems, and related concepts
- Micro Buffer manages the immediate context while Vera reads documentation, runs code examples, and tests understanding
- Meta Buffer updates Vera's knowledge base with new blockchain capabilities
- Macro Buffer connects this new knowledge to existing financial and security concepts
- Micro Buffer applies the integrated knowledge to solve the original problem
This hierarchical buffer system enables Vera to operate simultaneously at tactical, operational, and strategic levels—maintaining focus while building comprehensive understanding and planning for future challenges.
This creates a coherent hierarchy where:
- Micro = Immediate working memory and attention
- Macro = Cross-sessional associative memory
- Meta = Strategic reasoning and self-modeling
Each buffer operates at a different temporal and conceptual scale while working together to enable sophisticated, multi-layered cognitive processing.
Discovery - Promotion - Recall - Enrichment - Continuous Evaluation - Decay - Archiving
Planned feature
The Cartographer of Consciousness: Mapping the Labyrinth of Thought
Memory Explorer Documentation
Knowledge Graph Documentation
Knowledge Bases Documentation
Start with:

```bash
python3 memory_explorer.py
```

Web UI for traversing the knowledge graph:
The Memory Explorer (MX) is an operational web UI enabling broad or targeted traversal of the knowledge graph. It is a powerful administrative and analytical tool that serves as the observatory for Vera's cognitive landscape. The graph contains far more than internal memories; it maps the living topology of networks of relationships extracted from all data Vera processes—including networks, databases, codebases, and external websites. By visualizing this deep structure, the Explorer provides users with unprecedented serviceability and observability over the ingested data, enabling the direct derivation of complex insights and auditing of Vera's operational history.
Purpose: To transform complex, multi-layered memory structures into interactive, navigable knowledge graphs. It bridges the abstract relationships within Vera's mind with tangible visual representations, making the architecture of intelligence both accessible and explorable.
Capabilities (Serviceability and Administration):
- Enabling LLM Questioning and Deep Reasoning: The Explorer visually maps the full relational context retrieved from the Neo4j graph, allowing users to understand and audit the scope of knowledge available for an LLM query. This ensures the agent is enabled to perform deep, multi-hop reasoning.
- Insight Derivation: Facilitates both macro-scale pattern recognition and micro-scale relationship analysis. Users can trace idea genealogies across sessions and identify emerging knowledge clusters.
- Cognitive Observability and Auditing: Reveals the living topology of memory, exposing how concepts connect, how knowledge evolves over time, and how different memory layers interact. It also provides a window into Version-Aware Telemetry, allowing monitoring of performance metrics (e.g., vector search latency, memory usage) tagged with code versions.
- Cross-Sessional Exploration: Supports the retrieval of relevant knowledge and historical sessions via Graph-Accelerated Search, effectively breaking down isolation between contexts for comprehensive associative recall.
Features (Interactive Mapping of Ingested Data):
- Website and API Mapping: Visualizing Layer 5 External Knowledge Bases, which include dynamic networked data stores like web documentation, APIs, and Git repos. This allows users to navigate the resources and relationships Vera has extracted from external websites and APIs.
- Code and Data Structure Graphing: Rendering relational trees parsed from code, where the Micro Buffer's NLP processing extracts key relationships (like triplets, URLs, filepaths, references, entities) and stores them in the graph.
- Database and Schema Exploration: Visualizing relationships between entities and insights stored within the Neo4j Knowledge Graph. The graph stores entities (`Project`, `Document`, `Person`, `Feature`) and the rich, typed relationships between them (e.g., `USES`, `AUTHORED_BY`, `CONTAINS`).
- Historical Timeline Review: Displaying the contents of the Layer 4 Temporal Archive, which is an immutable record of activity logs, metrics, codebase changes, and graph changes. This allows users to 'scroll back in time' through the entire history of Vera.
- Self-Modification Traceability: Visualizing the rationale and impact of autonomous changes, including change records, test results, and performance impact managed by the Self Modification Engine (SME).
Automated Multi-Step Tool Orchestration
Warning
Vera has unrestricted access to Bash & Python execution out of the box
Please be very careful with what you ask for; there is nothing stopping it from running `rm -rf /`. Alternatively, disable these two tools.
ToolChain Engine Documentation
The ToolChain orchestrates the planning and execution of complex workflows by chaining together multiple tools available to the agent. It leverages a deep language model (LLM) to dynamically generate, execute, and verify a sequence of tool calls tailored to solving a user query.
This forms the core of an intelligent, multi-tool orchestration framework that empowers the agent to decompose complex queries into manageable actions, execute them with error handling, and iteratively improve results through self-reflection.
- Planning: Generates a structured plan in JSON format, specifying which tools to call and what inputs to provide, based on the query and historical context.
- Execution: Runs each tool in sequence, supports referencing outputs from previous steps (`{prev}`, `{step_n}`), and handles errors with automatic replanning.
- Memory Integration: Saves intermediate outputs and execution context to the agent's memory for continuity and accountability.
- Result Validation: Uses the LLM to verify if the final output meets the original goal, triggering replanning if necessary.
- Reporting: Summarizes all executed tool chains, providing insight into past queries, plans, and outcomes.
| Method | Description |
|---|---|
| `__init__(agent, tools)` | Initializes the planner with a reference to the agent and its toolset. Loads chat history for context. |
| `plan_tool_chain(query, history_context="")` | Generates a JSON-formatted plan of tool calls for the given query, optionally incorporating prior step outputs as context. |
| `execute_tool_chain(query)` | Executes the planned tool chain step-by-step, resolves references to previous outputs, manages errors, and ensures the goal is met via iterative replanning if needed. |
| `save_to_memory(user_msg, ai_msg="")` | Stores interactions and outputs to the agent's memory buffer for context continuity. |
| `report_history()` | Produces a summarization report of all tool chains executed so far, highlighting queries, plans, results, and patterns. |
- Planning Phase: It decides the best style of plan for the problem, then constructs a prompt describing available tools and the user query, requesting the LLM to generate a JSON array that outlines the sequence of tool calls and their inputs.
- Execution Phase: Each tool is invoked in order. Inputs referencing outputs from prior steps (e.g., `{step_1}`, `{prev}`) are resolved to the actual results. Errors in execution trigger automatic recovery plans via replanning.
- Validation & Retry: After all steps, the planner prompts the LLM to review whether the final output meets the query's goal. If not, the planner automatically retries with a revised plan.
- Memory & Reporting: All intermediate results and plans are saved to memory for transparency and to aid future planning. The report function provides a concise summary of past activity for audit or review.
- Dynamic, Context-Aware Planning: Selects the type of plan & plans tool usage tailored to the problem, reusing historical outputs intelligently.
- Error Resilience: Automatically detects and recovers from tool failures or incomplete results.
- Extensible & Modular: Works with any tool exposed by the agent, provided they follow a callable interface.
- Traceability: Detailed logging and memory save steps ensure all decisions and outputs are recorded.
Purpose: Breaks down complex tasks into achievable steps and executes plans using integrated tools.
Capabilities:
- Generates structured JSON execution plans
- Supports multiple planning strategies: Batch, Step, and Hybrid
- Multiple execution strategies: Sequential, Parallel, Speculative
- Error handling with automatic replanning
- Result validation against original goal
- Full execution history logging
Planning Strategies:
Batch Planning: Generate entire plan upfront

```json
[
  { "tool": "WebSearch", "input": "latest AI trends 2024" },
  { "tool": "WebSearch", "input": "generative AI applications" },
  { "tool": "SummarizerLLM", "input": "{step_1}\n{step_2}" }
]
```

Step Planning: Generate next step based on prior results

```json
[
  { "tool": "WebSearch", "input": "authenticate with OAuth2" }
]
// After step 1 completes, generate the next step
```

Hybrid Planning: Mix of upfront and adaptive planning
Execution Strategies:

- Sequential: Execute steps one at a time (safe, traceable)
- Parallel: Execute independent steps concurrently (faster); a minimal sketch follows below
- Speculative: Run multiple possible next steps, then prune based on validation (advanced)
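As a minimal sketch of the parallel strategy (the step format follows the JSON plans above; treating "no placeholder references" as independence is a simplification of ours, not the planner's actual dependency analysis):

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel_steps(steps, tools):
    """Run plan steps with no inter-step references concurrently.
    Steps whose input mentions {prev} or {step_n} still need sequencing."""
    independent = [s for s in steps if "{" not in s["input"]]
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(tools[s["tool"]], s["input"]) for s in independent]
        return [f.result() for f in futures]

# e.g. both WebSearch steps of the batch plan above run concurrently,
# and the SummarizerLLM step runs afterwards, sequentially.
```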
Example usage:
```python
planner = ToolChainPlanner(agent, agent.tools)

# Execute a multi-tool workflow
query = "Retrieve latest weather for New York and generate a report"
final_output = planner.execute_tool_chain(query)
print("Result:", final_output)

# Generate execution history report
history = planner.report_history()
print(history)
```

Plan format expected from LLM:
```json
[
  { "tool": "SearchAPI", "input": "latest weather New York" },
  { "tool": "SummarizerLLM", "input": "{step_1}" }
]
```

Placeholders like `{step_1}` or `{prev}` are replaced with actual outputs during execution.
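A minimal sketch of how such placeholder resolution can work (the function is illustrative, not the planner's actual code):

```python
import re

def resolve_placeholders(template: str, outputs: list) -> str:
    """Replace {prev} and {step_n} with actual step outputs (1-indexed)."""
    if outputs:
        template = template.replace("{prev}", outputs[-1])
    return re.sub(
        r"\{step_(\d+)\}",
        lambda m: outputs[int(m.group(1)) - 1],
        template,
    )

outputs = ["sunny, 21C", "light traffic"]
print(resolve_placeholders("Summarize: {step_1} and {prev}", outputs))
# -> Summarize: sunny, 21C and light traffic
```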
```python
# Assume you have an initialized agent with tools and a deep LLM model

# Create the planner instance
planner = ToolChainPlanner(agent, agent.tools)

# Simple query example: plan and execute a multi-tool workflow
query = "Retrieve the latest weather for New York and generate a summary report."
final_output = planner.execute_tool_chain(query)
print("Final Output:", final_output)

# Generate a report summarizing all past toolchain executions
history_report = planner.report_history()
print("Execution History Report:\n", history_report)
```

Internal Tools
Local Tools
MCP Tools
- Agent & Tools Setup:
  `ToolChainPlanner` expects an `agent` object that exposes:
  - `deep_llm`: a language model instance with an `invoke(prompt: str) -> str` method for prompt completion.
  - `tools`: a list of tool objects, each having a `name` attribute and a callable interface (e.g., `run()`, `func()`, or `__call__`).
  - `buffer_memory`: an object that manages short-term chat history, providing context for planning and execution.
  - `save_to_memory(user_msg, ai_msg)`: method to record interaction steps and outputs.
- Tool Interface:
  Tools can be any callable entity that takes a single string input and returns a string output. This abstraction allows mixing LLM-based tools, APIs, or custom functions (see the sketch after this list).
- Plan Format:
  The planner expects the LLM to output a pure JSON list of objects like:
  ```json
  [
    { "tool": "SearchAPI", "input": "latest weather New York" },
    { "tool": "SummarizerLLM", "input": "{step_1}" }
  ]
  ```
  The planner replaces placeholders like `{step_1}` or `{prev}` with actual outputs during execution.
- Error Handling:
  If a tool execution fails or an output is missing, the planner automatically triggers a replanning phase to recover and retry.
- Extensibility:
  To add new tools, simply ensure they conform to the callable interface and add them to the agent's `tools` list. The planner will dynamically list them and can invoke them in plans.
- Logging & Debugging:
  The planner prints detailed step-by-step execution logs, useful for debugging the tool chain behavior and inspecting intermediate results.
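A minimal example of a tool conforming to this interface; the `name` attribute and the string-in/string-out contract come from the notes above, while the tool body is a stub of our own:

```python
import requests

class WeatherTool:
    """String-in, string-out callable, as the planner expects."""
    name = "WeatherAPI"

    def __call__(self, query: str) -> str:
        # Stub: a real tool would derive the location from the query
        resp = requests.get("https://wttr.in/New+York", params={"format": "3"})
        return resp.text

# Registering it makes it available to future plans:
# agent.tools.append(WeatherTool())
```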
This comprehensive toolset architecture enables Vera to break down high-level goals into concrete, manageable steps executed with precision across multiple domains, making it a powerful assistant in diverse environments.
Tools can be chained together dynamically by Vera’s Tool Chain Planner, which uses deep reasoning to break down complex queries into executable sequences.
A compatibility layer and API endpoint for Vera. Allows Vera to take the place of other LLM APIs like OpenAI's ChatGPT or Anthropic's Claude. It also allows these APIs to interface with the Vera framework.
Purpose: Compatibility layer allowing Vera to mimic other LLM APIs (OpenAI, Anthropic) and allowing those APIs to interface with Vera systems.
Capabilities:
- Vera responds as if it were ChatGPT, Claude, or other LLM APIs
- Drop-in replacement for OpenAI's Chat Completions API
- Stream and batch inference
- Token counting
- Embedding generation
Use case: Use existing OpenAI client libraries with Vera running locally:
```python
# This code thinks it's talking to OpenAI, but it's using local Vera
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000", api_key="vera")
response = client.chat.completions.create(
    model="vera-default",
    messages=[{"role": "user", "content": "Explain quantum entanglement"}]
)
print(response.choices[0].message.content)
```

The IAS means you can use Vera with any tool expecting an OpenAI-compatible API.
Babelfish is a universal communication toolkit for AI agents and distributed systems. It enables your agent to speak any digital protocol — from HTTP and WebSockets, to MQTT, SSH, IRC, LoRa, Matrix, Slack, and even experimental transports like WebRTC and QUIC/HTTP3.
At its core, Babelfish acts like a networking “translator”:
- Every protocol looks the same to the agent (`open → send → receive → close`); a sketch of this contract follows below.
- The agent can freely combine multiple carriers into hybrid tunnels (multi-modal VPNs).
- Protocols are grouped into layers, similar to a networking stack, for modularity and extensibility.
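To make the uniform lifecycle concrete, here is a minimal sketch of what a carrier interface could look like; this is our illustration of the `open → send → receive → close` contract, not Babelfish's actual API:

```python
from abc import ABC, abstractmethod

class Carrier(ABC):
    """Every protocol exposes the same four-step lifecycle."""

    @abstractmethod
    def open(self, endpoint: str) -> None: ...

    @abstractmethod
    def send(self, payload: bytes) -> None: ...

    @abstractmethod
    def receive(self, timeout: float = 5.0) -> bytes: ...

    @abstractmethod
    def close(self) -> None: ...

# A hybrid tunnel can then chain carriers (e.g. MQTT inside WebSockets),
# since the agent only ever sees this one interface.
```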
Autonomous Evolution Through Continuous Integration
Vera's self-modification capability represents a paradigm shift in AI architecture—enabling continuous, autonomous evolution of its own codebase through a sophisticated CI/CD pipeline that ensures reliability, traceability, and controlled innovation. This isn't mere code generation; it's a complete software development lifecycle managed by the AI itself.
```python
# Vera analyzes its own performance and identifies improvement opportunities
improvement_plan = vera.analyze_performance_gaps()
new_module = vera.generate_optimized_code(improvement_plan)

# Example: Vera identifies a bottleneck in memory retrieval
# Generates optimized vector search algorithm with proper error handling
```

- Pattern Recognition: Identifies inefficiencies, bugs, or missing features through continuous self-monitoring
- Context-Aware Generation: Creates code that integrates seamlessly with existing architecture and follows established patterns
- Multi-LLM Validation: Uses different LLM specializations for code generation, review, and optimization
1. Unit Test Generation → Auto-creates comprehensive test cases for new code
2. Integration Testing → Validates compatibility with existing modules
3. Performance Benchmarking → Ensures improvements meet efficiency targets
4. Safety & Security Scanning → Checks for vulnerabilities and ethical concerns
Automated Test Suite:
```python
class SelfModificationTestSuite:
    def test_backwards_compatibility(self):
        """Ensure new code doesn't break existing functionality"""
        assert existing_workflows_still_function()

    def test_performance_improvement(self):
        """Verify generated code meets performance targets"""
        assert new_algorithm.faster_than(previous_version)

    def test_memory_safety(self):
        """Check for memory leaks and resource management"""
        assert no_memory_leaks_detected()
```

Every autonomous code modification follows a structured version control process:
# Automated commit messages with context
git commit -m "feat(memory-optimizer): Vector search optimization v2.1.3
- Reduced latency by 42% through improved indexing
- Added fallback mechanisms for corrupted vector stores
- Maintains full backwards compatibility
- Generated by Vera-Agent #session-1756642265"

Version Tagging System:
- Workflow Versions: Every autonomous modification cycle receives a unique version tag
- Session Linking: Code changes reference the session and reasoning that prompted them
- Rollback Capability: Automatic snapshots enable instant reversion if issues detected
# Every modification is logged with full context
change_record = {
"version": "memory-optimizer-v2.1.3",
"timestamp": "2024-01-15T14:30:00Z",
"trigger": "performance_analysis_session_1756642265",
"rationale": "Vector search latency exceeding 200ms threshold",
"changes": {
"files_modified": ["/core/memory/vector_search.py"],
"tests_added": ["test_vector_search_optimization.py"],
"performance_impact": "42% latency reduction",
"compatibility": "full_backwards_compatible"
},
"validation_results": {
"unit_tests": "passed",
"integration_tests": "passed",
"performance_tests": "exceeded_targets",
"security_scan": "clean"
}
}

All self-modification activities are immutably logged to Layer 4 Archive with forensic-level detail:
Modification Records Include:
- Pre-modification State: Complete snapshot of codebase before changes
- Generation Context: LLM prompts, reasoning chains, and alternative approaches considered
- Validation Evidence: Test results, performance metrics, security scans
- Rollback Procedures: Automated scripts for reverting changes if needed
- Impact Analysis: Predicted and actual effects on system performance
{
"self_modification_event": {
"event_id": "sm-20240115-143000-1756642265",
"version_tag": "memory-optimizer-v2.1.3",
"initiating_session": "session-1756642265",
"trigger_condition": "vector_search_latency > 200ms",
"code_generation": {
"llm_used": "deep-reasoning-llm",
"prompt_context": "Optimize vector search while maintaining accuracy...",
"reasoning_chain": ["identified bottleneck", "researched algorithms", "selected approach"],
"alternatives_considered": 3
},
"testing_results": {
"unit_tests": {"passed": 15, "failed": 0},
"integration_tests": {"compatibility": "verified", "performance": "improved"},
"security_scan": {"vulnerabilities": 0, "warnings": 1}
},
"deployment_impact": {
"performance_change": "+42% speed",
"memory_usage": "-15%",
"accuracy_change": "+0% maintained"
}
}
}

Self-Modification Monitor 🛠️
──────────────────────────────
Current Version: memory-optimizer-v2.1.3
Active Modifications: 1
Tests Passing: 15/15
Performance Impact: +42% ✅
Rollback Ready: Yes
Recent Changes:
✅ 2024-01-15 14:30 - Vector search optimized
✅ 2024-01-15 11:20 - Memory caching improved
✅ 2024-01-14 16:45 - Error handling enhanced
Every workflow execution includes version metadata for precise performance tracking:
# All tool executions tagged with code versions
execution_context = {
"workflow_id": "weather-analysis-1756642300",
"code_versions": {
"memory_layer": "v3.2.1",
"vector_search": "v2.1.3", # Newly optimized version
"tool_orchestrator": "v1.5.2"
},
"performance_metrics": {
"vector_search_latency": "116ms", # Track improvement
"memory_usage": "45MB",
"accuracy_score": 0.94
}
}

- Automated Validation: Comprehensive test suites must pass
- Performance Gates: New code must meet or exceed performance thresholds
- Security Scanning: Static analysis and vulnerability detection
- Human-in-the-Loop (Optional): Critical changes can require human approval
- Gradual Rollout: Can deploy to staging environment first
def emergency_rollback(detected_issue):
    """Automated rollback if issues detected post-deployment"""
    if performance_degradation_detected() or errors_spiking():
        revert_to_previous_version()
        log_rollback_event(detected_issue)
        trigger_analysis_for_fix()

The self-modification system creates a virtuous cycle of improvement:
Performance Monitoring
→ Gap Identification
→ Code Generation
→ Validation Testing
→ Versioned Deployment
→ Impact Measurement
→ Further Optimization
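Expressed as a loop, the cycle could look like the following sketch. `analyze_performance_gaps` and `generate_optimized_code` appear in the examples above; `validate`, `deploy`, and `measure_impact` are hypothetical method names used for illustration:

```python
# Hedged sketch of the improvement cycle; validate/deploy/measure_impact
# are hypothetical names, not Vera's actual API.
while True:
    gaps = vera.analyze_performance_gaps()         # monitoring + gap identification
    if not gaps:
        break
    patch = vera.generate_optimized_code(gaps[0])  # code generation
    if vera.validate(patch):                       # validation testing
        version = vera.deploy(patch)               # versioned deployment
        vera.measure_impact(version)               # feeds the next iteration
```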
Continuous Evolution Metrics:
- Code Quality: Test coverage, complexity metrics, documentation completeness
- Performance Trends: Latency, accuracy, resource usage over versions
- Stability Indicators: Error rates, crash frequency, recovery times
- Adaptation Speed: Time from problem identification to deployed solution
This sophisticated self-modification framework transforms Vera from a static AI system into a continuously evolving intelligence that can adapt to new challenges, optimize its own performance, and maintain robust reliability through rigorous version control and comprehensive change tracking—all while providing complete observability into its evolutionary journey.
Purpose: Allow Vera to build new models from fundamental building blocks—enabling specialized micro-models for specific tasks.
Concept: Rather than only using pre-existing models, Vera can create custom models optimized for specific domains or tasks.
Planned capabilities:
- Automatic model architecture search
- Fine-tuning on domain-specific data
- Quantization and optimization
- Deployment as specialized agents
Purpose: Version control for all edits Vera makes to files, settings, and configurations.
Planned capabilities:
- Track all file modifications with timestamps and reasoning
- Enable rollback to previous file states
- Audit trail for compliance
- Collaborative merging if multiple agents edit same files
Vera's agent roster includes specialized sub-agents, each with defined responsibilities:
Triage Agent – Routes incoming tasks, prioritizes requests, delegates to appropriate agents or tools.
Planner Agent – Decomposes complex goals into actionable steps, generates execution plans.
Scheduler Agent – Manages task scheduling, handles dependencies, optimizes execution order.
Optimizer Agent – Refines workflows, improves performance, tunes parameters.
Evaluator Agent – Validates outputs, checks goal attainment, triggers refinement if needed.
Extractor Agent – Pulls structured information from unstructured text (documents, web pages, etc.).
Researcher Agent – Conducts information gathering, synthesizes findings, identifies trends.
Summarizer Agent – Condenses large texts into concise summaries at various detail levels.
Editor Agent – Refines writing, checks grammar, improves clarity and tone.
Model Trainer Agent – Fine-tunes models on domain-specific data (in development).
Model Builder Agent – Creates new model architectures from scratch (in development).
Security Analyzer Agent – Dynamic security analysis, penetration testing, vulnerability detection.
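For orientation, a request might flow through the roster roughly as follows; every method name in this sketch is hypothetical:

```python
# Hedged sketch of inter-agent flow; route/plan/execute_plan/evaluate/refine
# are hypothetical method names, not Vera's actual API.
task = "Research recent RISC-V adoption and write a two-paragraph summary"

agent = vera.triage_agent.route(task)          # Triage picks a specialist
plan = vera.planner_agent.plan(task)           # Planner decomposes the goal
result = agent.execute_plan(plan)              # e.g. Researcher + Summarizer steps
if not vera.evaluator_agent.evaluate(result):  # Evaluator checks goal attainment
    result = agent.execute_plan(vera.optimizer_agent.refine(plan))
```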
Ingestors work at the micro level, pulling data into Vera's memory systems:
Corpus Crawler – Maps corpuses (internet, local files, APIs) into memory structure. Analogous to "reading."
Network Ingestor – Scans networks, ingests topology and service information into memory.
Database Ingestor – Extracts schema and data from databases into Neo4j.
Context Ingestor – Gathers context from Layer 0 & 1 (short-term buffers) for enrichment.
Vera can be configured to trigger background thinking cycles during idle time:
# Trigger proactive background cognition
vera.focus_manager.run_proactive_cycle()

This generates new goals or alerts based on recent conversations and system state.
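To run this continuously, you could wrap the call in a simple idle-time timer. A hedged sketch; `system_is_idle` is a hypothetical helper, not part of Vera's API:

```python
import time

IDLE_CYCLE_SECONDS = 600  # run a proactive cycle every 10 minutes

while True:
    time.sleep(IDLE_CYCLE_SECONDS)
    if system_is_idle():  # hypothetical idle check
        vera.focus_manager.run_proactive_cycle()
```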
Vera supports streaming partial results from LLMs, improving user experience during long or complex queries:
for chunk in vera.stream_llm(vera.deep_llm, "Explain quantum computing."):
    print(chunk, end="")

You can add new tools by extending the load_tools method with new Tool objects, defining their name, function, and description.
Simple Example:
You can extend Vera's capabilities by adding new tools:
def load_tools(self):
    tools = super().load_tools()
    tools.append(
        Tool(
            name="WeatherAPI",
            func=lambda location: fetch_weather(location),
            description="Fetches current weather for a given location."
        )
    )
    return tools

Create specialized agents for your domain:
class DomainExpertAgent(Agent):
    """An agent specialized in your domain"""
    def __init__(self, name, llm, memory):
        super().__init__(name, llm, memory)
        self.expertise = "domain-specific-knowledge"

    def process_query(self, query):
        # Custom reasoning logic
        return self.llm.invoke(f"As a {self.expertise} expert: {query}")

# Register with Vera
vera.register_agent(DomainExpertAgent("expert", vera.deep_llm, vera.memory))

Create ingestors to pull data into Vera's memory:
class CustomIngestor(Ingestor):
    def ingest(self, source):
        """Extract data from source and insert into memory"""
        data = self.fetch_from_source(source)
        entities = self.parse_entities(data)
        relationships = self.extract_relationships(data)
        self.memory.bulk_insert_nodes(entities)
        self.memory.bulk_insert_relationships(relationships)

# Use it
vera.ingestors.append(CustomIngestor(vera.memory))

Vera is designed to be extensible and modular. Here are ways to contribute:
- Add new tools: Implement new `Tool` objects with clearly defined inputs and outputs.
- Improve memory models: Experiment with alternative vector DBs or memory encoding strategies.
- Enhance planning algorithms: Optimize or replace the tool chain planner for more efficient workflows.
- Expand self-modification capabilities: Enable more robust and safe code generation and auto-updating.
- Improve UX: Add richer streaming output, UI components, or integrations.
Vera runs entirely locally by default—no data is sent to external servers unless you explicitly configure external tools (APIs, web scraping, etc.).
Local Security Considerations:
- Vera has unrestricted Bash/Python execution. Only allow trusted users access.
- Memory databases (Neo4j, ChromaDB) should be behind authentication in multi-user deployments.
- Disable external tool access if processing sensitive data.
- Regularly review the Neo4j audit logs.
External Tool Integration:
If using external APIs:
- Secure API keys in `.env` files
- Use read-only API tokens where possible
- Monitor API call logs for unusual patterns
- Consider VPN/proxy for outbound connections
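A minimal sketch of the `.env` approach using python-dotenv; the variable name is illustrative:

```python
# Hedged sketch: keep API keys in a .env file rather than in source.
# Requires `pip install python-dotenv`; WEATHER_API_KEY is an illustrative name.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory
api_key = os.environ["WEATHER_API_KEY"]
```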
Data Retention:
- Layer 4 (Archive) stores everything permanently. Consider retention policies.
- Use `/memory-clear` to purge all long-term memories (irreversible).
- Export and back up Neo4j regularly:

neo4j-admin dump --database=neo4j --to=/backup/vera-backup.dump
Q: What LLMs does Vera support?
A: Vera is currently built around Ollama models (gemma2, gemma3, gpt-oss), but you can adapt it for any compatible LLM with a Python SDK or API.
Q: Can Vera run headless?
A: Yes. Vera is designed for command-line and backend automation, but can be integrated into GUIs or web apps.
Q: Is it safe for Vera to run self-modifying code?
A: Self-modification is sandboxed and requires careful review. Vera includes safeguards, but users should always review generated code before production use.
See the LICENSE file in the root directory of this project.
| Model Type | Example Models | Memory | Use Case | Status |
|---|---|---|---|---|
| Fast LLM | Mistral 7B, Gemma2 2B | 4-8GB | Triage, quick tasks | ✅ Supported |
| Intermediate | Gemma2 9B, Llama 8B | 8-16GB | Tool execution | ✅ Supported |
| Deep LLM | Gemma3 27B, GPT-OSS 20B | 16-32GB | Complex reasoning | ✅ Supported |
| Specialized | CodeLlama, Math models | Varies | Domain-specific | 🔄 Partial |
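For example, you might assign one model per tier from the table above using local Ollama tags. A hedged sketch; the dictionary keys are illustrative, not Vera's actual configuration schema:

```python
# Hedged sketch: one Ollama model tag per tier; adjust to the models you have pulled.
MODEL_ROLES = {
    "fast_llm": "gemma2:2b",          # triage, quick tasks
    "intermediate_llm": "gemma2:9b",  # tool execution
    "deep_llm": "gemma3:27b",         # complex reasoning
}
```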
TTS Pacing - On slower hardware, the TTS engine may speak faster than the LLM can generate text.
LLM Reasoning Not Visible - If an LLM has built-in reasoning (e.g. DeepSeek, gpt-oss), the reasoning is not displayed in the web or terminal UI, leading to a long gap between a query being accepted and an answer appearing.
- Windows configuration requires manual adaptation
- TTS pacing issues on slower hardware
- LLM reasoning not visible in UI for some models
- Resource-intensive on large knowledge graphs
For questions, feature requests, or help:
- GitHub Issues: https://github.com/BoeJaker/Vera-AI/issues
- GitHub Discussions: https://github.com/BoeJaker/Vera-AI/discussions
- Agentic Stack POC: https://github.com/BoeJaker/AgenticStack-POC
Planned Features:
- Multi-agent orchestration (POC complete)
- Persistent memory system (POC complete)
- Tool integration and planning (POC complete)
- Selective memory promotion (in development)
- Memory lifecycle management (planned)
- Micro models for specialized tasks (in development)
- Model overlay training system (in development)
- Self-modification engine (in development)
- Security analyzer agent (in development)
- Multi-modal I/O (Whisper speech-to-text, OCR, image recognition) (planned)
- Windows native support (planned)
- Distributed deployment (Kubernetes) (planned)
For detailed tracking, see the GitHub Projects board.
Last Updated: January 2025
Version: 1.0.0 (POC)

