Important
🚀 Help Wanted!
I'm looking for contributors to help with Vera.
Please:
- Clone the repo
- Make improvements or fixes
- Push them back by opening a Pull Request (PR)
Any help is appreciated — thank you!
Vera is still very much in development. If you have any issues running or using the code, please post an issue and I will get back to you as soon as possible. Please don't be surprised if something doesn't work or is unfinished.
Core Mechanisms: Self-Modification Engine, Multi-Agent Cognition, Proactive Reflection, and Execution Engine (SMMAC-PBR-XE)
📺 Follow the above link for an 8-minute video overview of Vera.
🎧 Listen to the Podcast - A 40-minute deep-dive podcast discussing the architecture of Vera.
Warning
Vera has high system requirements
At least 16GB of idle system RAM (or a GPU with 12GB+ VRAM), and 12 physical cores (24 hyperthreaded) running at 3GHz+.
Please check the System Requirements section for more info
Note
Vera utilises the Agentic-Stack-POC repository
To bootstrap the various services required for Vera, we have built an AI development framework called Agentic-Stack-POC. It's not required, but it is recommended.
- Core Components
- Component Deep Dive
- Agents
Vera is an advanced, model-agnostic, multi-agent AI architecture inspired by principles from cognitive science and agent-based systems. It integrates a framework combining short-term, long-term, and archival memory, token prediction, task triage, reasoning, proactive background cognition, self-modification, and modular tool execution to deliver flexible, intelligent automation.
While many AI tools exist online, Vera was created to address a specific gap: a self-hosted, locally-running AI system that doesn't require cloud infrastructure or external API dependencies. The motivation has multiple dimensions:
Self-Sovereignty: Run everything locally on hardware you control. No data leaves your machine unless you explicitly configure external integrations. This provides privacy and independence from third-party service availability.
Local Compute Exploration: Vera is an exploration of how far modern LLMs can be pushed when given proper context, memory systems, and tool integration—all running on local hardware. The architecture demonstrates that sophisticated autonomous behavior doesn't require massive cloud resources; it requires smart architecture.
Cost Efficiency: After initial hardware investment, there are no ongoing API costs or subscription fees. For users running intensive workloads, this can represent significant savings compared to cloud-based solutions.
Customization & Control: Full source code control allows deep customization for your specific needs. Add custom tools, agents, and memory structures without vendor restrictions. Self-modification capabilities enable the system to evolve autonomously within your environment.
Research & Experimentation: Vera serves as a testbed for exploring multi-agent architectures, memory systems, and reasoning patterns. The modular design makes it suitable for academic research and experimental AI development.
The Art of the Possible: Ultimately, Vera exists to answer the question: given unrestricted context, persistent memory, proper tool integration, and architectural sophistication—how far can we push local AI systems? The answer is further than many assume.
Vera orchestrates multiple large language models (LLMs), specialized AI sub-agents and tools synchronously to tackle complex, high-level user requests. It decomposes broad tasks into discrete, manageable steps, then dynamically plans and executes these steps through various external and internal tools to achieve comprehensive outcomes.
This distributed agent design enables parallel specialization—some agents focus on rapid query response, others on strategic forward planning—while sharing a unified memory and goal system to maintain coherence across operations.
A hallmark of Vera's architecture is its capacity for proactive background processing. Autonomous sub-agents continuously monitor context and system state, coordinating via dynamic focus prioritization. This allows Vera to handle perceptual inputs, data processing, and environmental interactions adaptively, even without direct user prompts, enabling it to enrich its own memories and progress toward long-term goals.
Vera grounds its intelligence in a highly structured, multi-layered memory system (Layers 1-4) that mirrors human cognition by separating volatile context from persistent knowledge. This memory uses a hybrid storage model: the Neo4j Knowledge Graph stores entities and rich, typed relationships, while ChromaDB serves as a vector database for the full text content of documents, notes, and code, binding the textual information to its contextual network. All the while, Postgres keeps an immutable, versioned record of everything.
Vera is fundamentally designed for extensibility and seamless interaction with external systems, achieved through the Integration API Shim (IAS) and the Babelfish Translator (BFT), complemented by the ToolChain Executor (TCE). The IAS serves as a compatibility layer and API endpoint that allows Vera to mimic other LLM APIs (such as those provided by OpenAI's ChatGPT or Anthropic's Claude), enabling Vera to effectively take their place in existing workflows. Crucially, the IAS also allows those external LLM APIs to interface with Vera’s systems, which means Vera can effectively share and pull in context from the external models while simultaneously allowing them access to Vera’s own structured memory and context.
All task execution, whether internal or external, is orchestrated by the ToolChain Executor (TCE), which dynamically plans and executes sequences using available tools. Complementing this execution framework is Babelfish, a universal communication toolkit that is protocol agnostic, enabling the agent to speak any digital protocol—from HTTP and WebSockets, to IRC, MQTT, and more—and to combine multiple carriers into hybrid tunnels. This robust tooling architecture ensures that Vera can operate within virtually any digital environment, providing flexibility for external service integration while preserving comprehensive context and system coherence.
Complementing Vera's cognitive capabilities is a comprehensive suite for autonomous evolution, which ensures the system transcends static programming and continuously improves itself. This suite is anchored by the Self Modification Engine (SME), which acts as a full CI/CD pipeline enabling program synthesis, empowering Vera to autonomously review, generate, and iteratively improve its own codebase, thereby extending its functionality without requiring manual reprogramming.
Further augmenting self-improvement are advanced tools like the Perceptron Forge (PF), which allows Vera to build new models from the fundamental building blocks of all AI models (perceptrons), alongside Model Overlays (currently in development) that provide the capability to overlay additional training onto existing models. By integrating these tools for code evolution and model synthesis, Vera achieves self-reflection and continuous evolution, maintaining adaptability and resilience across rapidly changing task demands and environments.
You can delegate multi-step, complex goals with a single command, confident that Vera will handle the underlying workflow:
- Single Command Delegation: Instead of manually running a sequence of tools or scripts, you can ask Vera to "Research the new compliance requirements, compare them against our current project code, and draft an executive summary of necessary changes."
- ToolChain Executor (TCE): Breaks vague goals into discrete steps (research, analysis, comparison, writing) and runs them across various tools dynamically
- Unified Output: Provides a single, comprehensive final output from multi-step processes
- Automatic Failure Detection: Initiate complex tasks knowing that if an external service or tool fails halfway through, Vera will automatically detect the failure
- Intelligent Replanning: Triggers replanning to recover and retry execution without requiring user intervention
- Resilient Execution: Ensures complex tasks continue despite temporary service disruptions
- Instant Toolkit Expansion: Expand Vera's operational toolkit instantly by defining new data ingestion methods or specialized analysis tools
- Immediate Integration: New tools are immediately incorporated into Vera's toolset for use in any future complex plan
- Flexible System Growth: Adapt Vera's capabilities to evolving project requirements without system overhaul
You can query Vera for knowledge that requires bridging information across months or years of separate interactions, mimicking comprehensive associative memory:
- Cross-Temporal Queries: Pose questions that require connecting information from disparate sources, such as, "Review our Q4 debugging logs and connect any memory consumption issues to the initial architectural discussions we had back in Q1."
- Graph-Accelerated Search: Leverages search across the Knowledge Graph Memory (KGM) to retrieve full relational context
- Contextual Understanding: Goes beyond simple text matches to provide comprehensive relational insights
- Memory Explorer (MX): Visually traverse Vera's entire knowledge graph to audit or explore relationships between entities, concepts, and projects. Visualise entire systems as dynamic relational graphs: from IP ranges to codebases, you can break down the architecture and properties of any digital system Vera has interacted with.
- Relationship Mapping: Enables broad or targeted traversal of the knowledge graph for comprehensive understanding
- Visual Knowledge Navigation: Provides intuitive exploration of complex information relationships
- Micro Buffer Filtering: Ensures Vera's attention is filtered and focused only on the current working set of information (e.g., current function, recent variables)
- Real-Time Cognitive Processing: Enables efficient, real-time processing during complex reasoning tasks
- Context-Aware Focus: Maintains relevant context while filtering out unrelated information
You can rely on Vera to monitor long-term goals and offer guidance or intervention during system downtime:
- Long-Term Goal Tracking: Set long-term goals and trust Vera to track them in the background
- Proactive Background Cognition (PBC): Generates proactive thoughts (reminders, hypotheses, or plans) when critical deadlines approach or inconsistencies are detected
- Timely Intervention: Delivers alerts without requiring user prompting, ensuring proactive support
- Idle Time Optimization: Benefits from Vera using idle time to enrich its own memories, detect inconsistencies, or prepare for future complex operations
- Continuous Improvement: Leads to more contextually aware and timely responses through ongoing system learning
- Autonomous Knowledge Enrichment: Observes Vera's self-directed learning and preparation activities
You can initiate or observe continuous, autonomous improvement within the AI itself:
- Self Modification Engine (SME): Creates a non-static system that evolves over time
- Autonomous Optimization: Detects performance bottlenecks (e.g., vector search latency) and autonomously generates optimized code
- Validated Deployment: Tests and validates improvements before deployment, resulting in an AI that fixes its own bugs and improves its own speed over time
- Meta Buffer Utilization: Observe Vera using the Meta Buffer to recognize its own knowledge gaps when faced with novel problems
- Strategic Learning Plans: Generates targeted learning strategies like creating research roadmaps or identifying necessary research papers before problem-solving
- Informed Problem-Solving: Ensures Vera approaches novel challenges with appropriate preparation and research
You can integrate Vera into virtually any digital environment:
- Babelfish (BFT) Integration: Integrate data streams from any digital protocol, allowing Vera to manage complex environments
- Protocol Versatility: Works with both modern APIs (HTTP, WebSockets) and older protocols (IRC, MQTT)
- Comprehensive Environment Management: Handles mixed-technology environments seamlessly
- Multi-Modal Tunnels: Leverage Babelfish to create hybrid tunnels combining different communication protocols
- Resilient Data Paths: Build robust networked data paths that can adapt to various protocol requirements
- Novel Network Architectures: Create custom communication solutions tailored to specific operational needs
CPU Build (Linux)
- CPU: 12+ cores (24 hyperthreaded) @ 3GHz+
- RAM: 16GB–32GB (or up to 150GB for large deployments)
- HDD: 100GB
- GPU: None
GPU Build (Linux)
- CPU: Varies by workload
- RAM: 8GB system + 14–150GB VRAM
- HDD: 100GB
- GPU: 14–150GB VRAM (NVIDIA recommended)
- RAM: Determines how many models can run simultaneously. 16GB minimum runs single models; 32GB+ enables parallel agent execution.
- CPU cores: Each agent requires ~1–2 cores. More cores allow higher parallelism. Hyperthreading counts as 0.5 cores each for planning purposes (see the sizing sketch after this list).
- VRAM: GPU builds allow running larger models (20B–70B parameters). CPU-only builds use quantized models (3B–13B).
- Storage: Accommodates Neo4j database, ChromaDB vector store, and model weights.
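To make the sizing guidance concrete, here is a minimal sketch of the parallelism arithmetic; the 0.5-core weighting for hyperthreads and the ~1–2 cores per agent come from the notes above, while the helper function itself is hypothetical:

```python
# Hypothetical sizing helper based on the planning rules above:
# hyperthreads count as 0.5 cores each, and each agent needs ~1-2 cores.
def estimate_max_agents(physical_cores: int, hyperthreads: int,
                        cores_per_agent: float = 1.5) -> int:
    effective_cores = physical_cores + 0.5 * hyperthreads
    return int(effective_cores // cores_per_agent)

# Example: 12 physical cores + 12 hyperthreads = 18 effective cores,
# which supports roughly 12 agents at 1.5 cores each.
print(estimate_max_agents(12, 12))  # -> 12
```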
| Tier | CPU | RAM | Storage | VRAM | Use Case |
|---|---|---|---|---|---|
| Basic | 8 cores | 16GB | 100GB | — | Development & testing |
| Standard | 12+ cores | 32GB | 100GB | — | Production (CPU-only) |
| Advanced | 16+ cores | 64GB | 200GB | 14GB+ | GPU-accelerated |
| Enterprise | 24+ cores | 150GB+ | 500GB+ | 80GB+ | Large-scale deployment |
If you have fewer resources, Vera runs with reduced capability:
- 16GB RAM + CPU only: Single fast model, no parallelism
- 8+ physical cores: Suitable for background processing, not real-time queries
- Smaller SSD: Start with one small model (3–7B parameters)
Note
Vera is compatible with Windows; however, detailed configuration instructions are provided only for Linux, WSL, and macOS. Windows users may need to adapt the setup process accordingly.
Vera requires several external services. You have two options:
Option A: Automated Setup (Recommended)
Use the companion Agentic Stack POC to bootstrap all services via Docker:
```bash
git clone https://github.com/BoeJaker/AgenticStack-POC
cd AgenticStack-POC
docker compose up
```

This starts:
- Neo4j server (port 7687)
- Ollama with pre-configured models (port 11434)
- ChromaDB (port 8000)
- Supporting UIs
Option B: Manual Setup
Install required services individually:
```bash
# Install Ollama
curl https://ollama.ai/install.sh | sh

# Start Ollama and pull models
ollama serve &
ollama pull gemma2
ollama pull mistral:7b
ollama pull gpt-oss:20b

# Install Neo4j (see: https://neo4j.com/download)
# Installation varies by OS

# ChromaDB installs via pip (see below)
```
```bash
# Clone the repository
git clone https://github.com/BoeJaker/Vera-AI
cd Vera-AI

# Use Makefile for automated installation
make install-system      # Install system dependencies
make install-python      # Create virtual environment
make install-deps        # Install Python dependencies
make install-browsers    # Install browser drivers
make setup-env           # Create environment configuration
make verify-install      # Verify installation
```

```bash
make full-install        # Complete installation process
```

1. Clone the repository
```bash
git clone https://github.com/BoeJaker/Vera-AI
cd Vera-AI
```

2. Create virtual environment
```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. Install dependencies

```bash
pip install -r requirements.txt
```

Key dependencies:

- `chromadb` – Vector database for semantic memory
- `playwright` – Browser automation and web scraping
- `requests` – HTTP client
- `tqdm`, `rich` – Terminal UI enhancements
- `python-dotenv` – Environment configuration
- `llama-index` – LLM framework
4. Install browser drivers

```bash
playwright install
```

5. Configure environment variables

```bash
cp .env.example .env
# Edit .env with your settings:
# - Neo4j connection URL and credentials
# - Ollama API endpoint
# - API keys for external services (optional)
```

Complete Installation:
```bash
make full-install          # One-command full installation
```

Step-by-Step Installation:

```bash
make install-system        # Install system dependencies
make install-python        # Setup Python virtual environment
make install-deps          # Install Python packages
make install-browsers      # Install Playwright browsers
make setup-env             # Create environment file
make verify-install        # Validate installation
```

Development Installation:

```bash
make dev-install           # Includes development dependencies
```

Verification & Troubleshooting:

```bash
make verify-install            # Check all components
make check-services            # Verify required services
make install-fix-permissions   # Fix file permissions if needed
```

After installation, configure your environment:
```bash
# Copy and edit environment template
make setup-env

# Edit the generated .env file
nano .env
```

Required environment variables:
```bash
# Neo4j Database
NEO4J_URL=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password

# Ollama LLM Service
OLLAMA_BASE_URL=http://localhost:11434

# ChromaDB Vector Database
CHROMADB_HOST=localhost
CHROMADB_PORT=8000
```

Verify installation:

```bash
make verify-install
```

Check individual services:

```bash
make check-services
```

Test imports:

```bash
python3 -c "from vera import Vera; print('✓ Vera imported successfully')"
```

If you see the checkmark, installation succeeded. If you get an error, check that all system services are running:
```bash
# Check Neo4j
curl http://localhost:7474

# Check Ollama
curl http://localhost:11434/api/tags

# Check ChromaDB
curl http://localhost:8000/api/v1/heartbeat
```

Common issues:
```bash
# Permission errors
make install-fix-permissions

# Missing dependencies
make install-system

# Browser installation issues
make install-browsers-force

# Environment setup
make setup-env-reset
```

Service health checks:
```bash
make check-health      # Comprehensive health check
make check-neo4j       # Check Neo4j connection
make check-ollama      # Check Ollama service
make check-chromadb    # Check ChromaDB status
```

Clone into a working directory:

```bash
mkdir ./VeraAI
cd ./VeraAI
git clone https://github.com/BoeJaker/Vera-AI.git
cp -r ./Vera-AI ./Vera
```

Run Vera from the command line:

```bash
cd <your/path/to/VeraAI>
python -m Vera.vera
```

Run the Chat UI API:

```bash
cd <your/path/to/VeraAI>
python -m Vera.ChatUI.api.vera_api.py
# Opens on localhost:8000
```

Open a browser and visit http://localhost:8000
```python
from vera import Vera

# Initialize Vera agent system
vera = Vera(chroma_path="./vera_agent_memory")

# Query Vera with a simple prompt
for chunk in vera.stream_llm(vera.fast_llm, "What is the capital of France?"):
    print(chunk, end="")

# Use toolchain engine for complex queries
complex_query = "Schedule a meeting tomorrow and send me the list of projects."
result = vera.execute_tool_chain(complex_query)
print(result)
```

Or via Docker Compose:

```bash
docker compose up
```

Use these flags at the command line when starting Vera:
| Flag | Description |
|---|---|
| `--triage-memory` | Enable triage agent memory of past interactions |
| `--forgetful` | No memories will be saved or recalled this session |
| `--dumbledore` | Won't respond to questions |
| `--replay` | Replays the last plan |
Use these commands with / prefix in chat:
| Command | Purpose |
|---|---|
| `/help` | Display available commands |
| `/status` | Show system status and resource usage |
| `/memory-stats` | Display memory layer statistics |
| `/agents-list` | List active agents and their status |
| `/tools-list` | Show available tools |
| `/config` | Display current configuration |
Vera's configuration is controlled via environment variables in `.env`:

LLM Configuration

```bash
FAST_LLM_MODEL=mistral:7b
INTERMEDIATE_LLM_MODEL=gemma2
DEEP_LLM_MODEL=gpt-oss:20b
OLLAMA_API_BASE=http://localhost:11434
```

Memory Configuration

```bash
NEO4J_URL=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password
CHROMA_PATH=./vera_agent_memory
```

Performance Configuration

```bash
MAX_PARALLEL_TASKS=4
CPU_PINNING=false
NUMA_ENABLED=false
```
A Multi-Level Hierarchy
Vera’s architecture distinguishes between LLMs and Agents, each operating at multiple levels of complexity and capability to handle diverse tasks efficiently.
- Encoders: Extremely light models specialized to encode text. They parse all data sent to the vector store.
LLMs are the foundational language engines performing natural language understanding and generation. Vera uses several LLMs, each specialized by size, speed, and reasoning ability:
- Fast LLMs: Smaller, generalized text models, optimized for quick, straightforward responses.
- Intermediate LLMs: Larger generalized text models that balance speed and reasoning capacity.
- Deep LLMs: Large, resource-intensive text models suited for complex reasoning and extended dialogues.
- Specialized Reasoning LLMs: Models fine-tuned or architected specifically for heavy logical textual processing and multi-step deduction.
Each LLM level provides different trade-offs between speed, resource use, and depth of reasoning. Models can be upgraded in place: when a new model is released it is plug-and-play, so to speak, and memories carry over as if nothing changed.
- Lower-level LLMs handle quick, direct responses and routine tasks.
- Higher-level LLMs monitor overall goals, manage focus, and coordinate lower-level LLMs' activities.
- LLMs at different levels are selected dynamically depending on task complexity and required depth of reasoning (a routing sketch follows below).
This multi-level, hierarchical approach allows Vera to balance responsiveness with deep cognitive abilities, making it a flexible and powerful autonomous AI system.
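To illustrate what this dynamic selection could look like, here is a minimal sketch of a tier router; the thresholds, the `complexity_score` heuristic, and the mapping to model names are illustrative assumptions, not Vera's actual routing logic (the model names themselves come from the configuration section above):

```python
# Illustrative tier router: score a query, then pick the cheapest tier
# whose threshold covers it. Thresholds and heuristic are assumptions.
TIERS = [
    (0.3, "mistral:7b"),   # fast: quick, direct responses
    (0.7, "gemma2"),       # intermediate: balanced reasoning
    (1.0, "gpt-oss:20b"),  # deep: complex, multi-step reasoning
]

def complexity_score(query: str) -> float:
    """Crude proxy: longer, multi-clause queries score higher."""
    clauses = query.count(",") + query.count(" and ") + 1
    return min(1.0, len(query) / 500 + 0.1 * clauses)

def select_model(query: str) -> str:
    score = complexity_score(query)
    for threshold, model in TIERS:
        if score <= threshold:
            return model
    return TIERS[-1][1]

print(select_model("What time is it?"))  # routes to the fast tier
```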
Agents are LLM instances configured with augmented capabilities, including memory management, tool integration, task triage, and autonomous goal setting. Vera’s agents also exist at multiple levels:
- Triage Agents: Lightweight agents responsible for prioritizing tasks and delegating work among other agents or tools.
- Tool Agents: Lightweight agents using fast LLMs to handle immediate simple tool invocations.
- Strategic Agents: Deep-level agents running large LLMs tasked with long-term planning, proactive reflection, and orchestrating complex tool chains.
- Specialized Agents: Agents with domain-specific expertise or enhanced reasoning modules, capable of focused tasks like code generation, calendar management, or data analysis.
These LLMs & Agents can communicate via shared memory and coordinate through a dynamic focus-prioritization system.
- Micro Models: Tiny models, specialized to complete one task or read a particular dataset. Can be built and trained on a case-by-case basis. Capable of massive parallel reasoning.
Model Overlays: Allows you to overlay additional training onto existing models.
For users running models locally, Vera is built for seamless integration with Ollama, which is required as a system dependency.
- Plug-and-Play Architecture: Vera features a plug-and-play design, meaning that models can be upgraded in-place. When a new LLM is released (such as a newer version of `gemma2` or `gpt-oss`), it can be swapped in without system downtime.
- Memory Continuity: Crucially, when an underlying LLM is swapped or upgraded, the system's deep knowledge base remains intact. The framework ensures that memories will carry over as if nothing changed. This is managed by Vera's highly structured, multi-layered memory system (Layers 1-4) that uses Neo4j for contextual relationships and ChromaDB as a vector database for full text content.
Vera provides comprehensive compatibility with external LLM services, enabling users to employ their existing API keys or favourite third-party models, such as OpenAI's ChatGPT or Anthropic's Claude.
This capability is facilitated by the Integration API Shim (IAS). The IAS is a compatibility layer and API endpoint that serves two critical functions for external integration:
- Working In-Place of External APIs: The IAS allows Vera to mimic other LLM APIs. This means Vera can effectively take the place of services like ChatGPT or Claude in existing workflows, routing requests through the Vera framework while providing the expected API response format.
- Handover and Chat History Retention: The IAS also permits these external LLM APIs to interface with Vera's systems. This allows you to hand over all LLM tasks to an external service while ensuring that the external model retains access to the vast, structured knowledge base and persistent chat history managed by the Vera framework (Knowledge Graph Memory and vector stores). This prevents the loss of complex, cross-sessional context, enabling deep reasoning regardless of the model or API currently in use.
All top-level components are designed to run standalone or together as a complete framework.
CEO - Central Executive Orchestrator
#in-development #poc-working
Responsible for routing requests to the correct agent and creating, destroying & allocating system resources via workers.
PBC - Proactive Background Cognition
#in-development #poc-working
Responsible for coordinating long-term goals and short-term focus, and delivering actionables during downtime.
TCE - Toolchain Engine
#in-development #poc-working
Breaks down complex tasks into achievable steps then executes the plan using tools, built in or called via an MCP server.
CKG - Composite Knowledge Graph
#in-development #poc-working
Stores memories and their relationships in vector and graph stores.
Systematically enriches information stored within the graph
BFT - Babelfish Translator
#production #poc-working
A protocol-agnostic communication tool with encryption. Facilitates arbitrary webserver creation, ad-hoc network protocol comms, and VPN construction.
IAS - Integration API Shim
#production #poc-working
Allows Vera to mimic other LLM APIs. Also allows those same APIs to interface with Vera's systems.
SME - Self Modification Engine
#in-development
A full CI/CD pipeline for Vera to review and edit its own code.
PF - Perceptron Forge
#in-development
Allows Vera to build new models from the fundamental building blocks of all AI models - perceptrons.
EP - Edit Pipeline
#in-development
Version control for edits the AI makes to files, settings, etc.
CUI - Chat UI
A web UI to chat with the triage agent, with full-duplex speech synthesis and chat logs.
SUI - Schedule UI
A web UI for the scheduling agent. View and manage your calendar and Vera's calendar, and chat with the scheduling agent.
OUI - Orchestrator UI
A web UI for management of the orchestrator
TCEUI - ToolChain Engine UI
A standalone UI for managing the ToolChain Engine
MX - Memory Explorer
A web UI enabling broad or targeted traversal of the knowledge graph. The graph contains more than just memories; it's a network of relationships. For example, if Vera has interacted with a network, a map of that network will be navigable in the explorer. If Vera has navigated a website, it and all its resources will be mapped into the graph. This allows you to navigate these systems in a visually appealing and data-rich form.
GUI - Graph UI
A web component for monitoring graph events of any scale.
Task scheduler & worker orchestrator
Purpose: Heart of Vera. Collects performance data, queues user input, and allocates or creates resources locally or in remote worker pools.
Capabilities:
- Identifies tasks and steps that can execute in parallel
- Schedules execution when resources are available
- Manages local and remote worker pools
- Queues requests when resources are exhausted
- Provides real-time performance metrics and resource utilization
Example workflow:
1. Query received: "Scan network and analyze results"
2. Query triaged to Toolchain Engine
3. TCE requests: network scanner + analysis LLM
4. CEO: network scanner available → allocate
5. CEO: all analysis LLMs busy → queue request
6. Network scan runs
7. Analysis LLM becomes free
8. CEO dequeues and provides resources
9. TCE receives scan results for analysis
Configuration:
```python
ceo = CEOOrchestrator(
    max_parallel_tasks=4,
    max_queue_depth=20,
    resource_polling_interval=0.5  # seconds
)
```

Proactive Background Cognition Documentation
Vera maintains a Proactive Focus Manager that continuously evaluates system priorities, context, and pending goals. During idle moments, it generates proactive thoughts—such as reminders, hypotheses, or plans—that enhance its understanding and readiness for future interactions.
Purpose: Autonomous background thinking engine generating actionable tasks during idle moments.
Capabilities:
- Continuously monitors project context and pending goals
- Generates proactive thoughts (reminders, hypotheses, plans)
- Validates proposed actions using fast LLM
- Executes validated actions through toolchain
- Maintains focus board tracking progress, ideas, actions, issues
- Non-blocking scheduling with configurable intervals
- Detect inconsistencies or gaps in knowledge
- Anticipate user needs
- Prepare for complex multi-step operations
- Improve self-awareness and performance over time
Features:
- Context-aware task generation: Pulls context from multiple providers (conversation history, focus board, custom sources)
- LLM-driven reasoning: Uses deep LLM to generate actionable next steps
- Action validation: Fast LLM validates executability before acting
- Distributed execution: Integrates with local pools, remote HTTP workers, and Proxmox nodes
- Focus tracking: Maintains board showing progress, next steps, ideas, actions, issues
- Non-blocking scheduling: Periodic autonomous ticks with configurable intervals
It is designed to integrate seamlessly with local, remote, and Proxmox-based worker nodes, providing a distributed, scalable, and high-throughput execution environment.
Configuration:
```python
pbc = ProactiveBackgroundCognition(
    tick_interval=60,  # Check every 60 seconds
    context_providers=[ConversationProvider(), FocusBoardProvider()],
    max_parallel_thoughts=3,
    action_validation_threshold=0.8
)
```

Memory Documentation ⚠
Memory Schema
The Vera agent is powered by a sophisticated, multi-layered memory system known as the Composite Knowledge Graph. Designed to mirror human cognition, this architecture separates volatile context from persistent knowledge, enabling both coherent real-time dialogue and deep, relational reasoning over a vast, self-curated knowledge base. The system is built on a core principle: ChromaDB vectorstores hold the raw textual content, the Neo4j graph maps the relationships and context between them, while the Postgres database stores an immutable ledger of changes over time, system logs, and telemetry records.
Vera's memory is structured into four distinct storage layers plus an external fifth. Excluding Layer 5, each layer contains or is derived from data in the previous layer, and each serves a specific purpose in the cognitive process:
- Layer 1: Short-Term Buffer - The agent's immediate conversational context.
- Layer 2: Working Memory - Vera's private scratchpad for a single task, session, or memory. Gives Vera a place to think, make notes, and plan.
- Layer 3: Long-Term Knowledge - A persistent snapshot of Vera's entire mind: an interconnected library of interactions, facts, and insights. This is how Vera can quickly derive insights from large datasets.
- Layer 4: Temporal Archive - A complete, immutable record of activity logs, metrics, codebase changes, graph changes. Allowing you to 'scroll' back through the entire history of Vera.
- Layer 5: External Knowledge Bases - Dynamic networked data stores. Web documentation, APIs, Git repos. Allows Vera to extend its graph beyond its own boundaries.
A key advanced capability, the Memory Buffer, can dynamically bridge Layers to enable unified, cross-sessional, highly enriched reasoning.
- Purpose: To maintain the immediate context of the active conversation, ensuring smooth and coherent multi-turn dialogue. This is a volatile, rolling window of recent events. It will contain system prompts, user input, the last n chat history entries, vector store matches & NLP data.
- Implementation: A simple in-memory buffer (e.g., a list of the last 10-20 message exchanges). This data is transient and is not persisted to any database. A minimal sketch follows this list.
- Content: Raw chat history between the user and the agent.
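As a minimal sketch of what such a rolling buffer can look like (class and method names are illustrative, not Vera's actual implementation):

```python
from collections import deque

class ShortTermBuffer:
    """Volatile rolling window of recent exchanges (Layer 1).
    Nothing is persisted; old entries simply fall off the end."""

    def __init__(self, max_exchanges: int = 20):
        self.window = deque(maxlen=max_exchanges)

    def add(self, role: str, content: str) -> None:
        self.window.append({"role": role, "content": content})

    def context(self) -> list:
        # Merged upstream with system prompts, vector-store matches,
        # and NLP data when assembling the final prompt.
        return list(self.window)

buf = ShortTermBuffer(max_exchanges=10)
buf.add("user", "What is the capital of France?")
buf.add("assistant", "Paris.")
print(buf.context())
```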
- Purpose: To provide an isolated "scratchpad" for the agent's internal monologue, observations, and findings during a specific task, problem, session, or recollection. This allows for exploratory thinking.
- Implementation:
  - Neo4j (Structure): A `Session` node is created and linked to relevant entities in the main graph (e.g., `(Session)-[:FOCUSED_ON]->(ProjectX)`).
  - ChromaDB (Content): A dedicated Chroma collection (`session_<id>`) is created to store the full text of the agent's thoughts, notes, and relevant snippets generated during this session.
- Content: Agent's "thoughts," observed facts, code snippets, and summarizations. All data is scoped to the session's task.
- Purpose: To serve as the agent's persistent, semantically searchable library of validated knowledge. This is the core of its "intelligence," built over time through a careful process of promotion and curation.
- Implementation: Layers 1 and 2 are continually promoted into Layer 3 before session end.
  - Vector Database - ChromaDB (Content & Semantic Search): The primary `long_term_docs` collection stores the full text of all important information: documents, code examples, notes, and promoted "thoughts." Each entry contains metadata that points back to the Neo4j graph.
  - Knowledge Graph - Neo4j (Context & Relationships): The graph stores all memories, entities & insights (e.g., `Project`, `Document`, `Person`, `Feature`, `Memory`) and the rich, typed relationships between them (e.g., `USES`, `AUTHORED_BY`, `CONTAINS`). It does not store large text bodies, only pointers to them in Chroma. See Memory Schema for more information on types.
- How It Works (Basic Retrieval), sketched in code after this list:
  1. A semantic query is performed on the `long_term_docs` Chroma collection.
  2. The search returns the most relevant text passages and their metadata, including a `neo4j_id`.
  3. This ID is used to fetch the corresponding node and its entire network of relationships from Neo4j.
  4. The agent receives both the retrieved text and its full relational context, enabling deep, multi-hop reasoning.
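A minimal sketch of this retrieval flow using the `chromadb` and `neo4j` Python clients; the `long_term_docs` collection and `neo4j_id` metadata key follow the description above, while the connection details, graph schema, and ID convention are assumptions:

```python
import chromadb
from neo4j import GraphDatabase

chroma = chromadb.PersistentClient(path="./vera_agent_memory")
docs = chroma.get_or_create_collection("long_term_docs")
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "your_password"))

def retrieve_with_context(query: str, k: int = 3) -> list:
    # Steps 1-2: semantic search returns passages plus metadata (incl. neo4j_id)
    hits = docs.query(query_texts=[query], n_results=k)
    results = []
    with driver.session() as session:
        for text, meta in zip(hits["documents"][0], hits["metadatas"][0]):
            # Step 3: use the stored neo4j_id to pull the node's relationships
            record = session.run(
                "MATCH (n) WHERE elementId(n) = $id "
                "OPTIONAL MATCH (n)-[r]-(m) "
                "RETURN n, collect([type(r), m]) AS rels",
                id=meta["neo4j_id"],
            ).single()
            # Step 4: the agent gets both the text and its graph neighbourhood
            results.append({"text": text, "graph": record})
    return results
```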
- Purpose: To provide an immutable, historical record of all agent interactions for auditing, debugging, and future model training. It also allows the system to 'scroll back in time' for the entire graph, just a particular subgraph, section or node.
- Implementation: Postgres captures and archives all data and changes flowing through the system: sessions, queries, memory creations, links, unlinks, deletions, promotion events, and more. An optional JSONL stream can act as a backup log. A minimal ledger sketch follows below.
- Content: Raw, timestamped logs of all system activity.
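A minimal sketch of an append-only ledger write with `psycopg2`; the table name and columns are our assumptions (rows are only ever inserted, preserving immutability):

```python
import json
import psycopg2

conn = psycopg2.connect("dbname=vera user=vera password=your_password host=localhost")

def archive_event(kind: str, payload: dict) -> None:
    """Append-only: archive rows are inserted, never updated or deleted."""
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO archive_events (ts, kind, payload) VALUES (now(), %s, %s)",
            (kind, json.dumps(payload)),
        )

archive_event("memory_promotion", {"node": "Insight-42", "session": "session-123"})
```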
- Purpose: External source of truth
- Implementation: HTTP / API calls to external services, via requests, to resolve data from archives like Wikipedia, DNS records, OHLCV data, OWASP, etc.
- Content: Typically JSON blobs
Promotion is the key mechanism for learning. It transforms ephemeral session data into permanent, connected knowledge.
- Identification: At the moment all content is promoted to Layer 3; selective promotion is on the roadmap.
- Curation: The agent creates a new `Memory`, `Entity` or `Insight` node in the Neo4j graph.
- Linking: This new node is parsed with NLP & linked via relationships to all relevant entities (e.g., `(Insight)-[:ABOUT]->(Project)`, `(Insight)-[:DERIVED_FROM]->(Document)`).
- Storage: The full text of the "thought" is inserted into the session's Chroma collection. The metadata for this entry includes the ID of the new Neo4j node (`neo4j_id: <memory_node_id>`), permanently binding the text to its contextual graph. A code sketch of this flow follows below.
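Putting the four steps together, here is a minimal sketch of a promotion, reusing the `driver` and `chroma` clients from the retrieval sketch above; the node labels, relationship types, and collection naming follow the description, but the specifics are assumptions:

```python
def promote_thought(session_id: str, text: str, project: str) -> None:
    # Curation + Linking: create the Insight node and attach it to context
    with driver.session() as session:
        node_id = session.run(
            "MERGE (p:Project {name: $project}) "
            "CREATE (i:Insight {created: datetime()})-[:ABOUT]->(p) "
            "RETURN elementId(i) AS id",
            project=project,
        ).single()["id"]
    # Storage: full text goes to Chroma, bound to the graph via neo4j_id
    collection = chroma.get_or_create_collection(f"session_{session_id}")
    collection.add(
        documents=[text],
        ids=[f"{session_id}-{node_id}"],
        metadatas=[{"neo4j_id": node_id}],
    )
```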
- Conversation happens -> Stored in Layer 1 (Short-Term Buffer).
- Agent thinks/acts -> Thoughts stored in Layer 2 (Working Memory Chroma + Graph links).
- Valuable insight is made -> Promoted to Layer 3 (LTM Chroma + Graph context).
- Cross-sessional query asked -> Macro Buffer orchestrates a search across LTM and relevant Session stores via Graph-Accelerated Search.
- Everything is recorded -> Logged to Layer 4 (Archive).
This architecture ensures Vera can fluidly operate in the moment while continuously building a structured, retrievable, and intelligent knowledge base, capable of learning from its entire lived experience.
Vera employs a sophisticated three-tier memory buffer system that operates at different scales of retrieval and reasoning, enabling seamless cognitive processing across temporal and conceptual dimensions.
Think of them as three zoom lenses focusing memory retrieval and processing to the required scale
#in-development
The Micro Buffer is always active and serves as the real-time cognitive workspace—managing the immediate context and attention span during active reasoning and task execution.
- Purpose: To maintain optimal cognitive load by dynamically managing the active working set of information. It filters, prioritizes, and sequences relevant memories for the current task moment-by-moment.
- How it Works:
  - Attention Scoring: Continuously scores available memories based on recency, relevance to current task, and relationship strength
  - Cognitive Load Management: Limits active context to 7±2 chunks to prevent overload (Miller's Law implementation)
  - Real-time Pruning: Drops low-relevance information and promotes high-value context as tasks evolve
  - Focus Tracking: Maintains attention on the most salient entities and relationships during complex reasoning
  - NLP Processing: Extracts key information and meaning from text and stores them as relationships in the knowledge graph, i.e. triplets, URLs, filepaths, references, and entities like person or technology. It can also parse code into relational trees.
- Technical Implementation:
```cypher
// Micro Buffer maintains focus stack during reasoning
MATCH (current:Task {id: $task_id})
MATCH (current)-[:HAS_FOCUS]->(focus_entity)
WITH focus_entity
MATCH (focus_entity)-[r*1..2]-(related)
WHERE r.relevance_score > 0.7
RETURN related
ORDER BY r.relevance_score DESC
LIMIT 15  // Working memory constraint
```

- Example Usage: When debugging code, the Micro Buffer automatically maintains focus on the current function, related variables, and recent stack traces while filtering out unrelated project documentation.
#in-development
The Macro Buffer serves as the connective tissue between cognitive sessions—enabling holistic reasoning across time and context boundaries.
- Purpose: To break down the isolation between sessions, allowing Vera to connect ideas, hypotheses, and information that were originally recorded in different contexts. This is the foundation for associative reasoning and holistic problem-solving.
- How it Works:
  - Graph-Accelerated Search: Uses Neo4j to efficiently find relevant sessions and entities across time
  - Multi-Collection Vector Search: Performs targeted semantic search across relevant session collections
  - Temporal Pattern Recognition: Identifies sequences and evolution of ideas across sessions
  - Context Bridging: Creates conceptual bridges between seemingly disconnected sessions
- Technical Implementation:
```cypher
// Macro Buffer: Cross-sessional associative retrieval
MATCH (s:Session)-[:HAS_TOPIC|FOCUSED_ON]->(topic)
WHERE topic.name =~ "(?i).*authentication.*"
WITH collect(DISTINCT s.session_id) as relevant_sessions
MATCH (idea:Concept)-[r:EVOLVED_FROM|RELATED_TO*1..3]-(connected)
WHERE idea.session_id IN relevant_sessions
RETURN idea, connected, r
ORDER BY r.temporal_weight DESC
```

- Benefit: It allows Vera to answer complex, cross-sessional questions like, "What were all the challenges we faced when integrating service X?" by pulling together notes from initial research, debugging logs, and the final summary document.
#in-development
The Meta Buffer operates as the executive control system—managing higher-order reasoning about reasoning itself, strategic planning, and self-modeling.
- Purpose: To enable Vera to reason about its own cognitive processes, identify knowledge gaps, and strategically plan learning and problem-solving approaches.
- How it Works:
  - Cognitive Pattern Recognition: Identifies recurring reasoning patterns, successful strategies, and common failure modes
  - Knowledge Gap Analysis: Detects missing information, contradictory knowledge, and underspecified concepts
  - Strategic Planning: Generates learning agendas, research plans, and problem-solving roadmaps
  - Self-Modeling: Maintains and updates Vera's understanding of its own capabilities and limitations
- Technical Implementation:
```cypher
// Meta Buffer: Strategic reasoning and gap analysis
MATCH (capability:Capability {name: $current_task})
MATCH (capability)-[r:REQUIRES|BENEFITS_FROM]->(required_knowledge)
OPTIONAL MATCH (vera:SelfModel)-[has:HAS_KNOWLEDGE]->(required_knowledge)
WITH required_knowledge,
     CASE WHEN has IS NULL THEN 1 ELSE 0 END as knowledge_gap,
     r.importance as importance
WHERE knowledge_gap = 1
RETURN required_knowledge.name as gap,
       importance,
       "Learning priority: " + toString(importance) as recommendation
ORDER BY importance DESC
```

- Example Usage: When faced with a novel problem, the Meta Buffer might identify that Vera lacks understanding of quantum computing concepts, then generate and execute a learning plan that includes reading research papers, running simulations, and seeking expert knowledge.
The three buffers work in concert to create a balanced & comprehensive cognitive experience:
```
Micro Buffer (Tactical)    → Manages immediate working context
        ↑ ↓
Macro Buffer (Operational) → Connects cross-sessional knowledge
        ↑ ↓
Meta Buffer (Strategic)    → Guides long-term learning and reasoning
```
Real-world Example: Complex Problem-Solving
- Meta Buffer identifies Vera needs to learn about blockchain for a new project
- Macro Buffer retrieves all past sessions mentioning cryptography, distributed systems, and related concepts
- Micro Buffer manages the immediate context while Vera reads documentation, runs code examples, and tests understanding
- Meta Buffer updates Vera's knowledge base with new blockchain capabilities
- Macro Buffer connects this new knowledge to existing financial and security concepts
- Micro Buffer applies the integrated knowledge to solve the original problem
This hierarchical buffer system enables Vera to operate simultaneously at tactical, operational, and strategic levels—maintaining focus while building comprehensive understanding and planning for future challenges.
This creates a coherent hierarchy where:
- Micro = Immediate working memory and attention
- Macro = Cross-sessional associative memory
- Meta = Strategic reasoning and self-modeling
Each buffer operates at a different temporal and conceptual scale while working together to enable sophisticated, multi-layered cognitive processing.
Discovery - Promotion - Recall - Enrichment - Continuous Evaluation - Decay - Archiving
Planned feature
The Cartographer of Consciousness: Mapping the Labyrinth of Thought
Memory Explorer Documentation
Knowledge Graph Documentation
Knowledge Bases Documentation
Start with:

```bash
python3 memory_explorer.py
```

Web UI for traversing the knowledge graph:
The Memory Explorer (MX) is an operational web UI enabling broad or targeted traversal of the knowledge graph. It is a powerful administrative and analytical tool that serves as the observatory for Vera's cognitive landscape. The graph contains far more than internal memories; it maps the living topology of networks of relationships extracted from all data Vera processes—including networks, databases, codebases, and external websites. By visualizing this deep structure, the Explorer provides users with unprecedented serviceability and observability over the ingested data, enabling the direct derivation of complex insights and auditing of Vera's operational history.
Purpose: To transform complex, multi-layered memory structures into interactive, navigable knowledge graphs. It bridges the abstract relationships within Vera's mind with tangible visual representations, making the architecture of intelligence both accessible and explorable.
Capabilities (Serviceability and Administration):
- Enabling LLM Questioning and Deep Reasoning: The Explorer visually maps the full relational context retrieved from the Neo4j graph, allowing users to understand and audit the scope of knowledge available for an LLM query. This ensures the agent is enabled to perform deep, multi-hop reasoning.
- Insight Derivation: Facilitates both macro-scale pattern recognition and micro-scale relationship analysis. Users can trace idea genealogies across sessions and identify emerging knowledge clusters.
- Cognitive Observability and Auditing: Reveals the living topology of memory, exposing how concepts connect, how knowledge evolves over time, and how different memory layers interact. It also provides a window into Version-Aware Telemetry, allowing monitoring of performance metrics (e.g., vector search latency, memory usage) tagged with code versions.
- Cross-Sessional Exploration: Supports the retrieval of relevant knowledge and historical sessions via Graph-Accelerated Search, effectively breaking down isolation between contexts for comprehensive associative recall.
Features (Interactive Mapping of Ingested Data):
- Website and API Mapping: Visualizing Layer 5 External Knowledge Bases, which include dynamic networked data stores like web documentation, APIs, and Git repos. This allows users to navigate the resources and relationships Vera has extracted from external websites and APIs.
- Code and Data Structure Graphing: Rendering relational trees parsed from code, where the Micro Buffer's NLP processing extracts key relationships (like triplets, URLs, filepaths, references, entities) and stores them in the graph.
- Database and Schema Exploration: Visualizing relationships between entities and insights stored within the Neo4j Knowledge Graph. The graph stores entities (`Project`, `Document`, `Person`, `Feature`) and the rich, typed relationships between them (e.g., `USES`, `AUTHORED_BY`, `CONTAINS`).
- Historical Timeline Review: Displaying the contents of the Layer 4 Temporal Archive, which is an immutable record of activity logs, metrics, codebase changes, and graph changes. This allows users to 'scroll back in time' through the entire history of Vera.
- Self-Modification Traceability: Visualizing the rationale and impact of autonomous changes, including change records, test results, and performance impact managed by the Self Modification Engine (SME).
Automated Multi-Step Tool Orchestration
Warning
Vera has unrestricted access to Bash & Python execution out of the box
Please be very careful with what you ask for; there is nothing stopping it from running `rm -rf /`. Alternatively, disable these two tools.
ToolChain Engine Documentation
The ToolChain orchestrates the planning and execution of complex workflows by chaining together multiple tools available to the agent. It leverages a deep language model (LLM) to dynamically generate, execute, and verify a sequence of tool calls tailored to solving a user query.
This forms the core of an intelligent, multi-tool orchestration framework that empowers the agent to decompose complex queries into manageable actions, execute them with error handling, and iteratively improve results through self-reflection.
- Planning: Generates a structured plan in JSON format, specifying which tools to call and what inputs to provide, based on the query and historical context.
- Execution: Runs each tool in sequence, supports referencing outputs from previous steps (`{prev}`, `{step_n}`), and handles errors with automatic replanning.
- Memory Integration: Saves intermediate outputs and execution context to the agent's memory for continuity and accountability.
- Result Validation: Uses the LLM to verify if the final output meets the original goal, triggering replanning if necessary.
- Reporting: Summarizes all executed tool chains, providing insight into past queries, plans, and outcomes.
| Method | Description |
|---|---|
| `__init__(agent, tools)` | Initializes the planner with a reference to the agent and its toolset. Loads chat history for context. |
| `plan_tool_chain(query, history_context="")` | Generates a JSON-formatted plan of tool calls for the given query, optionally incorporating prior step outputs as context. |
| `execute_tool_chain(query)` | Executes the planned tool chain step-by-step, resolves references to previous outputs, manages errors, and ensures the goal is met via iterative replanning if needed. |
| `save_to_memory(user_msg, ai_msg="")` | Stores interactions and outputs to the agent's memory buffer for context continuity. |
| `report_history()` | Produces a summarization report of all tool chains executed so far, highlighting queries, plans, results, and patterns. |
- Planning Phase: It decides the best style of plan for the problem, then constructs a prompt describing available tools and the user query, requesting the LLM to generate a JSON array that outlines the sequence of tool calls and their inputs.
- Execution Phase: Each tool is invoked in order. Inputs referencing outputs from prior steps (e.g., `{step_1}`, `{prev}`) are resolved to the actual results. Errors in execution trigger automatic recovery plans via replanning.
- Validation & Retry: After all steps, the planner prompts the LLM to review whether the final output meets the query's goal. If not, the planner automatically retries with a revised plan.
- Memory & Reporting: All intermediate results and plans are saved to memory for transparency and to aid future planning. The report function provides a concise summary of past activity for audit or review.
- Dynamic, Context-Aware Planning: Selects the type of plan & plans tool usage tailored to the problem, reusing historical outputs intelligently.
- Error Resilience: Automatically detects and recovers from tool failures or incomplete results.
- Extensible & Modular: Works with any tool exposed by the agent, provided they follow a callable interface.
- Traceability: Detailed logging and memory save steps ensure all decisions and outputs are recorded.
Purpose: Breaks down complex tasks into achievable steps and executes plans using integrated tools.
Capabilities:
- Generates structured JSON execution plans
- Supports multiple planning strategies: Batch, Step, and Hybrid
- Multiple execution strategies: Sequential, Parallel, Speculative
- Error handling with automatic replanning
- Result validation against original goal
- Full execution history logging
Planning Strategies:
Batch Planning: Generate entire plan upfront

```json
[
  { "tool": "WebSearch", "input": "latest AI trends 2024" },
  { "tool": "WebSearch", "input": "generative AI applications" },
  { "tool": "SummarizerLLM", "input": "{step_1}\n{step_2}" }
]
```

Step Planning: Generate next step based on prior results

```json
[
  { "tool": "WebSearch", "input": "authenticate with OAuth2" }
]
// After step 1 completes, generate the next step
```

Hybrid Planning: Mix of upfront and adaptive planning
Execution Strategies:

- Sequential: Execute steps one at a time (safe, traceable)
- Parallel: Execute independent steps concurrently (faster); a minimal sketch follows below
- Speculative: Run multiple possible next steps, then prune based on validation (advanced)
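As a minimal sketch of the parallel strategy (the step format follows the JSON plans above; treating "no placeholder references" as independence is a simplification of ours, not the planner's actual dependency analysis):

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel_steps(steps, tools):
    """Run plan steps with no inter-step references concurrently.
    Steps whose input mentions {prev} or {step_n} still need sequencing."""
    independent = [s for s in steps if "{" not in s["input"]]
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(tools[s["tool"]], s["input"]) for s in independent]
        return [f.result() for f in futures]

# e.g. both WebSearch steps of the batch plan above run concurrently,
# and the SummarizerLLM step runs afterwards, sequentially.
```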
Example usage:
```python
planner = ToolChainPlanner(agent, agent.tools)

# Execute a multi-tool workflow
query = "Retrieve latest weather for New York and generate a report"
final_output = planner.execute_tool_chain(query)
print("Result:", final_output)

# Generate execution history report
history = planner.report_history()
print(history)
```

Plan format expected from LLM:
```json
[
  { "tool": "SearchAPI", "input": "latest weather New York" },
  { "tool": "SummarizerLLM", "input": "{step_1}" }
]
```

Placeholders like `{step_1}` or `{prev}` are replaced with actual outputs during execution.
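A minimal sketch of how such placeholder resolution can work (the function is illustrative, not the planner's actual code):

```python
import re

def resolve_placeholders(template: str, outputs: list) -> str:
    """Replace {prev} and {step_n} with actual step outputs (1-indexed)."""
    if outputs:
        template = template.replace("{prev}", outputs[-1])
    return re.sub(
        r"\{step_(\d+)\}",
        lambda m: outputs[int(m.group(1)) - 1],
        template,
    )

outputs = ["sunny, 21C", "light traffic"]
print(resolve_placeholders("Summarize: {step_1} and {prev}", outputs))
# -> Summarize: sunny, 21C and light traffic
```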
```python
# Assume you have an initialized agent with tools and a deep LLM model

# Create the planner instance
planner = ToolChainPlanner(agent, agent.tools)

# Simple query example: plan and execute a multi-tool workflow
query = "Retrieve the latest weather for New York and generate a summary report."
final_output = planner.execute_tool_chain(query)
print("Final Output:", final_output)

# Generate a report summarizing all past toolchain executions
history_report = planner.report_history()
print("Execution History Report:\n", history_report)
```

Internal Tools
Local Tools
MCP Tools
- Agent & Tools Setup:
  `ToolChainPlanner` expects an `agent` object that exposes:
  - `deep_llm`: a language model instance with an `invoke(prompt: str) -> str` method for prompt completion.
  - `tools`: a list of tool objects, each having a `name` attribute and a callable interface (e.g., `run()`, `func()`, or `__call__`).
  - `buffer_memory`: an object that manages short-term chat history, providing context for planning and execution.
  - `save_to_memory(user_msg, ai_msg)`: method to record interaction steps and outputs.
- Tool Interface:
  Tools can be any callable entity that takes a single string input and returns a string output. This abstraction allows mixing LLM-based tools, APIs, or custom functions (see the sketch after this list).
- Plan Format:
  The planner expects the LLM to output a pure JSON list of objects like:
  ```json
  [
    { "tool": "SearchAPI", "input": "latest weather New York" },
    { "tool": "SummarizerLLM", "input": "{step_1}" }
  ]
  ```
  The planner replaces placeholders like `{step_1}` or `{prev}` with actual outputs during execution.
- Error Handling:
  If a tool execution fails or an output is missing, the planner automatically triggers a replanning phase to recover and retry.
- Extensibility:
  To add new tools, simply ensure they conform to the callable interface and add them to the agent's `tools` list. The planner will dynamically list them and can invoke them in plans.
- Logging & Debugging:
  The planner prints detailed step-by-step execution logs, useful for debugging the tool chain behavior and inspecting intermediate results.
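A minimal example of a tool conforming to this interface; the `name` attribute and the string-in/string-out contract come from the notes above, while the tool body is a stub of our own:

```python
import requests

class WeatherTool:
    """String-in, string-out callable, as the planner expects."""
    name = "WeatherAPI"

    def __call__(self, query: str) -> str:
        # Stub: a real tool would derive the location from the query
        resp = requests.get("https://wttr.in/New+York", params={"format": "3"})
        return resp.text

# Registering it makes it available to future plans:
# agent.tools.append(WeatherTool())
```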
This comprehensive toolset architecture enables Vera to break down high-level goals into concrete, manageable steps executed with precision across multiple domains, making it a powerful assistant in diverse environments.
Tools can be chained together dynamically by Vera’s Tool Chain Planner, which uses deep reasoning to break down complex queries into executable sequences.
A compatibility layer and API endpoint for Vera. Allows Vera to take the place of other LLM APIs like OpenAI's ChatGPT or Anthropic's Claude. It also allows these APIs to interface with the Vera framework.
Purpose: Compatibility layer allowing Vera to mimic other LLM APIs (OpenAI, Anthropic) and allowing those APIs to interface with Vera systems.
Capabilities:
- Vera responds as if it were ChatGPT, Claude, or other LLM APIs
- Drop-in replacement for OpenAI's Chat Completions API
- Stream and batch inference
- Token counting
- Embedding generation
Use case: Use existing OpenAI client libraries with Vera running locally:
```python
# This code thinks it's talking to OpenAI, but it's using local Vera
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000", api_key="vera")
response = client.chat.completions.create(
    model="vera-default",
    messages=[{"role": "user", "content": "Explain quantum entanglement"}]
)
print(response.choices[0].message.content)
```

The IAS means you can use Vera with any tool expecting an OpenAI-compatible API.
Babelfish is a universal communication toolkit for AI agents and distributed systems. It enables your agent to speak any digital protocol — from HTTP and WebSockets, to MQTT, SSH, IRC, LoRa, Matrix, Slack, and even experimental transports like WebRTC and QUIC/HTTP3.
At its core, Babelfish acts like a networking “translator”:
- Every protocol looks the same to the agent (`open → send → receive → close`); a sketch of this contract follows below.
- The agent can freely combine multiple carriers into hybrid tunnels (multi-modal VPNs).
- Protocols are grouped into layers, similar to a networking stack, for modularity and extensibility.
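To make the uniform lifecycle concrete, here is a minimal sketch of what a carrier interface could look like; this is our illustration of the `open → send → receive → close` contract, not Babelfish's actual API:

```python
from abc import ABC, abstractmethod

class Carrier(ABC):
    """Every protocol exposes the same four-step lifecycle."""

    @abstractmethod
    def open(self, endpoint: str) -> None: ...

    @abstractmethod
    def send(self, payload: bytes) -> None: ...

    @abstractmethod
    def receive(self, timeout: float = 5.0) -> bytes: ...

    @abstractmethod
    def close(self) -> None: ...

# A hybrid tunnel can then chain carriers (e.g. MQTT inside WebSockets),
# since the agent only ever sees this one interface.
```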
Autonomous Evolution Through Continuous Integration
Vera's self-modification capability represents a paradigm shift in AI architecture—enabling continuous, autonomous evolution of its own codebase through a sophisticated CI/CD pipeline that ensures reliability, traceability, and controlled innovation. This isn't mere code generation; it's a complete software development lifecycle managed by the AI itself.
```python
# Vera analyzes its own performance and identifies improvement opportunities
improvement_plan = vera.analyze_performance_gaps()
new_module = vera.generate_optimized_code(improvement_plan)

# Example: Vera identifies a bottleneck in memory retrieval
# Generates optimized vector search algorithm with proper error handling
```

- Pattern Recognition: Identifies inefficiencies, bugs, or missing features through continuous self-monitoring
- Context-Aware Generation: Creates code that integrates seamlessly with existing architecture and follows established patterns
- Multi-LLM Validation: Uses different LLM specializations for code generation, review, and optimization
1. Unit Test Generation → Auto-creates comprehensive test cases for new code
2. Integration Testing → Validates compatibility with existing modules
3. Performance Benchmarking → Ensures improvements meet efficiency targets
4. Safety & Security Scanning → Checks for vulnerabilities and ethical concerns
Automated Test Suite:
```python
class SelfModificationTestSuite:
    def test_backwards_compatibility(self):
        """Ensure new code doesn't break existing functionality"""
        assert existing_workflows_still_function()

    def test_performance_improvement(self):
        """Verify generated code meets performance targets"""
        assert new_algorithm.faster_than(previous_version)

    def test_memory_safety(self):
        """Check for memory leaks and resource management"""
        assert no_memory_leaks_detected()
```

Every autonomous code modification follows a structured version control process:
# Automated commit messages with context
git commit -m "feat(memory-optimizer): Vector search optimization v2.1.3
- Reduced latency by 42% through improved indexing
- Added fallback mechanisms for corrupted vector stores
- Maintains full backwards compatibility
- Generated by Vera-Agent #session-1756642265"

Version Tagging System:
- Workflow Versions: Every autonomous modification cycle receives a unique version tag
- Session Linking: Code changes reference the session and reasoning that prompted them
- Rollback Capability: Automatic snapshots enable instant reversion if issues detected
# Every modification is logged with full context
change_record = {
"version": "memory-optimizer-v2.1.3",
"timestamp": "2024-01-15T14:30:00Z",
"trigger": "performance_analysis_session_1756642265",
"rationale": "Vector search latency exceeding 200ms threshold",
"changes": {
"files_modified": ["/core/memory/vector_search.py"],
"tests_added": ["test_vector_search_optimization.py"],
"performance_impact": "42% latency reduction",
"compatibility": "full_backwards_compatible"
},
"validation_results": {
"unit_tests": "passed",
"integration_tests": "passed",
"performance_tests": "exceeded_targets",
"security_scan": "clean"
}
}

All self-modification activities are immutably logged to Layer 4 Archive with forensic-level detail:
Modification Records Include:
- Pre-modification State: Complete snapshot of codebase before changes
- Generation Context: LLM prompts, reasoning chains, and alternative approaches considered
- Validation Evidence: Test results, performance metrics, security scans
- Rollback Procedures: Automated scripts for reverting changes if needed
- Impact Analysis: Predicted and actual effects on system performance
{
"self_modification_event": {
"event_id": "sm-20240115-143000-1756642265",
"version_tag": "memory-optimizer-v2.1.3",
"initiating_session": "session-1756642265",
"trigger_condition": "vector_search_latency > 200ms",
"code_generation": {
"llm_used": "deep-reasoning-llm",
"prompt_context": "Optimize vector search while maintaining accuracy...",
"reasoning_chain": ["identified bottleneck", "researched algorithms", "selected approach"],
"alternatives_considered": 3
},
"testing_results": {
"unit_tests": {"passed": 15, "failed": 0},
"integration_tests": {"compatibility": "verified", "performance": "improved"},
"security_scan": {"vulnerabilities": 0, "warnings": 1}
},
"deployment_impact": {
"performance_change": "+42% speed",
"memory_usage": "-15%",
"accuracy_change": "+0% maintained"
}
}
}

Self-Modification Monitor 🛠️
──────────────────────────────
Current Version: memory-optimizer-v2.1.3
Active Modifications: 1
Tests Passing: 15/15
Performance Impact: +42% ✅
Rollback Ready: Yes
Recent Changes:
✅ 2024-01-15 14:30 - Vector search optimized
✅ 2024-01-15 11:20 - Memory caching improved
✅ 2024-01-14 16:45 - Error handling enhanced
Every workflow execution includes version metadata for precise performance tracking:
# All tool executions tagged with code versions
execution_context = {
"workflow_id": "weather-analysis-1756642300",
"code_versions": {
"memory_layer": "v3.2.1",
"vector_search": "v2.1.3", # Newly optimized version
"tool_orchestrator": "v1.5.2"
},
"performance_metrics": {
"vector_search_latency": "116ms", # Track improvement
"memory_usage": "45MB",
"accuracy_score": 0.94
}
}

- Automated Validation: Comprehensive test suites must pass
- Performance Gates: New code must meet or exceed performance thresholds
- Security Scanning: Static analysis and vulnerability detection
- Human-in-the-Loop (Optional): Critical changes can require human approval
- Gradual Rollout: Can deploy to staging environment first
def emergency_rollback(detected_issue):
    """Automated rollback if issues detected post-deployment"""
    if performance_degradation_detected() or errors_spiking():
        revert_to_previous_version()
        log_rollback_event(detected_issue)
        trigger_analysis_for_fix()

The self-modification system creates a virtuous cycle of improvement:
Performance Monitoring
→ Gap Identification
→ Code Generation
→ Validation Testing
→ Versioned Deployment
→ Impact Measurement
→ Further Optimization
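Expressed as a loop, the cycle could look like the following sketch. `analyze_performance_gaps` and `generate_optimized_code` appear in the examples above; `validate`, `deploy`, and `measure_impact` are hypothetical method names used for illustration:

```python
# Hedged sketch of the improvement cycle; validate/deploy/measure_impact
# are hypothetical names, not Vera's actual API.
while True:
    gaps = vera.analyze_performance_gaps()         # monitoring + gap identification
    if not gaps:
        break
    patch = vera.generate_optimized_code(gaps[0])  # code generation
    if vera.validate(patch):                       # validation testing
        version = vera.deploy(patch)               # versioned deployment
        vera.measure_impact(version)               # feeds the next iteration
```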
Continuous Evolution Metrics:
- Code Quality: Test coverage, complexity metrics, documentation completeness
- Performance Trends: Latency, accuracy, resource usage over versions
- Stability Indicators: Error rates, crash frequency, recovery times
- Adaptation Speed: Time from problem identification to deployed solution
This sophisticated self-modification framework transforms Vera from a static AI system into a continuously evolving intelligence that can adapt to new challenges, optimize its own performance, and maintain robust reliability through rigorous version control and comprehensive change tracking—all while providing complete observability into its evolutionary journey.
Purpose: Allow Vera to build new models from fundamental building blocks—enabling specialized micro-models for specific tasks.
Concept: Rather than only using pre-existing models, Vera can create custom models optimized for specific domains or tasks.
Planned capabilities:
- Automatic model architecture search
- Fine-tuning on domain-specific data
- Quantization and optimization
- Deployment as specialized agents
Purpose: Version control for all edits Vera makes to files, settings, and configurations.
Planned capabilities:
- Track all file modifications with timestamps and reasoning
- Enable rollback to previous file states
- Audit trail for compliance
- Collaborative merging if multiple agents edit same files
Vera's agent roster includes specialized sub-agents, each with defined responsibilities:
Triage Agent – Routes incoming tasks, prioritizes requests, delegates to appropriate agents or tools.
Planner Agent – Decomposes complex goals into actionable steps, generates execution plans.
Scheduler Agent – Manages task scheduling, handles dependencies, optimizes execution order.
Optimizer Agent – Refines workflows, improves performance, tunes parameters.
Evaluator Agent – Validates outputs, checks goal attainment, triggers refinement if needed.
Extractor Agent – Pulls structured information from unstructured text (documents, web pages, etc.).
Researcher Agent – Conducts information gathering, synthesizes findings, identifies trends.
Summarizer Agent – Condenses large texts into concise summaries at various detail levels.
Editor Agent – Refines writing, checks grammar, improves clarity and tone.
Model Trainer Agent – Fine-tunes models on domain-specific data (in development).
Model Builder Agent – Creates new model architectures from scratch (in development).
Security Analyzer Agent – Dynamic security analysis, penetration testing, vulnerability detection.
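For orientation, a request might flow through the roster roughly as follows; every method name in this sketch is hypothetical:

```python
# Hedged sketch of inter-agent flow; route/plan/execute_plan/evaluate/refine
# are hypothetical method names, not Vera's actual API.
task = "Research recent RISC-V adoption and write a two-paragraph summary"

agent = vera.triage_agent.route(task)          # Triage picks a specialist
plan = vera.planner_agent.plan(task)           # Planner decomposes the goal
result = agent.execute_plan(plan)              # e.g. Researcher + Summarizer steps
if not vera.evaluator_agent.evaluate(result):  # Evaluator checks goal attainment
    result = agent.execute_plan(vera.optimizer_agent.refine(plan))
```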
Ingestors work at the micro level, pulling data into Vera's memory systems:
Corpus Crawler – Maps corpuses (internet, local files, APIs) into memory structure. Analogous to "reading."
Network Ingestor – Scans networks, ingests topology and service information into memory.
Database Ingestor – Extracts schema and data from databases into Neo4j.
Context Ingestor – Gathers context from Layer 0 & 1 (short-term buffers) for enrichment.
Vera can be configured to trigger background thinking cycles during idle time:
# Trigger proactive background cognition
vera.focus_manager.run_proactive_cycle()

This generates new goals or alerts based on recent conversations and system state.
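To run this continuously, you could wrap the call in a simple idle-time timer. A hedged sketch; `system_is_idle` is a hypothetical helper, not part of Vera's API:

```python
import time

IDLE_CYCLE_SECONDS = 600  # run a proactive cycle every 10 minutes

while True:
    time.sleep(IDLE_CYCLE_SECONDS)
    if system_is_idle():  # hypothetical idle check
        vera.focus_manager.run_proactive_cycle()
```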
Vera supports streaming partial results from LLMs, improving user experience during long or complex queries:
for chunk in vera.stream_llm(vera.deep_llm, "Explain quantum computing."):
    print(chunk, end="")

You can add new tools by extending the load_tools method with new Tool objects, defining their name, function, and description.
Simple Example:
You can extend Vera's capabilities by adding new tools:
def load_tools(self):
    tools = super().load_tools()
    tools.append(
        Tool(
            name="WeatherAPI",
            func=lambda location: fetch_weather(location),
            description="Fetches current weather for a given location."
        )
    )
    return tools

Create specialized agents for your domain:
class DomainExpertAgent(Agent):
    """An agent specialized in your domain"""
    def __init__(self, name, llm, memory):
        super().__init__(name, llm, memory)
        self.expertise = "domain-specific-knowledge"

    def process_query(self, query):
        # Custom reasoning logic
        return self.llm.invoke(f"As a {self.expertise} expert: {query}")

# Register with Vera
vera.register_agent(DomainExpertAgent("expert", vera.deep_llm, vera.memory))

Create ingestors to pull data into Vera's memory:
class CustomIngestor(Ingestor):
    def ingest(self, source):
        """Extract data from source and insert into memory"""
        data = self.fetch_from_source(source)
        entities = self.parse_entities(data)
        relationships = self.extract_relationships(data)
        self.memory.bulk_insert_nodes(entities)
        self.memory.bulk_insert_relationships(relationships)

# Use it
vera.ingestors.append(CustomIngestor(vera.memory))

Vera is designed to be extensible and modular. Here are ways to contribute:
- Add new tools: Implement new `Tool` objects with clearly defined inputs and outputs.
- Improve memory models: Experiment with alternative vector DBs or memory encoding strategies.
- Enhance planning algorithms: Optimize or replace the tool chain planner for more efficient workflows.
- Expand self-modification capabilities: Enable more robust and safe code generation and auto-updating.
- Improve UX: Add richer streaming output, UI components, or integrations.
Vera runs entirely locally by default—no data is sent to external servers unless you explicitly configure external tools (APIs, web scraping, etc.).
Local Security Considerations:
- Vera has unrestricted Bash/Python execution. Only allow trusted users access.
- Memory databases (Neo4j, ChromaDB) should be behind authentication in multi-user deployments.
- Disable external tool access if processing sensitive data.
- Regularly review the Neo4j audit logs.
External Tool Integration:
If using external APIs:
- Secure API keys in `.env` files
- Use read-only API tokens where possible
- Monitor API call logs for unusual patterns
- Consider VPN/proxy for outbound connections
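A minimal sketch of the `.env` approach using python-dotenv; the variable name is illustrative:

```python
# Hedged sketch: keep API keys in a .env file rather than in source.
# Requires `pip install python-dotenv`; WEATHER_API_KEY is an illustrative name.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory
api_key = os.environ["WEATHER_API_KEY"]
```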
Data Retention:
- Layer 4 (Archive) stores everything permanently. Consider retention policies.
- Use `/memory-clear` to purge all long-term memories (irreversible).
- Export and back up Neo4j regularly:

neo4j-admin dump --database=neo4j --to=/backup/vera-backup.dump
Q: What LLMs does Vera support?
A: Vera is currently built around Ollama models (gemma2, gemma3, gpt-oss), but you can adapt it for any compatible LLM with a Python SDK or API.
Q: Can Vera run headless?
A: Yes. Vera is designed for command-line and backend automation, but can be integrated into GUIs or web apps.
Q: Is it safe for Vera to run self-modifying code?
A: Self-modification is sandboxed and requires careful review. Vera includes safeguards, but users should always review generated code before production use.
See the LICENSE file in the root directory of this project.
| Model Type | Example Models | Memory | Use Case | Status |
|---|---|---|---|---|
| Fast LLM | Mistral 7B, Gemma2 2B | 4-8GB | Triage, quick tasks | ✅ Supported |
| Intermediate | Gemma2 9B, Llama 8B | 8-16GB | Tool execution | ✅ Supported |
| Deep LLM | Gemma3 27B, GPT-OSS 20B | 16-32GB | Complex reasoning | ✅ Supported |
| Specialized | CodeLlama, Math models | Varies | Domain-specific | 🔄 Partial |
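For example, you might assign one model per tier from the table above using local Ollama tags. A hedged sketch; the dictionary keys are illustrative, not Vera's actual configuration schema:

```python
# Hedged sketch: one Ollama model tag per tier; adjust to the models you have pulled.
MODEL_ROLES = {
    "fast_llm": "gemma2:2b",          # triage, quick tasks
    "intermediate_llm": "gemma2:9b",  # tool execution
    "deep_llm": "gemma3:27b",         # complex reasoning
}
```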
TTS Pacing - On slower hardware, the TTS engine may speak faster than the LLM can generate text.
LLM Reasoning Not Visible - If an LLM has built-in reasoning (e.g. DeepSeek, gpt-oss), the reasoning is not displayed in the web or terminal UI, leading to a long gap between a query being accepted and an answer appearing.
- Windows configuration requires manual adaptation
- TTS pacing issues on slower hardware
- LLM reasoning not visible in UI for some models
- Resource-intensive on large knowledge graphs
For questions, feature requests, or help:
- GitHub Issues: https://github.com/BoeJaker/Vera-AI/issues
- GitHub Discussions: https://github.com/BoeJaker/Vera-AI/discussions
- Agentic Stack POC: https://github.com/BoeJaker/AgenticStack-POC
Planned Features:
- Multi-agent orchestration (POC complete)
- Persistent memory system (POC complete)
- Tool integration and planning (POC complete)
- Selective memory promotion (in development)
- Memory lifecycle management (planned)
- Micro models for specialized tasks (in development)
- Model overlay training system (in development)
- Self-modification engine (in development)
- Security analyzer agent (in development)
- Multi-modal I/O (Whisper speech-to-text, OCR, image recognition) (planned)
- Windows native support (planned)
- Distributed deployment (Kubernetes) (planned)
For detailed tracking, see the GitHub Projects board.
Last Updated: January 2025
Version: 1.0.0 (POC)

