KB4096D

A personal research experiment for knowledge that lives where the model thinks

Why this exists

Modern LLMs can sound knowledgeable while being structurally hard to update, audit, or share. Most “knowledge” workflows still force everything through text: prompts, documents, summaries, chains of thought. But the model does not think in text. It thinks in hidden states.

KB4096D is a proposal: treat knowledge as native activations and build a modular, shareable knowledge base directly inside a 4096-dimensional latent space.

Not a library. Not a product. An exploration.

What Currently Works

Everything described here has been run on TinyLlama and Llama 3.2, where “knowledge” can be read and modified directly inside each model. Hardware limits (GPU power and RAM) have kept me from testing larger models. I expect the approach to carry over because the underlying mechanism is the same, but that remains untested.

Current status: https://github.com/Desarius/KB4096D/blob/main/CurrentStatus.md

The hypothesis

If a model consistently represents meaning inside a stable latent space, then knowledge can be:

stored as vectors (not strings)
queried by geometry (not keywords)
edited without retraining
injected at runtime, reversibly
shared as compact .pt modules between users of the same base model

KB4096D explores that hypothesis under one explicit constraint:

Knowledge artifacts must be readable, composable, and operable in 4096D.

The core problems we are attacking

1) The semantic gap

Text is a lossy interface. The more we rely on text to represent knowledge, the more we lose the structure the model actually uses.

2) The extraction problem

We can ask a model to “explain” what it knows, but the explanation is not the knowledge. It is a narration. Extraction is not equivalent to representation.

3) The dimensionality problem

Knowledge in a network is not a list of facts. It is distributed. It has geometry. If we want a modular KB, we need a stable coordinate system where those distributions can live.

The stance

KB4096D rejects the idea that “knowledge” is best expressed as a paragraph.

Instead, KB4096D treats knowledge as:

vectors (activation patterns)
relations (directions and deltas)
clusters (centroids and neighborhoods)
routes (which modules matter right now)
interventions (runtime biasing or weight edits)

What “4096D” means here

4096D is not a sacred number. It is a pragmatic anchor:

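Many transformer backbones in the 7B to 8B range (for example Llama 2 7B and Llama 3 8B) expose hidden states of size 4096; smaller models use narrower widths, so the number should be read from the model configuration rather than hard-coded
The working assumption is that a large fraction of internal semantics becomes linearly accessible in that space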
It is a practical target for saving, indexing, and reinjecting meaning

KB4096D is about using the model’s native representation width as the “filesystem format” of knowledge.

What KB4096D is (conceptually)

A modular knowledge system operating on hidden states, built around these building blocks:

1) Knowledge Modules (.pt)

Each module is a standalone package of vectors:

concept vectors
relation vectors (deltas)
optional metadata for provenance and evaluation
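
As a rough illustration of the list above, a module could be modeled in memory as named vectors plus free-form metadata. This is only a sketch: the struct `KnowledgeModule`, its fields, and the helpers `AddConcept` and `ModuleCentroid` are hypothetical names, and nothing here describes how the actual .pt files are laid out.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory view of a knowledge module (not the real .pt layout).
struct KnowledgeModule
{
    std::size_t dim = 0;                                  // hidden-state width, e.g. 4096
    std::map<std::string, std::vector<float>> concepts;   // concept name -> vector
    std::map<std::string, std::vector<float>> relations;  // relation name -> delta vector
    std::map<std::string, std::string> metadata;          // provenance, evaluation notes
};

// Adds a concept, rejecting vectors whose width does not match the module.
void AddConcept(KnowledgeModule& m, const std::string& name, const std::vector<float>& v)
{
    if (v.size() != m.dim)
    {
        throw std::invalid_argument("concept width does not match module width");
    }
    m.concepts[name] = v;
}

// Mean of all concept vectors; returns a zero vector for an empty module.
std::vector<float> ModuleCentroid(const KnowledgeModule& m)
{
    std::vector<float> c(m.dim, 0.0f);
    if (m.concepts.empty())
    {
        return c;
    }
    for (const auto& kv : m.concepts)
    {
        for (std::size_t i = 0; i < m.dim; ++i)
        {
            c[i] += kv.second[i];
        }
    }
    for (float& x : c)
    {
        x /= static_cast<float>(m.concepts.size());
    }
    return c;
}
```

Keeping the width on the module itself makes a mismatch between a module and a base model fail loudly at load time instead of silently producing meaningless similarities.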

2) Router

A routing mechanism that decides which modules are relevant for the current context by comparing the current hidden state neighborhood to module centroids.
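
The router described above could be sketched as a threshold over cosine similarity to each module centroid. The names `ModuleEntry`, `RouteHit`, and `Route`, the single-vector view of the "neighborhood", and the fixed threshold are all assumptions made for illustration, not the project's actual routing policy.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Cosine similarity of two equal-length vectors; 0 if either is near zero.
float Cosine(const std::vector<float>& a, const std::vector<float>& b)
{
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        dot += static_cast<double>(a[i]) * b[i];
        na += static_cast<double>(a[i]) * a[i];
        nb += static_cast<double>(b[i]) * b[i];
    }
    if (na <= 1e-24 || nb <= 1e-24)
    {
        return 0.0f;
    }
    return static_cast<float>(dot / std::sqrt(na * nb));
}

struct ModuleEntry { std::string name; std::vector<float> centroid; };  // hypothetical index row
struct RouteHit { std::string name; float confidence; };               // hypothetical result row

// Returns the modules whose centroid similarity to `hidden` reaches `threshold`,
// best match first. An empty result is the fallback case: inject nothing.
std::vector<RouteHit> Route(const std::vector<float>& hidden,
                            const std::vector<ModuleEntry>& modules,
                            float threshold)
{
    std::vector<RouteHit> hits;
    for (const auto& m : modules)
    {
        if (m.centroid.size() != hidden.size())
        {
            continue;  // skip modules built for a different hidden width
        }
        const float s = Cosine(hidden, m.centroid);
        if (s >= threshold)
        {
            hits.push_back({m.name, s});
        }
    }
    std::sort(hits.begin(), hits.end(),
              [](const RouteHit& x, const RouteHit& y) { return x.confidence > y.confidence; });
    return hits;
}
```

Returning the similarity alongside the module name lets a later stage reuse it as a confidence value instead of recomputing it.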

3) Query Engine

Similarity search and compositional retrieval:

nearest-neighbor concepts
relation chaining (with explicit awareness of degradation)
merge and projection operations
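
A minimal sketch of the first two operations, assuming relations are applied as plain vector addition and lookup is a brute-force cosine ranking. `ConceptEntry`, `Match`, `NearestConcepts`, and `ApplyRelation` are hypothetical names, and a real store would probably need an index rather than a linear scan.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

using Vec = std::vector<float>;

struct ConceptEntry { std::string name; Vec values; };   // hypothetical stored concept
struct Match { std::string name; float similarity; };    // hypothetical query result

// Cosine similarity of two equal-length vectors; 0 if either is near zero.
float Cosine(const Vec& a, const Vec& b)
{
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        dot += static_cast<double>(a[i]) * b[i];
        na += static_cast<double>(a[i]) * a[i];
        nb += static_cast<double>(b[i]) * b[i];
    }
    if (na <= 1e-24 || nb <= 1e-24)
    {
        return 0.0f;
    }
    return static_cast<float>(dot / std::sqrt(na * nb));
}

// Up to k stored concepts ranked by cosine similarity to `query`.
std::vector<Match> NearestConcepts(const Vec& query,
                                   const std::vector<ConceptEntry>& store,
                                   std::size_t k)
{
    std::vector<Match> out;
    for (const auto& c : store)
    {
        if (c.values.size() == query.size())  // ignore entries of another width
        {
            out.push_back({c.name, Cosine(query, c.values)});
        }
    }
    std::sort(out.begin(), out.end(),
              [](const Match& x, const Match& y) { return x.similarity > y.similarity; });
    if (out.size() > k)
    {
        out.resize(k);
    }
    return out;
}

// One relation hop: translate `head` by a stored delta of the same width.
Vec ApplyRelation(const Vec& head, const Vec& delta)
{
    Vec out(head);
    for (std::size_t i = 0; i < out.size() && i < delta.size(); ++i)
    {
        out[i] += delta[i];
    }
    return out;
}
```

Chaining is then a matter of calling `ApplyRelation` repeatedly and looking up the result with `NearestConcepts`; checking the similarity of the top match after each hop is one way to notice degradation early.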

4) Injection Layer

Runtime interventions that steer the model’s internal trajectory:

additive bias on activations
gated injection based on routing confidence
reversible, inspectable behavior changes
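
The three properties above can be sketched on a single hidden-state vector. This assumes a hard confidence gate followed by confidence-scaled addition, and it makes the edit reversible by keeping a copy of the original state; `InjectionRecord`, `Inject`, and `Revert` are hypothetical names. In a PyTorch model the same edit would presumably sit in a forward hook on the target layer.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// What was done to a hidden state, kept so the edit can be inspected and undone.
struct InjectionRecord
{
    Vec original;   // copy of the hidden state before the edit
    bool applied;   // false when the gate stayed closed
};

// Adds strength * confidence * direction to `hidden` when the routing
// confidence reaches `min_confidence`; otherwise leaves `hidden` untouched.
InjectionRecord Inject(Vec& hidden,
                       const Vec& direction,
                       float strength,
                       float confidence,
                       float min_confidence)
{
    InjectionRecord rec{hidden, false};
    if (confidence < min_confidence || direction.size() != hidden.size())
    {
        return rec;  // gate closed, or direction built for another width
    }
    for (std::size_t i = 0; i < hidden.size(); ++i)
    {
        hidden[i] += strength * confidence * direction[i];
    }
    rec.applied = true;
    return rec;
}

// Restores the exact pre-injection state from the record.
void Revert(Vec& hidden, const InjectionRecord& rec)
{
    hidden = rec.original;
}
```

Restoring from a saved copy rather than subtracting the bias avoids floating-point round-off, so the reverted state is bit-identical to the original.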

The full loop (the cycle we care about)

KB4096D is not “store vectors and hope”. It is a closed loop:

Observe a target layer’s hidden states on real prompts
Extract candidate concept vectors (and deltas for relations)
Package them into a module (.pt)
Index the module (centroid, variance, tags, evaluation notes)
Route at inference time to select modules dynamically
Inject knowledge signals into the forward pass
Evaluate (does the behavior actually change, and is it stable)
Iterate with incremental updates, not full retraining
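
The extract and index steps of the loop could look roughly like this. Mean pooling is an assumed extraction method chosen for simplicity, not something the project is confirmed to use, and `MeanVector`, `RelationDelta`, and `Spread` are hypothetical helper names.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Vec = std::vector<float>;

// Concept vector as the element-wise mean of hidden states observed for it.
Vec MeanVector(const std::vector<Vec>& samples)
{
    if (samples.empty())
    {
        throw std::invalid_argument("no samples");
    }
    Vec mean(samples[0].size(), 0.0f);
    for (const Vec& s : samples)
    {
        if (s.size() != mean.size())
        {
            throw std::invalid_argument("samples differ in width");
        }
        for (std::size_t i = 0; i < mean.size(); ++i)
        {
            mean[i] += s[i];
        }
    }
    for (float& x : mean)
    {
        x /= static_cast<float>(samples.size());
    }
    return mean;
}

// Relation delta as the mean of (tail - head) over paired examples.
Vec RelationDelta(const std::vector<Vec>& heads, const std::vector<Vec>& tails)
{
    if (heads.size() != tails.size())
    {
        throw std::invalid_argument("heads and tails must be paired");
    }
    std::vector<Vec> diffs;
    for (std::size_t i = 0; i < heads.size(); ++i)
    {
        if (heads[i].size() != tails[i].size())
        {
            throw std::invalid_argument("pair differs in width");
        }
        Vec d(heads[i].size());
        for (std::size_t j = 0; j < d.size(); ++j)
        {
            d[j] = tails[i][j] - heads[i][j];
        }
        diffs.push_back(d);
    }
    return MeanVector(diffs);
}

// Mean squared distance of the samples from a centroid: one scalar "variance"
// that an index entry could store next to the centroid.
float Spread(const std::vector<Vec>& samples, const Vec& centroid)
{
    if (samples.empty())
    {
        return 0.0f;
    }
    double total = 0.0;
    for (const Vec& s : samples)
    {
        for (std::size_t i = 0; i < s.size() && i < centroid.size(); ++i)
        {
            const double d = static_cast<double>(s[i]) - centroid[i];
            total += d * d;
        }
    }
    return static_cast<float>(total / static_cast<double>(samples.size()));
}
```

A large spread relative to the centroid norm would be one signal that a concept is not stable enough to package.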

What it is not

Not a RAG system
Not a prompt framework
Not “knowledge as text”
Not interpretability theater
Not a promise of perfect symbolic logic inside a neural net

KB4096D assumes imperfection and makes it explicit:

multi-hop relations degrade
interventions can have side effects
extraction is approximate
interpretability is partial

The goal is not purity. The goal is control, modularity, and repeatability.
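
One way to make "multi-hop relations degrade" concrete is a simple additive error model. This is an assumption for illustration, not a measured result: suppose each stored delta equals the ideal delta plus an error term, and hops are applied by addition starting from $v_0$, with $v_k = v_0 + \sum_{i=1}^{k} \delta_i$ the ideal endpoint.

```latex
\[
\hat{\delta}_i = \delta_i + \varepsilon_i,
\qquad
\hat{v}_k = v_0 + \sum_{i=1}^{k} \hat{\delta}_i = v_k + \sum_{i=1}^{k} \varepsilon_i
\]
\[
\lVert \hat{v}_k - v_k \rVert \le \sum_{i=1}^{k} \lVert \varepsilon_i \rVert
\]
\[
\mathbb{E}\,\lVert \hat{v}_k - v_k \rVert^{2} = \sum_{i=1}^{k} \mathbb{E}\,\lVert \varepsilon_i \rVert^{2}
\quad \text{if the } \varepsilon_i \text{ are independent with zero mean}
\]
```

Under this model the worst case grows linearly with the number of hops, while independent errors of similar size grow like $\sqrt{k}$, which is why chaining needs explicit drift checks rather than blind composition.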

Design principles

Native first: Operate in the model’s latent space. Text is an interface, not a storage layer.
Composable knowledge: Modules must be mergeable and shareable without rewriting the whole system.
Reversible interventions: Runtime injection should be togglable and measurable.
Incremental updates: Prefer small patches over monolithic retraining cycles.
Evaluation or it does not exist: Every module needs measurable claims and failure cases.

A tiny, concrete taste (geometry over words)

Cosine similarity is the primitive. Routing is a policy over similarity.

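#include <cmath>
#include <cstddef>

// Dot product of two float arrays of length n.
static float Dot(const float* a, const float* b, std::size_t n)
{
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
    {
        s += a[i] * b[i];
    }
    return s;
}

// Euclidean (L2) norm of a float array of length n.
static float Norm(const float* a, std::size_t n)
{
    return std::sqrt(Dot(a, a, n));
}

// Cosine similarity of two 4096-dimensional vectors.
// Both pointers must reference at least 4096 floats.
float CosineSimilarity4096(const float* a, const float* b)
{
    constexpr std::size_t N = 4096;
    const float na = Norm(a, N);
    const float nb = Norm(b, N);

    // Near-zero vectors have no meaningful direction; report no similarity.
    if (na <= 1e-12f || nb <= 1e-12f)
    {
        return 0.0f;
    }

    return Dot(a, b, N) / (na * nb);
}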

Open questions (the real research)

Which layers yield the most stable “knowledge coordinates” for a given model family

How to prevent injection from becoming brittle prompt-hacking in disguise

How to represent relations robustly without exploding drift across hops

How to compare modules across checkpoints or quantization variants

When weight edits beat runtime steering, and when they are dangerous

How to make provenance, trust, and reproducibility first-class

Whether a complete model could be grown purely by composing multiple .pt modules

Roadmap (direction, not promises)

A minimal module format for concept vectors + relation deltas

A routing policy with confidence and fallback logic

A runtime injection interface with toggles and metrics

Benchmarks that measure “knowledge patch impact” under perturbations

A small zoo of modules that demonstrate compositional behavior
