Community PyTorch implementation and reproduction scaffold for Memory Caching: RNNs with Growing Memory.
Installable runtime modules, explicit claim boundaries, model-backed scientific artifacts, and publication-grade reproduction tooling live in one repository, with the stable package surface kept intentionally narrow.
```
memory_caching/
├── src/memory_caching/         # Stable package surface published as memory-caching
│   ├── layer.py                # Memory Caching wrapper
│   ├── backends/               # Linear, DLA, Titans, SWLA (c=2)
│   ├── bench/                  # Benchmark adapters, runners, manifests
│   ├── models.py               # Tiny model-backed scientific path
│   └── scientific_manifest.py  # Scientific artifact truthfulness checks
├── configs/                    # Train + benchmark + baseline-tracking configs
├── docs/                       # Reproduction, release, API, and claim-boundary docs
├── examples/                   # Stable public examples
├── scripts/                    # Train / eval / gate / packaging entrypoints
└── tests/                      # Backend, API, benchmark, and release-path coverage
```
| Area | Current State |
|---|---|
| Stable runtime package | memory-caching==0.1.0 on PyPI |
| Wrapper mechanisms | Residual / GRM / Soup / SSC |
| Segmentation | Constant and logarithmic |
| Backends | linear, dla, titans, swla(c=2) |
| Scientific artifact path | Model-backed, truthful manifests, non-smoke targets |
| Public release status | Publishable package surface with explicit release preflight |
| Full paper parity | Still blocked by incomplete baseline evidence and larger parity gaps |
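The constant and logarithmic segmentation schemes listed above can be sketched as follows. This is an illustrative toy, not the repo's actual segmenter: the function names and the exact doubling rule are assumptions.

```python
def constant_segments(n_tokens, seg_len):
    """Fixed-size segments: [0, seg_len), [seg_len, 2*seg_len), ..."""
    return [(s, min(s + seg_len, n_tokens)) for s in range(0, n_tokens, seg_len)]


def logarithmic_segments(n_tokens):
    """Doubling segment lengths (1, 2, 4, ...), giving O(log n) segments."""
    bounds, start, size = [], 0, 1
    while start < n_tokens:
        end = min(start + size, n_tokens)
        bounds.append((start, end))
        start, size = end, size * 2
    return bounds
```

The payoff of the logarithmic scheme is that a sequence of length n is covered by roughly log2(n) segments, which is what keeps a growing-memory cache small.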
| Scope | Status |
|---|---|
| Stable public PyTorch package | Active |
| Mechanism-faithful MC wrapper implementation | Implemented |
| Engineering scaffold and packaging integrity | Green |
| Scientific gate with model-backed evidence | Green |
| Full table-level paper parity | Blocked by missing baselines |
This is not official author code. See reproduction_report.md, CLAIM_TO_EVIDENCE_MATRIX.md, and PAPER_PARITY_BLOCKERS.md for the exact claim surface.
Quickstart with uv:

```shell
uv sync --extra dev
uv run mc list-variants
uv run mc smoke-eval --backend linear --device cpu --warmup-steps 1 --batch-size 1 --seq-len 8 --vocab-size 16 --d-model 8 --num-heads 2
```

Or with a plain virtualenv:

```shell
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
mc list-variants
```

Install from PyPI:

```shell
python -m pip install memory-caching
python -c "from memory_caching import MCConfig, MemoryCachingLayer; print('ok')"
```

PyTorch is installed separately so the build matches your platform.

- CPU-only example:

```shell
python -m pip install torch --index-url https://download.pytorch.org/whl/cpu
```

- CUDA 12.1 example:

```shell
python -m pip install torch --index-url https://download.pytorch.org/whl/cu121
```

Install a torch build that matches your local CUDA runtime and driver stack before running CUDA workflows.
The supported top-level runtime imports are:

- `memory_caching.MCConfig`
- `memory_caching.MemoryCachingLayer`
- `memory_caching.SegmentCache`
- `memory_caching.LinearMemoryBackend`
- `memory_caching.DLABackend`
- `memory_caching.TitansBackend`
- `memory_caching.SWLABackend`

For runtime use, prefer:

- `layer(x)` for the normal forward path
- `layer.forward_with_cache(x)` when cached segment checkpoints are needed
- `layer.inspect(x)` when per-token routing/debug rows are needed
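As a call-pattern sketch only: the stub below is hypothetical and exists to show the shape of the three documented entry points; the real `MemoryCachingLayer` constructor arguments and return types are specified in PUBLIC_API.md.

```python
class StubLayer:
    """Hypothetical stand-in mirroring the three documented entry points."""

    def __call__(self, x):
        # Normal forward path: a shape-preserving transform.
        return [2 * v for v in x]

    def forward_with_cache(self, x):
        # Forward pass plus cached segment checkpoints.
        return self(x), {"segments": [list(x)]}

    def inspect(self, x):
        # Per-token routing/debug rows.
        return [{"token": i, "value": v} for i, v in enumerate(x)]


layer = StubLayer()
x = [1.0, 2.0]
y = layer(x)                        # plain forward
y2, cache = layer.forward_with_cache(x)  # forward + cache
rows = layer.inspect(x)             # debug rows
```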
Repo tooling is intentionally broader than the public package API. CLI wiring, smoke helpers, benchmark runners, release gates, and report-generation scripts remain repo-level tooling rather than stable semver-tracked runtime surface.
Full API notes: see PUBLIC_API.md.
Namespaced experimental/reference modules now present in the package:

- `memory_caching.baselines.LogLinearPP`
- `memory_caching.loglinear.LogLinearAttentionReference`
- `memory_caching.loglinear.ChunkedLogLinearAttentionReference`
| Topic | Link | Purpose |
|---|---|---|
| Documentation index | docs/README.md | Fast entrypoint to the full doc set |
| Reproduction status | reproduction_report.md | What is implemented, what is blocked |
| Public runtime API | PUBLIC_API.md | Stable import surface and boundaries |
| Log-linear terminology | LOG_LINEAR_TERMINOLOGY.md | Separates LogLinearPP from original LogLinearAttention |
| LogLinearPP baseline | LOG_LINEAR_PP_BASELINE.md | MC-paper baseline preset semantics |
| LogLinearAttention reference | LOG_LINEAR_ATTENTION_REFERENCE.md | Original mechanism reference-path status |
| Architecture | ARCHITECTURE.md | Layer flow, backend roles, artifact pipeline |
| Claim discipline | CLAIM_TO_EVIDENCE_MATRIX.md | Claim-to-evidence mapping |
| Claim boundaries | CLAIM_BOUNDARY.md | What is explicitly out of claim scope |
| Paper mapping | PAPER_TO_CODE.md | Paper mechanism to implementation map |
| Progress ledger | PROGRESS_LEDGER.md | Current weighted plan state |
| Paper parity blockers | PAPER_PARITY_BLOCKERS.md | What still blocks literal parity claims |
| Release runbook | PYPI_RELEASE_RUNBOOK.md | Package publishing path |
| Support matrix | CONSUMER_SUPPORT_MATRIX.md | User-facing environment support |
```shell
# List implemented backend/aggregation variants
mc list-variants

# Minimal CPU smoke eval
mc smoke-eval --backend linear --device cpu --warmup-steps 1 --batch-size 1 --seq-len 8 --vocab-size 16 --d-model 8 --num-heads 2

# Debug routing and cache behavior
uv run mc debug-layer --backend linear --aggregation grm --seq-len 8 --d-model 8 --num-heads 2 --out-json outputs/debug/debug_layer.json

# Repository engineering gate
uv run python scripts/reports/release_gate_v1.py --mode repo --out outputs/reports/release_gate_repo_v1.json

# Scientific gate
uv run python scripts/reports/release_gate_v1.py --mode scientific --out outputs/reports/release_gate_scientific_v1.json
```

For dense command coverage, use:
Canonical terminology used in this repository:

- engineering scaffold: code quality, reproducibility, packaging, and report-generation integrity
- scientific evidence: model-backed artifacts with non-smoke targets and truthful manifests
- paper parity: faithful reproduction of the paper's reported baselines, metrics, and missing comparison rows

Scientific evidence is stricter than the engineering scaffold, but it is still not the same as paper parity.
What a green scientific gate still does not prove:
- full paper parity
- full evaluation evidence for `LogLinearPP`
- original `LogLinearAttention`, which remains a separate future mechanism track
- throughput parity or unpublished internal-author equivalence
Backend-specific limits also remain important:
- `linear` is an unnormalized matrix-memory reference path, not a normalized linear-attention parity claim
- `dla`, `titans`, and `swla` are mechanism-oriented reference implementations
- `titans` and `swla` currently use constant scalar coefficients where the paper presents time-indexed coefficients
- `soup` is true state-space mixing only when the backend supports state mixing; otherwise the repo uses an explicit output-mixture fallback
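The distinction between the unnormalized matrix-memory path and normalized linear attention can be illustrated with a scalar toy. Everything here is an assumption for illustration: the repo's `linear` backend operates on matrices and heads, not scalars, and the function name is invented.

```python
def toy_linear_memory(qs, ks, vs, normalize=False):
    """Causal scalar toy of a linear-attention memory.

    Unnormalized: y_t = q_t * S_t,      S_t = sum_{i<=t} k_i * v_i
    Normalized:   y_t = q_t * S_t / (q_t * z_t),  z_t = sum_{i<=t} k_i
    """
    S = z = 0.0
    out = []
    for q, k, v in zip(qs, ks, vs):
        S += k * v      # accumulate the (rank-1, here scalar) memory
        z += k          # accumulate the normalizer mass
        y = q * S
        if normalize:
            y /= q * z  # normalized linear attention divides by key mass
        out.append(y)
    return out
```

With identical keys and growing values, the unnormalized path sums contributions while the normalized path averages them, which is why the two are not interchangeable parity claims.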
| Surface | Command | Output |
|---|---|---|
| Editable source install | `python -m pip install -e .` | local runtime package |
| Dev install | `python -m pip install -e ".[dev]"` | local dev + tests + packaging tools |
| Built wheel install | `python -m pip install dist/*.whl` | release-like install path |
| Repo engineering gate | `uv run python scripts/reports/release_gate_v1.py --mode repo ...` | package/repo integrity |
| Scientific gate | `uv run python scripts/reports/release_gate_v1.py --mode scientific ...` | scientific artifact integrity |
| PyPI release preflight | `uv run python scripts/checks/pypi_release_preflight.py` | publish-readiness report |
Stable published examples:
- examples/minimal_layer.py
- examples/inspect_layer.py
- examples/loglinear_reference.py
- examples/loglinear_chunked_reference.py
Pilot runner:

```shell
uv run ./scripts/checks/loglinear_pilot.sh
```

Current namespaced research/reference surfaces:

- `memory_caching.baselines.LogLinearPP`
- `memory_caching.loglinear.LogLinearAttentionReference`
- `memory_caching.loglinear.ChunkedLogLinearAttentionReference`
- tiny-model families: `tiny_loglinear_ref_lm`, `tiny_loglinear_chunked_lm`
Sample subset dataset files included for benchmark dry runs:

- `examples/longbench_subset.jsonl`
- `examples/retrieval_subset.jsonl`
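A minimal way to load such JSONL subsets for a dry run. The helper is generic; the field names inside each record depend on the subset files themselves and are not assumed here.

```python
import json


def load_jsonl(path):
    """Read one JSON object per non-empty line of a .jsonl file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```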
| Resource | Location |
|---|---|
| Paper | arXiv:2602.24281 |
| PyPI | pypi.org/project/memory-caching |
| GitHub | github.com/kmccleary3301/memory_caching |
| Release | v0.1.0 |
If you use this repository, cite the original paper and this implementation.
```bibtex
@article{chandra2026memorycaching,
  title={Memory Caching: RNNs with Growing Memory},
  author={Chandra, ...},
  journal={arXiv preprint arXiv:2602.24281},
  year={2026}
}
```

```bibtex
@software{memory_caching2026,
  title={memory-caching: Community PyTorch Implementation of Memory Caching},
  author={McCleary, Kyle},
  url={https://github.com/kmccleary3301/memory_caching},
  year={2026}
}
```

Licensed under MIT. The public package surface is documented, and paper-parity limits are documented explicitly.