The deterministic execution layer for the agent economy — fail-closed policy enforcement, cryptographic receipts, replayable proofs.
# HELM — Fail-Closed Tool Calling for AI Agents


OpenAI-compatible proxy that enforces tool execution and emits verifiable cryptographic receipts.

The spec is broader than v0.1 by design — see docs/OSS_CUTLINE.md for the exact shipped guarantees.

- **1-line integration** — swap `base_url`, keep everything else
- **EvidencePack export** — deterministic `.tar.gz`, verifiable offline, sue-grade
- **Bounded compute** — WASI sandbox with gas/time/memory caps, approval ceremonies with timelocks
- **Canonical standard** — HELM Unified Canonical Standard v1.2

## Quickest path to a receipt

```bash
docker compose up -d && curl -s localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"gpt-4","messages":[{"role":"user","content":"hello"}]}' | jq .id
```

## Start from source

```bash
git clone https://github.com/Mindburn-Labs/helm.git && cd helm
docker compose up -d
curl -s http://localhost:8080/health   # → OK
```

## Run the proof loop

```bash
make build && make crucible            # 12 use cases + conformance L1/L2
./bin/helm export --evidence ./data/evidence --out pack.tar.gz
./bin/helm verify --bundle pack.tar.gz # offline — no network
```

## One-line integration

```diff
- client = openai.OpenAI()
+ client = openai.OpenAI(base_url="http://localhost:8080/v1")
```

That's it. Your app doesn't change. Every tool call now produces a signed receipt in an append-only DAG.


## 5-Minute Proof Loop

Goal: prove it works without trusting us. You can verify the EvidencePack and replay it without network access.

```bash
# 1. Start
docker compose up -d

# 2. Trigger a deny (schema mismatch → fail-closed)
curl -s http://localhost:8080/v1/tools/execute \
  -H 'Content-Type: application/json' \
  -d '{"tool":"unknown_tool","args":{"bad_field":true}}' | jq .reason_code
# → "ERR_TOOL_NOT_FOUND"

# 3. View receipt
curl -s http://localhost:8080/api/v1/receipts?limit=1 | jq '.[0].receipt_hash'

# 4. Export EvidencePack
./bin/helm export --evidence ./data/evidence --out pack.tar.gz

# 5. Offline replay verify — no network required
./bin/helm verify --bundle pack.tar.gz
# → "verification: PASS"  (air-gapped safe)

# 6. Run conformance L1/L2
./bin/helm conform --profile L2 --json
# → {"profile":"L2","verdict":"PASS","gates":12}
```

Full walkthrough: docs/QUICKSTART.md · Copy-paste demo: docs/DEMO.md · 5-min micro-guide: docs/INTEGRATE_IN_5_MIN.md
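The conformance JSON from step 6 is easy to gate on in CI. A minimal sketch in Python — the `{"profile","verdict","gates"}` shape comes from the sample output above, while the helper name and gating logic are illustrative, not part of HELM:

```python
import json
import sys

def conformance_ok(raw: str, want_profile: str = "L2") -> bool:
    """Parse `helm conform --json` output and decide pass/fail.

    Assumes the {"profile","verdict","gates"} shape shown in step 6.
    """
    report = json.loads(raw)
    return report.get("profile") == want_profile and report.get("verdict") == "PASS"

if __name__ == "__main__":
    # In CI you would pipe `./bin/helm conform --profile L2 --json` into stdin;
    # here we reuse the sample output from the proof loop above.
    sample = '{"profile":"L2","verdict":"PASS","gates":12}'
    if not conformance_ok(sample):
        sys.exit(1)
    print("conformance gate: PASS")
```

A failing verdict (or the wrong profile) makes the gate exit nonzero, which is enough to fail most CI pipelines.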


## Why Devs Should Care

| Pain (postmortem you're preventing) | HELM behavior | Receipt reason code | Proof |
|---|---|---|---|
| Tool-call overspend blows budget | ACID budget locks, fail-closed on ceiling breach | `DENY_BUDGET_EXCEEDED` | UC-005 |
| Schema drift breaks prod silently | Fail-closed on input AND output schema mismatch | `DENY_SCHEMA_MISMATCH` | UC-002, UC-009 |
| Untrusted WASM runs wild | Sandbox: gas + time + memory budgets, deterministic traps | `DENY_GAS_EXHAUSTION` | UC-004 |
| "Who approved that?" disputes | Timelock + challenge/response ceremony, Ed25519-signed | `DENY_APPROVAL_REQUIRED` | UC-003 |
| No audit trail for regulators | Deterministic EvidencePack, offline verifiable, replay from genesis | | UC-008 |
| Can't prove compliance to auditors | Conformance L1 + L2 gates, 12 runnable use cases | | UC-012 |
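Client code can branch on these reason codes directly. A minimal sketch in Python — the `ReceiptDenied` exception, the code sets, and the helper are illustrative, not part of any shipped SDK; only the reason codes themselves come from the table above:

```python
class ReceiptDenied(Exception):
    """Raised when HELM fail-closes a tool call."""
    def __init__(self, reason_code: str):
        super().__init__(reason_code)
        self.reason_code = reason_code

RETRYABLE = {"DENY_BUDGET_EXCEEDED"}       # wait for the budget window to reset
NEEDS_HUMAN = {"DENY_APPROVAL_REQUIRED"}   # route to an approval ceremony

def handle_denial(response: dict) -> None:
    """Inspect a HELM response dict and raise a typed error on denial."""
    code = response.get("reason_code", "ALLOW")
    if code != "ALLOW":
        raise ReceiptDenied(code)

# Example: a schema-mismatch denial
try:
    handle_denial({"reason_code": "DENY_SCHEMA_MISMATCH"})
except ReceiptDenied as e:
    print("denied:", e.reason_code)   # denied: DENY_SCHEMA_MISMATCH
```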

## Integrations

### Python — OpenAI SDK

The only change:

```diff
- client = openai.OpenAI()
+ client = openai.OpenAI(base_url="http://localhost:8080/v1")
```

Full snippet:

```python
import openai

client = openai.OpenAI(base_url="http://localhost:8080/v1")

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "List files in /tmp"}],
)
print(response.choices[0].message.content)
# Response headers include:
#   X-Helm-Receipt-ID: rec_a1b2c3...
#   X-Helm-Output-Hash: sha256:7f83b1...
#   X-Helm-Lamport-Clock: 42
```

→ Full example: examples/python_openai_baseurl/main.py

### TypeScript — Vercel AI SDK / fetch

The only change:

```diff
- const BASE = "https://api.openai.com/v1";
+ const BASE = "http://localhost:8080/v1";
```

Full snippet:

```ts
const response = await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "gpt-4",
    messages: [{ role: "user", content: "What time is it?" }],
  }),
});
const data = await response.json();
console.log(data.choices[0].message.content);
// X-Helm-Receipt-ID: rec_d4e5f6...
```

→ Full example: examples/js_openai_baseurl/main.js

### MCP Gateway

```bash
# List governed capabilities
curl -s http://localhost:8080/mcp/v1/capabilities | jq '.tools[].name'

# Execute a governed tool call
curl -s -X POST http://localhost:8080/mcp/v1/execute \
  -H 'Content-Type: application/json' \
  -d '{"method":"file_read","params":{"path":"/tmp/test.txt"}}' | jq .
# → { "result": ..., "receipt_id": "rec_...", "reason_code": "ALLOW" }
```
→ Full example: [examples/mcp_client/main.sh](examples/mcp_client/main.sh)
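The same call from Python, sketched without a live gateway: the payload and response shapes mirror the curl example above, while both helper functions are hypothetical (HELM's actual MCP gateway contract may differ):

```python
import json

def build_execute_payload(method: str, params: dict) -> bytes:
    """Body for POST /mcp/v1/execute, matching the curl example above."""
    return json.dumps({"method": method, "params": params}).encode()

def parse_mcp_result(body: bytes):
    """Return (result, receipt_id), or raise on a non-ALLOW reason code."""
    reply = json.loads(body)
    if reply.get("reason_code") != "ALLOW":
        raise PermissionError(reply.get("reason_code", "UNKNOWN"))
    return reply["result"], reply["receipt_id"]

payload = build_execute_payload("file_read", {"path": "/tmp/test.txt"})
# With a running gateway you would POST `payload` to /mcp/v1/execute;
# here we parse a response of the documented shape instead.
result, receipt = parse_mcp_result(
    b'{"result": "hello", "receipt_id": "rec_123", "reason_code": "ALLOW"}'
)
print(receipt)  # rec_123
```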

---

## SDKs

Typed clients for 5 languages. All generated from [api/openapi/helm.openapi.yaml](api/openapi/helm.openapi.yaml).

| Language | Install | Docs |
|----------|---------|------|
| TypeScript | `npm install @mindburn/helm-sdk` | [sdk/ts/README.md](sdk/ts/README.md) |
| Python | `pip install helm-sdk` | [sdk/python/README.md](sdk/python/README.md) |
| Go | `go get github.com/Mindburn-Labs/helm/sdk/go` | [sdk/go/README.md](sdk/go/README.md) |
| Rust | `cargo add helm-sdk` | [sdk/rust/README.md](sdk/rust/README.md) |
| Java | Maven `ai.mindburn.helm:helm-sdk:0.1.0` | [sdk/java/README.md](sdk/java/README.md) |

Every SDK exposes the same primitives: `chatCompletions`, `approveIntent`, `listSessions`, `getReceipts`, `exportEvidence`, `verifyEvidence`, `conformanceRun`.

Every error includes a typed `reason_code` (e.g. `DENY_TOOL_NOT_FOUND`).

**Go — denial-handling example:**

```go
c := helm.New("http://localhost:8080")
_, err := c.ChatCompletions(helm.ChatCompletionRequest{
    Model:    "gpt-4",
    Messages: []helm.ChatMessage{{Role: "user", Content: "List /tmp"}},
})
if apiErr, ok := err.(*helm.HelmApiError); ok {
    fmt.Println("Denied:", apiErr.ReasonCode) // DENY_TOOL_NOT_FOUND
}
```

**Rust:**

```rust
let c = HelmClient::new("http://localhost:8080");
match c.chat_completions(&req) {
    Ok(res) => println!("{:?}", res.choices[0].message.content),
    Err(e) => println!("Denied: {:?}", e.reason_code),
}
```

**Java:**

```java
var helm = new HelmClient("http://localhost:8080");
try { helm.chatCompletions(req); }
catch (HelmApiException e) { System.out.println(e.reasonCode); }
```

Full examples: examples/ · SDK docs: docs/sdks/00_INDEX.md


## OpenAPI Contract

api/openapi/helm.openapi.yaml — OpenAPI 3.1 spec.

Single source of truth. SDKs are generated from it. CI prevents drift.



## How It Works

```text
Your App (OpenAI SDK)
       │
       │ base_url = localhost:8080
       ▼
   HELM Proxy ──→ Guardian (policy: allow/deny)
       │                │
       │           PEP Boundary (JCS canonicalize → SHA-256)
       │                │
       ▼                ▼
   Executor ──→ Tool ──→ Receipt (Ed25519 signed)
       │                        │
       ▼                        ▼
  ProofGraph DAG          EvidencePack (.tar.gz)
  (append-only)           (offline verifiable)
       │
       ▼
  Replay Verify
  (air-gapped safe)
```
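The ProofGraph step can be illustrated with a toy append-only log: each receipt carries a Lamport clock and the hash of its parent, so a verifier can replay the whole chain offline and detect any rewrite of history. This is a sketch of the idea only — the field names and single-parent chain are simplifications, and real HELM receipts are additionally Ed25519-signed:

```python
import hashlib
import json

GENESIS = "0" * 64

def receipt_hash(receipt: dict) -> str:
    # Canonical-ish encoding: sorted keys, no whitespace (real HELM uses JCS).
    blob = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

def append_receipt(log: list, payload: dict) -> dict:
    """Append a receipt linked to its parent, with a Lamport clock."""
    parent = log[-1] if log else None
    receipt = {
        "lamport": (parent["lamport"] + 1) if parent else 1,
        "parent_hash": receipt_hash(parent) if parent else GENESIS,
        "payload": payload,
    }
    log.append(receipt)
    return receipt

def replay_verify(log: list) -> bool:
    """Offline check: every link and clock must be consistent from genesis."""
    prev_hash, prev_clock = GENESIS, 0
    for receipt in log:
        if receipt["parent_hash"] != prev_hash or receipt["lamport"] != prev_clock + 1:
            return False
        prev_hash, prev_clock = receipt_hash(receipt), receipt["lamport"]
    return True

chain: list = []
append_receipt(chain, {"tool": "file_read", "reason_code": "ALLOW"})
append_receipt(chain, {"tool": "unknown_tool", "reason_code": "ERR_TOOL_NOT_FOUND"})
print(replay_verify(chain))              # True
chain[0]["payload"]["tool"] = "rm_rf"    # tamper with history...
print(replay_verify(chain))              # False: the chain no longer replays
```

Because each receipt's hash covers its parent's hash, editing any earlier receipt breaks every link after it — which is what makes the EvidencePack verifiable with no network access.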

## What Ships vs What's Spec

| Shipped in OSS v0.1 | Spec (future / enterprise) |
|---|---|
| ✅ OpenAI-compatible proxy | 🔮 Multi-model gateway |
| ✅ Schema PEP (input + output) | 🔮 ZK-CPI (zero-knowledge proofs) |
| ✅ ProofGraph DAG (Lamport + Ed25519) | 🔮 Hardware TEE attestation |
| ✅ WASI sandbox (gas/time/memory) | 🔮 Post-quantum cryptography |
| ✅ Approval ceremonies (timelock + challenge) | 🔮 Multi-org federation |
| ✅ Trust registry (event-sourced) | 🔮 Formal verification (SMT/LTL) |
| ✅ EvidencePack export + offline replay | 🔮 Cross-tenant ProofGraph merge |
| ✅ Conformance L1 + L2 | 🔮 Conformance L3 (enterprise) |
| ✅ 11 CLI commands | 🔮 Production key management (HSM) |

Full cutline: docs/OSS_CUTLINE.md


## Verification

```bash
make test       # 58 packages, 0 failures
make crucible   # 12 use cases + conformance L1/L2
make lint       # go vet, clean
```

## Deploy

```bash
# Local demo
docker compose up -d

# Production (DigitalOcean / any Docker host)
docker compose -f docker-compose.demo.yml up -d
```

deploy/README.md — deploy your own in 3 minutes


## Project Structure

```text
helm/
├── api/openapi/        # OpenAPI 3.1 spec (single source of truth)
├── core/               # Go kernel (8-package TCB + executor + ProofGraph)
│   ├── cmd/helm/       # CLI: proxy, export, verify, replay, conform, ...
│   └── cmd/helm-node/  # Kernel API server
├── sdk/                # Multi-language SDKs (TS, Python, Go, Rust, Java)
│   ├── ts/             #   npm @mindburn/helm-sdk
│   ├── python/         #   pip helm-sdk
│   ├── go/             #   go get .../sdk/go
│   ├── rust/           #   cargo add helm-sdk
│   └── java/           #   mvn ai.mindburn.helm:helm-sdk
├── examples/           # Runnable examples per language + MCP
├── scripts/sdk/        # Type generator (gen.sh)
├── scripts/ci/         # SDK drift + build gates
├── deploy/             # Caddy config, demo compose, deploy guide
├── docs/               # Threat model, quickstart, demo, SDK docs
└── Makefile            # build, test, crucible, demo, release-binaries
```

## Scope and Guarantees

OSS v0.1 targets L1/L2 core conformance. Spec contains L2/L3 and enterprise/2030 extensions — see docs/OSS_CUTLINE.md for the exact shipped-vs-spec boundary.


## Security Posture

- **TCB isolation gate** — 8-package kernel boundary, CI-enforced forbidden imports (TCB Policy)
- **Bounded compute gate** — WASI sandbox with gas/time/memory caps, deterministic traps on breach (UC-005)
- **Schema drift fail-closed** — JCS canonicalization + SHA-256 on every tool call, both input and output (UC-002)
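The schema-drift gate boils down to "hash the canonical bytes, refuse on mismatch". A Python approximation — sorted keys with compact separators stand in for full RFC 8785 JCS, and the pinned hash, shape-hashing trick, and function names are all illustrative rather than HELM's actual implementation:

```python
import hashlib
import json

def canonical_hash(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding (JCS-like approximation)."""
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

def enforce_schema(payload: dict, pinned_hash: str) -> dict:
    """Fail-closed: execute only if the canonical shape hash matches the pin."""
    # Hash the *shape* (field names + value types), not the values, so the
    # same schema with different arguments still matches the pin.
    shape = {k: type(v).__name__ for k, v in payload.items()}
    if canonical_hash(shape) != pinned_hash:
        raise PermissionError("DENY_SCHEMA_MISMATCH")
    return payload

pinned = canonical_hash({"path": "str"})
enforce_schema({"path": "/tmp/test.txt"}, pinned)            # passes
try:
    enforce_schema({"path": "/tmp/x", "mode": "w"}, pinned)  # drifted schema
except PermissionError as e:
    print(e)  # DENY_SCHEMA_MISMATCH
```

The fail-closed property is the point: an unrecognized shape never falls through to execution, it can only raise.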

See also: SECURITY.md (vulnerability reporting) · Threat Model (11 adversary classes)


## Contributing

See CONTRIBUTING.md. Good first issues: conformance improvements, SDK enhancements, docs truth fixes.

## Roadmap

See docs/ROADMAP.md. 10 items, no dates, each tied to a conformance level.

## License

Business Source License 1.1 — converts to Apache 2.0 on 2030-02-15.


Built by Mindburn Labs.