Agent anchor mechanism for post-compaction state recovery #1032

## Summary

Define a persistent "anchor" mechanism that ensures agents maintain coherent working state after context compaction. When an agent's context window is compressed, it risks losing track of its current task, decisions made with other agents, and progress markers. This mechanism provides a structured recovery point.

## Motivation

In the async team model (#1027, #1028, #1030), agents run longer sessions, coordinate via messaging, and make incremental progress across complex tasks. Context compaction is inevitable in long-running sessions, and without a recovery mechanism, compacted agents can:

- **Lose task focus** — forget what sub-task they're working on or repeat completed work
- **Forget decisions** — re-open questions already resolved with other agents (e.g., "coder and tester agreed on approach B" gets lost)
- **Lose coordination state** — miss that they're waiting on another agent, or that another agent is waiting on them
- **Repeat mistakes** — re-attempt approaches that already failed
- **Break consensus** — act contrary to team agreements they no longer remember

The existing `egg-contract` system is a precursor but is phase-scoped and pipeline-scoped. The new team model needs something more granular and agent-scoped.

## Proposed Design

### Agent Anchor File

Each running agent maintains a structured state file that serves as its post-compaction recovery point:

```
.egg-state/agent-anchors/<agent-id>.yaml
```

Contents (example):

```yaml
agent_id: coder-abc123
role: coder
team: issue-432
task: "Fix auth bypass in gateway/auth.py"
spawned_by: liaison-xyz789

status: in_progress
progress:
  - completed: "Identified root cause in token validation"
  - completed: "Fixed validate_token() to check expiry"
  - current: "Updating error handling for expired tokens"
  - pending: "Notify tester that fix is ready for coverage"

decisions:
  - with: tester-def456
    decided: "Use parametrized tests for token edge cases"
    timestamp: "2026-03-11T14:30:00Z"
  - with: mediator-ghi789
    decided: "Approach B (strict validation) over approach A (lenient)"
    timestamp: "2026-03-11T14:45:00Z"

waiting_on: []
blocked_by: []

files_modified:
  - gateway/auth.py
  - gateway/token_utils.py

key_context:
  - "Token validation was skipping expiry check when token had admin scope"
  - "Must maintain backward compatibility with v1 tokens"
```
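One way to keep anchors consistent across agents is to model the fields above as a typed record. The following is a minimal sketch, not a settled schema: the class names `AgentAnchor` and `Decision`, the helper `current_step`, and the `with_agent` attribute (needed because `with` is a Python keyword) are all hypothetical, and the field list simply mirrors the example file.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class Decision:
    # Maps to the `with` key in the anchor file; renamed here because
    # `with` is a reserved word in Python.
    with_agent: str
    decided: str
    timestamp: str


@dataclass
class AgentAnchor:
    agent_id: str
    role: str
    team: str
    task: str
    spawned_by: str
    status: str = "in_progress"
    # Each progress entry is a single-key mapping such as
    # {"completed": "..."} or {"current": "..."}, as in the example.
    progress: list = field(default_factory=list)
    decisions: list = field(default_factory=list)
    waiting_on: list = field(default_factory=list)
    blocked_by: list = field(default_factory=list)
    files_modified: list = field(default_factory=list)
    key_context: list = field(default_factory=list)

    def current_step(self) -> Optional[str]:
        """Return the text of the first `current` progress entry, if any."""
        for entry in self.progress:
            if "current" in entry:
                return entry["current"]
        return None

    def to_dict(self) -> dict:
        """Convert to a plain dict using the on-disk key names."""
        data = asdict(self)
        for decision in data["decisions"]:
            decision["with"] = decision.pop("with_agent")
        return data
```

The resulting dict can be handed to whichever serializer the format question settles on (YAML, JSON, or both).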

### Update Triggers

The anchor file is updated:

- When an agent completes a sub-task or reaches a milestone
- When a decision is made with another agent
- When the agent's status changes (waiting, blocked, etc.)
- Periodically, by the agent itself (self-checkpoint)
- When the cross-agent message bus summarizes a conversation (#1027)
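Because several of these triggers can fire close together, and compaction may happen mid-write, anchor updates probably need to be atomic. The sketch below shows one common approach (write to a temporary file, then rename over the target). It is an illustration under assumptions: `write_anchor` and `record_decision` are hypothetical names, and JSON is used only because the Python standard library has no YAML writer; the issue itself leaves the format open.

```python
import json
import os
import tempfile
from pathlib import Path


def write_anchor(anchor_dir: Path, agent_id: str, anchor: dict) -> Path:
    """Atomically replace <anchor_dir>/<agent_id>.json with new contents."""
    anchor_dir.mkdir(parents=True, exist_ok=True)
    target = anchor_dir / f"{agent_id}.json"
    # The temp file lives in the same directory so os.replace stays on
    # one filesystem, where the rename is atomic.
    fd, tmp_name = tempfile.mkstemp(dir=anchor_dir, prefix=f".{agent_id}.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(anchor, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the partial temp file; the old anchor stays intact.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


def record_decision(anchor: dict, with_agent: str, decided: str, timestamp: str) -> dict:
    """Return a copy of the anchor with one more decision appended."""
    updated = dict(anchor)
    updated["decisions"] = [
        *anchor.get("decisions", []),
        {"with": with_agent, "decided": decided, "timestamp": timestamp},
    ]
    return updated
```

A reader that opens the anchor during an update sees either the old file or the new one, never a half-written file.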

### Post-Compaction Injection

After context compaction, the anchor file is injected into the agent's context (similar to how `CLAUDE.md` is always loaded). This gives the agent enough state to continue coherently without re-reading its entire conversation history.
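As a rough illustration of the injection step, the anchor could be rendered into a short text block that is prepended to the compacted context. This is a sketch only: `build_injection` is a hypothetical name, the output wording is invented for the example, and how the text actually reaches the agent's context is left unspecified here.

```python
def build_injection(anchor: dict) -> str:
    """Render an anchor dict as a compact, human-readable recovery note."""
    lines = [
        "## Recovered state (from anchor file)",
        f"You are {anchor['agent_id']} ({anchor['role']}) on team {anchor['team']}.",
        f"Task: {anchor['task']}",
        f"Status: {anchor.get('status', 'unknown')}",
    ]
    # Progress entries are single-key mappings: completed/current/pending.
    for entry in anchor.get("progress", []):
        for state, text in entry.items():
            lines.append(f"- [{state}] {text}")
    for decision in anchor.get("decisions", []):
        lines.append(f"- Decided with {decision['with']}: {decision['decided']}")
    waiting = anchor.get("waiting_on", [])
    lines.append("Waiting on: " + (", ".join(waiting) if waiting else "nobody"))
    for note in anchor.get("key_context", []):
        lines.append(f"- Note: {note}")
    return "\n".join(lines)
```

Rendering to prose rather than injecting raw YAML is a design choice; raw injection would work too and avoids a second representation to keep in sync.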

### Team-Level Anchor

The mediator (or orchestrator, in automated flows) maintains a team-level anchor:

```
.egg-state/agent-anchors/team-<team-id>.yaml
```

This tracks:
- Which agents are active and their current status
- Team-level decisions and consensus state
- Cross-agent dependencies and handoff status
- Escalation history
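
Much of the team-level view could, in principle, be derived from the per-agent anchors. The sketch below assumes that approach; `build_team_anchor` and the output keys are hypothetical, and escalation history is left as an empty list because nothing in an agent anchor records it.

```python
def build_team_anchor(team_id: str, agent_anchors: list) -> dict:
    """Aggregate per-agent anchor dicts into a team-level summary."""
    members = [a for a in agent_anchors if a.get("team") == team_id]

    # Which agents are active and their current status.
    agents = {a["agent_id"]: a.get("status", "unknown") for a in members}

    # Cross-agent dependencies, one record per waiting_on edge.
    dependencies = [
        {"agent": a["agent_id"], "waiting_on": other}
        for a in members
        for other in a.get("waiting_on", [])
    ]

    # Both parties may record the same decision, so de-duplicate on
    # (timestamp, decided) and keep the first occurrence.
    seen = set()
    decisions = []
    for a in members:
        for d in a.get("decisions", []):
            key = (d["timestamp"], d["decided"])
            if key not in seen:
                seen.add(key)
                decisions.append({"decided": d["decided"], "timestamp": d["timestamp"]})
    decisions.sort(key=lambda d: d["timestamp"])

    return {
        "team": team_id,
        "agents": agents,
        "decisions": decisions,
        "dependencies": dependencies,
        "escalations": [],  # maintained by the mediator, not derivable here
    }
```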

## Key Design Questions

1. **Format**: YAML (human-readable) vs JSON (machine-parseable) vs both?
2. **Size budget**: Anchors must be small enough to inject post-compaction without consuming too much of the refreshed context window. What's the max size? (Proposal: 2 KB per agent anchor, 4 KB for team anchor)
3. **Update mechanism**: Should agents update their own anchors (self-report), or should the orchestrator/message bus update them (observed state)? Likely both.
4. **Conflict resolution**: If an agent's in-memory state diverges from its anchor (e.g., anchor says "waiting on tester" but the tester already responded before compaction), how is this reconciled?
5. **Anchor cleanup**: When should anchors be deleted? On agent termination? On team completion? Retained for checkpoint/audit?
6. **Gateway enforcement**: Should the gateway enforce that agents can only write their own anchor file?
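
To make questions 2 and 6 concrete, here is a minimal sketch of what a size check and a write-ownership check might look like. Everything here is an assumption for discussion: the function names are hypothetical, the budgets are the proposed 2 KB and 4 KB figures, size is measured on a compact JSON encoding (a YAML encoding would differ somewhat), and the path layout follows the `.egg-state/agent-anchors/<agent-id>.yaml` convention above.

```python
import json
from pathlib import PurePosixPath

AGENT_ANCHOR_BUDGET = 2 * 1024  # proposed per-agent limit, in bytes
TEAM_ANCHOR_BUDGET = 4 * 1024   # proposed team-anchor limit, in bytes

ANCHOR_DIR = PurePosixPath(".egg-state/agent-anchors")


def anchor_size(anchor: dict) -> int:
    """Size in bytes of the anchor when encoded as compact UTF-8 JSON."""
    return len(json.dumps(anchor, separators=(",", ":")).encode("utf-8"))


def within_budget(anchor: dict, budget: int) -> bool:
    return anchor_size(anchor) <= budget


def may_write(agent_id: str, requested_path: str) -> bool:
    """Allow a write only to the requesting agent's own anchor file."""
    if not agent_id or "/" in agent_id:
        return False
    path = PurePosixPath(requested_path)
    # Reject parent-directory traversal outright instead of resolving it.
    if ".." in path.parts:
        return False
    return path == ANCHOR_DIR / f"{agent_id}.yaml"
```

A check like `within_budget` could run at write time, so an oversized anchor is rejected or trimmed before compaction rather than discovered at injection.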

## Dependencies

- #1027 — Cross-agent communication (message summaries feed into anchors)
- #1028 — Conversational coordinator (liaison maintains team-level context)
- #1030 — Agent roster and access controls (anchor schema varies by role)

## Success Criteria

- Agents maintain structured state files that are updated as work progresses
- After context compaction, agents can resume coherently using their anchor file
- Team-level anchors provide the mediator with a consistent view of team state
- Anchor files are small enough to inject without significant context cost
- No duplicate work or contradictory decisions after compaction events
- Anchor contents are included in checkpoints for audit/debugging
