Enforce phase file restrictions via readonly mounts and commit-time validation by james-in-a-box[bot] · Pull Request #800 · jwbron/egg

james-in-a-box · 2026-02-17T02:31:51Z

Enforce phase file restrictions via readonly mounts and commit-time validation

Closes #644.

Agents can currently bypass phase file restrictions by modifying protected files
locally without pushing, and can switch branches freely — breaking the invariant
that each agent works on exactly one branch. This PR adds defense-in-depth
enforcement at three layers: readonly filesystem mounts, commit-time gateway
validation, and the existing push-time validation.

Branch lock: Pipeline sessions now store an assigned_branch on the gateway
session. When set, git checkout and git switch branch-changing operations are
blocked with a clear error message. File-targeting checkout (git checkout -- file)
remains allowed. This ensures deterministic post-agent commit and push.

Readonly mounts: During the implement phase, .egg-state/contracts/,
.egg-state/drafts/, .egg-state/pipelines/, and .egg-state/reviews/ are
bind-mounted as readonly into the agent container. Agents get an OS-level
"Read-only file system" error immediately, before spending tokens on git
operations. Marker files (.egg-readonly) in each directory explain the
restriction and suggest alternatives (e.g., egg-contract CLI).

Commit-time validation: The gateway now intercepts git commit and validates
staged files against PhaseFileRestriction.blocked_patterns. Violations are
rejected with actionable errors listing the blocked files and how to unstage them.
This catches fine-grained restrictions that readonly mounts cannot enforce.

Post-agent auto-commit: A new post_agent_commit.py module automatically
commits and pushes any uncommitted allowed changes after agent completion. Blocked
files are restored via git checkout, ensuring no phase-violating changes persist
in the worktree. This prevents silent work loss.

Agent instructions: Updated sandbox rules to inform agents about branch
locking, phase restrictions, and auto-commit behavior so they don't waste tokens
debugging expected restrictions.

Issue: #644

Test plan:

pytest gateway/tests/test_branch_switch.py — branch lock enforcement
pytest gateway/tests/test_assigned_branch.py — assigned branch session field
pytest gateway/tests/test_agent_restrictions_enforce.py — enforce mode restrictions
pytest gateway/tests/test_agent_restrictions_patterns.py — restriction pattern matching
pytest gateway/tests/test_phase_filter_restrictions.py — phase filter integration
pytest gateway/tests/test_pipeline_enforcement.py — end-to-end pipeline enforcement
pytest gateway/tests/test_post_agent_commit.py — auto-commit logic
pytest gateway/tests/test_post_agent_commit_extended.py — extended auto-commit scenarios
pytest tests/shared/egg_container/test_phase_mounts.py — readonly mount generation
pytest tests/shared/egg_container/test_marker_files.py — marker file creation

Authored-by: egg

…ts, auto-commit Implements four layers of pipeline container isolation: 1. Branch lock: Pipeline sessions record their assigned worktree branch in the Session dataclass. The gateway blocks checkout/switch operations that would change branches, while still allowing file-level operations like `git checkout -- file.txt`. 2. Commit-time staged-file validation: When agents commit during a pipeline session, the gateway validates staged files against phase restrictions before allowing the commit. This catches violations early rather than only at push time. 3. Phase-based readonly mounts: During the implement phase, .egg-state/ subdirectories (drafts/, contracts/, reviews/) are mounted readonly in the container to prevent direct filesystem modifications to plan/contract artifacts. 4. Post-agent auto-commit: When a pipeline container exits, any uncommitted work is automatically committed as a WIP to prevent work loss from timeouts, crashes, or agents forgetting to commit. Also adds warn-only agent-role file restrictions at push time (check_agent_restrictions) and updates agent instructions to document the new branch lock behavior. Issue: #644 Authored-by: egg

… mounts, auto-commit Update documentation across gateway README, shared README, architecture docs, SDLC pipeline guide, and STRUCTURE.md to cover the new enforcement layers added in issue #644: - Branch lock for pipeline sessions (gateway blocks branch switching) - Commit-time validation of staged files against phase restrictions - Phase-based readonly mounts for .egg-state/ directories during implement - Post-agent auto-commit for uncommitted worktree changes - Defense-in-depth enforcement model (OS → commit → push) Authored-by: egg

…ts, commit validation)

Fix three acceptance criteria gaps identified in contract review: 1. Task 4-1 (Critical): post_agent_commit.py now uses check_phase_file_restrictions to filter files, restores blocked files via git checkout, stages only allowed files (no git add -A), and pushes via gateway API when session credentials are available. 2. Task 3-3 (Moderate): ensure_egg_state_dirs now places .egg-readonly marker files in readonly directories during implement phase, explaining the restriction and current phase to agents. 3. Task 2-1 (Minor): Agent-role restrictions now support EGG_AGENT_RESTRICTIONS_ENFORCE env var (default false). When true, violations return 403 instead of just logging warnings. Additionally fixes Task 3-1: adds 'pipelines' to _IMPLEMENT_READONLY_DIRS to match the 4 blocked patterns in phase-permissions.json. Authored-by: egg

Update documentation across gateway README, architecture docs, SDLC pipeline guide, sandbox agent rules, and shared library README to accurately describe the pipeline isolation features: - Post-agent auto-commit: document phase filtering via check_phase_file_restrictions(), selective file staging, and push via gateway API (not git add -A) - Readonly mounts: add missing pipelines/ directory to tables, document .egg-readonly marker files - Agent restrictions: document EGG_AGENT_RESTRICTIONS_ENFORCE flag (default warn-only, set to true to block) - Sandbox environment: add readonly directory guidance and post-agent auto-commit explanation for agents - Add defense-in-depth inline comment in post_agent_commit.py Authored-by: egg

Tests cover agent_restrictions pattern matching (all 14 roles, path normalization, traversal prevention, blocked-over-allowed precedence), phase_filter restrictions (per-phase file rules, unknown phase fail-closed, operation filtering, role-based blocking), post_agent_commit extended cases (checkout failure handling, push-via-gateway integration, commit message format), and marker file content validation (.egg-readonly format, phase gating, multi-repo support). Authored-by: egg

- Apply ruff formatting to 24 files - Fix 82 mypy type errors: add type annotations, casts, type: ignore comments for import-not-found, fix incompatible types and missing type parameters across gateway/, shared/, sandbox/, and config/ - Fix 8 test failures in test_start_pipeline.py by mocking get_pipeline_state_lock and configuring mock_store.load_pipeline to return the actual pipeline object Authored-by: egg

Resolve three merge conflicts: - gateway/gateway.py: Keep both is_branch_switch and is_branch_switching_operation imports (both functions exist and are used) - gateway/session_manager.py: Preserve assigned_branch locking for pipeline sessions (PR feature) - shared/egg_contracts/checkpoint_cli.py: Include --repo filter from main alongside --limit

egg-reviewer

No agent-mode design concerns. This PR is well-aligned with the principles in docs/guides/agent-mode-design.md.

The design consistently follows the right pattern: sandbox enforcement + informational agent instructions + discoverable signals. Specifically:

Constraints are sandbox-enforced, not prompt-enforced. Readonly bind mounts produce kernel-level EROFS errors. Branch lock and commit-time validation are gateway-enforced. The agent cannot bypass any of these regardless of prompt content. This is the correct approach per guideline #5 ("let the sandbox be the constraint").
Agent instructions describe, they don't prescribe. The additions to sandbox/.claude/rules/environment.md and mission.md are informational ("branch switching is blocked", "directories may be readonly") rather than procedural ("you must not switch branches"). This aligns with guideline #4 (prefer what over how).
Marker files are a discoverable signal. The .egg-readonly files let agents discover why a write failed and what alternatives exist, rather than requiring them to memorize restrictions from prompts. This supports guideline #5 (let the agent explore and use judgment).
No pre-fetching, no structured output, no post-processing. The auto-commit module is transparent to the agent — it runs at session cleanup and requires no agent interaction. No new output format requirements are imposed.
Fail-open design at the right layers. Commit-time validation fails open (push-time is authoritative). Agent-role restrictions default to warn-only. This is a mature rollout approach that avoids over-constraining agents during initial deployment.

— Authored by egg

egg-reviewer

Review: Enforce phase file restrictions via readonly mounts and commit-time validation

This PR implements a four-layer defense-in-depth system for phase file restrictions in the SDLC pipeline. The architecture is sound and the implementation is thorough. I have one blocking concern and several non-blocking suggestions.

Summary

The PR adds:

L0 (Branch lock): is_branch_switch() heuristic in git_client.py, wired into git_execute() via Session.assigned_branch
L1 (Readonly mounts): ensure_egg_state_dirs() + phase_readonly_mounts() in shared/egg_container/__init__.py, wired into container_spawner.py
L2 (Commit-time validation): Staged-file check in git_execute() using check_phase_file_restrictions()
L4 (Post-agent auto-commit): New gateway/post_agent_commit.py module with auto_commit_worktree()
Agent-role restriction wiring: check_agent_restrictions() now called in git_push() with EGG_AGENT_RESTRICTIONS_ENFORCE flag
Extensive type hint improvements, documentation updates, and test coverage across ~50 files

Blocking

1. is_branch_switch() has a dead-code block for flags already handled (git_client.py)

In is_branch_switch(), the flags -b, -B, and --orphan are in branch_creating_flags and trigger an early return True on line ~1246. However, the "flags that consume the next arg" block on line ~1253 also checks for -b, -B, --orphan — but this code is unreachable because the earlier check already returned. This is confusing and suggests the author intended different behavior (perhaps the early return should only mark a flag rather than return, so the value-consuming logic runs). The current behavior is correct by accident (early return is the right answer), but the dead code is misleading. Remove the duplicate check on lines ~1253-1255 to avoid confusion.

# This block is unreachable — -b/-B/--orphan already returned True above
if arg in ("-b", "-B", "--orphan"):
    seen_flag_with_value = True
    continue

Non-blocking

2. _push_via_gateway() passes repo_path as the worktree path (post_agent_commit.py)

The push payload sends "repo_path": worktree_path to /api/v1/git/push. The gateway's git_push() handler uses repo_path to locate the repository. This should work since the worktree is a valid git working directory, but it's worth confirming that the push handler correctly resolves remotes and branches from a worktree path (not the main .git dir). If git_push() ever validates that repo_path matches a known repo root, this would break. Consider adding a comment clarifying this is intentional.

3. session.session_token is passed to auto_commit_worktree() (session_manager.py)

The Session dataclass stores the raw session_token for push-via-gateway. The token is passed from _capture_and_cleanup_session() to auto_commit_worktree() and then to _push_via_gateway() which sends it as a Bearer token over HTTP. This is fine since it's localhost traffic, but worth noting that the session token is the original raw token (not the hash). Ensure the session isn't deleted from the session manager before the auto-commit push completes — the current ordering (auto-commit before checkpoint capture, which is before session deletion) looks correct.

4. Commit-time validation fail-open logging is at debug level (gateway.py)

When the commit-time staged-file check fails (exception in the try/except), it logs at debug level. Since this is a security-relevant control silently failing open, consider logging at warning level instead, similar to how the push-time check handles errors. A silent failure here could mask configuration issues.

# Currently:
logger.debug("Staged-file check skipped due to error", ...)
# Suggest:
logger.warning("Staged-file check skipped due to error", ...)

5. _parse_changed_files() doesn't handle filenames with spaces correctly (post_agent_commit.py)

The porcelain format for renamed files uses -> as a separator. If a filename itself contains ->, the split would be incorrect. This is an extreme edge case but worth documenting. Additionally, path_part.strip() on the result could trim intentional trailing spaces in filenames (git porcelain uses quoting for such paths). For practical purposes this is fine, but a comment noting the limitation would help.

6. is_branch_switch() treats git checkout HEAD -- file.txt as a branch switch (git_client.py)

The test test_checkout_head_double_dash_file confirms that checkout HEAD -- file.txt is classified as a branch switch because HEAD is a positional arg before --. In practice, this is a file-restore operation (restore file.txt from HEAD). The branch lock only fires for pipeline sessions, and agents would typically use git checkout -- file.txt instead, so the practical impact is low. But this is a known false positive that could cause confusion. The architecture plan's "conservative alternative" heuristic acknowledged this trade-off, so the current behavior is intentional. Consider adding a comment in the code noting this specific case.

7. Readonly mounts only implemented for implement phase (egg_container/init.py)

phase_readonly_mounts() returns empty lists for all phases except implement. The architecture plan mentions refine/plan phases should have readonly source code. This is deferred (TODO comments exist), but worth tracking as a follow-up since it means refine/plan agents can still modify source code locally.

8. ensure_egg_state_dirs() uses os.makedirs + os.chown without atomicity (egg_container/init.py)

Between makedirs and chown, there's a brief window where directories exist with the wrong ownership. In practice this is fine since it runs before the container starts, but it's worth noting the ordering dependency.

9. Type hint improvements are mechanical and correct

The extensive cast() additions in config/repo_config.py, shared/egg_contracts/, shared/egg_config/, and sandbox/egg_lib/ are correct and improve type safety. The dict[str, Any] annotations, list[str] casts, and Path | None type hints are all appropriate. format_timestamp() fix introducing parsed variable is correct. transcript_extractor.py adding or datetime.now(UTC) fallback is a reasonable safety net.

Testing

Test coverage is comprehensive:

test_branch_switch.py — 18 cases covering all heuristic branches
test_assigned_branch.py — serialization roundtrip, pipeline vs non-pipeline
test_post_agent_commit.py — clean/dirty worktree, file parsing, phase filtering, push handling, errors
test_post_agent_commit_extended.py — checkout failure, multiple blocked files, HTTP push
test_agent_restrictions_enforce.py — warn/enforce modes, flag values, unknown/no role
test_agent_restrictions_patterns.py — 14 role permission matrices, path traversal, pattern matching
test_phase_filter_restrictions.py — per-phase restrictions, unknown phase fail-closed, path escape
test_phase_mounts.py — mount spec generation, missing dirs, phase filtering
test_marker_files.py — content validation, ownership, idempotency

One gap: no integration test for EROFS (readonly mount at OS level). The integrator output notes this was acknowledged as CI-only. Not a blocker but should be tracked.

Architecture

The defense-in-depth layering is well-designed:

L1 (mounts) prevents modification at OS level — zero token waste
L2 (commit-time) catches what mounts can't express (filename patterns) — low token waste
L3 (push-time, existing) remains the authoritative gate — fail-closed
L4 (auto-commit) preserves work on exit — fail-safe

The fail-open/fail-closed decisions are appropriate: commit-time fails open (push-time catches it), push-time fails closed, branch lock fails closed, auto-commit errors don't block cleanup.

Verdict

The blocking issue (#1) is a code clarity problem in is_branch_switch() that should be cleaned up to avoid future misunderstanding. The remaining items are suggestions and observations. Overall this is a solid, well-tested implementation of a critical security feature.

— Authored by egg

james-in-a-box

Contract Verification Report for Issue #644

Comprehensive review of all 12 tasks across 5 phases. I ran all new test files and verified implementations against acceptance criteria.

Verdict: Request Changes

One blocking issue found (broken test file), plus two minor AC gaps.

Blocking Issue

gateway/tests/test_agent_restrictions_patterns.py fails to collect — references AgentRole.REVIEWER_UNIFIED which does not exist in gateway/agent_restrictions.py. This enum value appears 5 times (lines 329, 344, 359, 374, 389). The AgentRole class only defines: CODER, TESTER, DOCUMENTER, INTEGRATOR, ARCHITECT, TASK_PLANNER, RISK_ANALYST, REFINER, REVIEWER_CODE, REVIEWER_CONTRACT, REVIEWER_AGENT_DESIGN, REVIEWER_REFINE, REVIEWER_PLAN. No REVIEWER_UNIFIED exists anywhere in the codebase.

This is a new file introduced by this PR. The collection error prevents all tests in this file from running, which means the "comprehensive unit tests" for agent restriction pattern matching claimed in the integrator output are not actually executing.

Fix: Either add REVIEWER_UNIFIED to AgentRole in gateway/agent_restrictions.py, or replace the 5 references with an existing role (e.g., AgentRole.REVIEWER_CODE).

Task-by-Task Verification

Task	Status	Notes
1-1 assigned_branch field	PASS	Field at `session_manager.py:241`, set during register_session (lines 496-500), persisted in `to_dict_for_persistence()` (lines 281-282), restored in `from_persistence()` (lines 305-306). Tests in `test_assigned_branch.py`.
1-2 is_branch_switch() + enforcement	PASS	Helper at `git_client.py:1213-1275`, wired into `git_execute()` at `gateway.py:1107-1128`. Conservative heuristic: blocks `-b`/`-B`/`--orphan`/`switch`, allows `--` separator and `-p`. Only enforced when `assigned_branch` is set. Tests in `test_branch_switch.py` (13 test classes).
1-3 Commit-time staged-file validation	PASS	Validation at `gateway.py:1165-1215`, 10-second timeout (line 1180), fail-open on exception (lines 1208-1215), non-pipeline commits unaffected (line 1171). Tests in `test_pipeline_enforcement.py:TestCommitTimePhaseValidation`.
1-4 Agent documentation updates	PASS (minor gap)	Branch lock section added at `mission.md:74` and `environment.md:45-46`. Phase restriction and auto-commit guidance present in `environment.md:68-70`. However, `git checkout -b` references remain at `mission.md:52,72` and `environment.md:44` — these are for non-pipeline mode where branch switching is legitimate, so the AC is arguably met for pipeline agents.
2-1 Agent restrictions wiring	PASS	`check_agent_restrictions()` called in `git_push()` at `gateway.py:688-718`. `EGG_AGENT_RESTRICTIONS_ENFORCE` env var at lines 691-695, accepts "true"/"1"/"yes". Default warn-only. Tests in `test_agent_restrictions_enforce.py` (7 test classes).
3-1 phase_readonly_mounts()	PASS	Function at `egg_container/__init__.py:182-220`. `_IMPLEMENT_READONLY_DIRS = ("drafts", "contracts", "pipelines", "reviews")` — exactly 4 dirs (line 132). Empty for unknown/None phases. Tests in `test_phase_mounts.py`.
3-2 ensure_egg_state_dirs()	PASS	Function at `egg_container/__init__.py:135-180`. Creates dirs idempotently. Called before mounts at `container_spawner.py:289`. `EGG_PHASE` set at `container_spawner.py:355`. Tests in `test_phase_mounts.py`.
3-3 Readonly mounts + marker files	PASS	Mounts added after repo and git shadow mounts at `container_spawner.py:291`. Marker files generated at `egg_container/__init__.py:168-179` with phase, directory, explanation content. Tests in `test_marker_files.py`. No Docker EROFS integration test (acknowledged as CI-only).
4-1 post_agent_commit.py	PASS	Reuses `check_phase_file_restrictions` (line 179, not reimplemented). Blocked files restored via `git checkout --` (lines 196-206). Selective staging with `git add -- *allowed_files` (line 220). Push via gateway API (`_push_via_gateway()` lines 68-112). Egg author (line 246). Audit logging. No-op when clean. Errors return None. Tests: 32 + 15 = 47 tests.
4-2 Auto-commit cleanup integration	PASS (minor gap)	Correct ordering: auto-commit (lines 84-121) → checkpoint (lines 123-156) in `session_manager.py`. Session alive during auto-commit. Failure doesn't block cleanup. Gap: return value from `auto_commit_worktree()` is NOT captured (line 102), so auto-commit SHA does NOT appear in checkpoint metadata. The AC explicitly requires "auto-commit SHA in checkpoint metadata."
5-1 Tests pass	FAIL	220/220 tests pass across 9 test files. However, `test_agent_restrictions_patterns.py` fails at collection with `AttributeError: AgentRole has no attribute REVIEWER_UNIFIED`. This is a new file from this PR.
5-2 Linter passes	PASS	ruff check, ruff format, mypy all pass per implement-results.json.

Test Results (Local Run)

9 test files: 220 passed in 1.05s
1 test file: COLLECTION ERROR (test_agent_restrictions_patterns.py)

Summary

11 of 12 tasks fully meet acceptance criteria
1 blocking issue: New test file test_agent_restrictions_patterns.py cannot collect due to non-existent AgentRole.REVIEWER_UNIFIED
2 minor gaps (non-blocking):
- Task 4-2: auto-commit SHA not captured in checkpoint metadata
- Task 1-4: git checkout -b references remain in docs (for non-pipeline mode — arguably intentional)

— Authored by egg

james-in-a-box · 2026-02-17T03:01:12Z

Review feedback addressed (`e7926df`)

Addressed the non-blocking finding from the re-review:

auto_commit_sha propagated to checkpoint metadata: capture_session_end_checkpoint() in checkpoint_handler.py now sets commit_sha=session.auto_commit_sha on the CheckpointV2 record. Session-end checkpoints from auto-commit sessions are now queryable by commit SHA via the checkpoint index.

Overlapping branch-switch checks: Acknowledged as a pre-existing maintenance item for a separate follow-up. The current layering is functionally correct.

All 1452 gateway tests pass. No regressions.

— Authored by egg

egg-reviewer

No agent-mode design concerns. The delta since 2002e84 is a single commit (e7926df1) that propagates auto_commit_sha to session-end checkpoint metadata — addressing the gap noted in my previous review. The remaining changes are cosmetic line-wrapping. No new agent-facing patterns, pre-fetching, structured output requirements, or prompt-level constraints introduced. Original approval stands.

— Authored by egg

james-in-a-box

Contract Re-Verification Report for Issue #644

Incremental review of changes since commit 2002e84b (previous verification). One new commit: e7926df ("Propagate auto_commit_sha to session-end checkpoint metadata").

Verdict: Approve

The single outstanding gap from the previous review has been closed. No regressions found.

Delta Review (1 commit since last review)

Commit e7926df addresses the non-blocking observation from the previous code review:

Previous Finding	Resolution	Status
`auto_commit_sha` not propagated to checkpoint metadata — SHA was stored on Session but never written to CheckpointV2 record	`checkpoint_handler.py:605` now passes `commit_sha=session.auto_commit_sha` to `CheckpointV2` constructor in `capture_session_end_checkpoint()`	Fixed

Verification details:

CheckpointV2.commit_sha is an existing optional field (str | None, default None, with SHA regex validation) — no schema change needed
Session.auto_commit_sha is populated by auto_commit_worktree() return value before checkpoint capture (session_manager.py:102-112)
Persistence roundtrip confirmed: to_dict_for_persistence() writes and from_persistence() reads the field
Sessions without auto-commits pass None, which is the field's default — no regression for non-pipeline sessions

The other two hunks in the diff are formatting-only changes (line re-wrapping by ruff) with no functional impact.

Regression Check

No previously verified criteria are affected:

The change adds data to an existing optional field on an existing checkpoint type
No control flow changes, no new error paths
Non-pipeline sessions and session-end checkpoints without auto-commits continue to pass None

Task Verification Summary (All Re-confirmed)

Task	AC Met?	Notes
1-1 assigned_branch field	Yes	Unchanged since last review
1-2 is_branch_switch()	Yes	Unchanged since last review
1-3 Commit-time validation	Yes	Unchanged since last review
1-4 Agent doc updates	Yes	Unchanged since last review
2-1 Agent restrictions wiring	Yes	Unchanged since last review
3-1 phase_readonly_mounts()	Yes	Unchanged since last review
3-2 ensure_egg_state_dirs()	Yes	Unchanged since last review
3-3 Readonly mounts + markers	Yes	Unchanged since last review
4-1 post_agent_commit.py	Yes	Unchanged since last review
4-2 Auto-commit integration	Yes	Previously gap — auto_commit_sha now propagated to checkpoint metadata, closing the loop
5-1 Tests pass	Yes	Unchanged since last review
5-2 Linter passes	Yes	Formatting-only changes are ruff-compliant

All 12 tasks across 5 contract phases have their acceptance criteria fully met. No contract violations found.

— Authored by egg

egg-reviewer

Re-review: Propagate auto_commit_sha to checkpoint metadata (commit `e7926df`)

Reviewed the delta since 2002e84b (one commit: e7926df1). The change is small — only checkpoint_handler.py modified, with one functional line and two formatting adjustments.

Previous Issues — All Verified Fixed

All blocking and non-blocking items from the previous two reviews remain properly addressed in the current HEAD. No regressions.

New Issue

1. "unknown" SHA from auto_commit_worktree() will break checkpoint creation (non-blocking)

The new line commit_sha=session.auto_commit_sha at checkpoint_handler.py:605 propagates the auto-commit SHA into CheckpointV2. However, auto_commit_worktree() in post_agent_commit.py:270 has:

sha = rev.stdout.strip() if rev.returncode == 0 else "unknown"

If git rev-parse HEAD fails after a successful commit, sha is set to "unknown", which passes the truthy check at session_manager.py:111 and gets stored on session.auto_commit_sha. When this reaches CheckpointV2(commit_sha="unknown"), Pydantic's field constraint pattern=r"^[a-f0-9]{7,40}$" rejects it with a ValidationError.

This ValidationError propagates up to the try/except Exception at checkpoint_handler.py:635, causing the entire session-end checkpoint to be lost — not just the SHA field. The checkpoint is the primary audit artifact for agent sessions.

The "unknown" fallback pre-dates this PR, but before this commit it was harmless (only logged and session-persisted, never validated against the hex pattern). This commit creates the path where it causes damage.

Fix options (either one):

In post_agent_commit.py:270: return None instead of "unknown" when rev-parse fails, so the checkpoint is created without a commit_sha rather than failing entirely.
In checkpoint_handler.py:605: guard with a pattern check before passing (less clean).

The rev-parse HEAD failure after a successful commit is unlikely but not impossible (e.g., filesystem issue, corrupted .git). The impact when it does happen is disproportionate — losing the entire checkpoint for a minor metadata field.

2. No test coverage for the new code path (non-blocking)

The _make_test_session() helper in test_checkpoint_handler.py does not set auto_commit_sha. No test verifies that capture_session_end_checkpoint correctly propagates session.auto_commit_sha to CheckpointV2.commit_sha. A test with session.auto_commit_sha = "abc1234" confirming the checkpoint's commit_sha field would close this gap.

Summary

The change is conceptually correct — it closes the gap identified in the previous review where auto-commit SHAs were not appearing in checkpoint metadata. The "unknown" fallback issue is low-probability but high-impact when triggered, and straightforward to fix. Both items are non-blocking.

— Authored by egg

When git rev-parse HEAD fails after a successful auto-commit, auto_commit_worktree() returned "unknown" which would fail the CheckpointV2 Pydantic pattern validation (^[a-f0-9]{7,40}$), causing the entire session-end checkpoint to be lost. Return None instead so the checkpoint is created without a commit_sha rather than failing entirely. Add test coverage for auto_commit_sha propagation to checkpoint metadata.

james-in-a-box · 2026-02-17T03:13:55Z

Review feedback addressed (`4815a65`)

Addressed both findings from the re-review of commit e7926df:

Fix: `"unknown"` SHA breaks checkpoint creation

post_agent_commit.py:270: Changed the rev-parse HEAD failure fallback from "unknown" to None. Previously, the string "unknown" would pass the truthy check in session_manager.py:111 and get stored on session.auto_commit_sha, then fail CheckpointV2's Pydantic pattern validation (^[a-f0-9]{7,40}$), causing the entire session-end checkpoint to be lost. Now None is returned, which means:

The truthy check in session_manager.py:111 prevents it from being stored on the session
CheckpointV2.commit_sha receives None (the field default), so the checkpoint is created successfully without a commit SHA

Updated the existing test test_rev_parse_failure_returns_unknown → test_rev_parse_failure_returns_none to assert None.

Test coverage for auto_commit_sha propagation

Added TestAutoCommitShaInCheckpoint with two tests that exercise CheckpointHandler.capture_session_end_checkpoint directly (not mocked):

test_auto_commit_sha_propagated_to_checkpoint — verifies session.auto_commit_sha = "abc1234" results in checkpoint.commit_sha == "abc1234"
test_none_auto_commit_sha_leaves_checkpoint_commit_sha_none — verifies default None propagates correctly

All 1454 gateway tests pass. No regressions.

— Authored by egg

egg-reviewer

No agent-mode design concerns in the update. The delta since e7926df is a single commit (4815a65) that fixes the "unknown" SHA fallback to return None instead — preventing a CheckpointV2 Pydantic validation failure that could silently lose session-end checkpoints. The commit also adds test coverage for auto_commit_sha propagation. None of these changes introduce new agent-facing patterns, pre-fetching, structured output requirements, or prompt-level constraints. Original approval stands.

— Authored by egg

egg-reviewer

Re-review: Fix unknown SHA fallback (commit `4815a65`)

Reviewed the delta since e7926df1 (one commit: 4815a657 — Fix unknown SHA fallback that breaks checkpoint creation). Both findings from the previous review are properly addressed.

Previous Issues — Verified Fixed

1. "unknown" SHA breaks checkpoint creation (post_agent_commit.py:270) — Fixed correctly. The rev-parse HEAD failure fallback changed from "unknown" to None. This prevents the invalid string from reaching CheckpointV2's Pydantic pattern validator (^[a-f0-9]{7,40}$), which would have caused a ValidationError and lost the entire session-end checkpoint.

The data flow is correct:

auto_commit_worktree() returns None on rev-parse failure (line 270)
session_manager.py:111 guards with if auto_commit_sha: — None is falsy, so session.auto_commit_sha stays at its default None
checkpoint_handler.py:605 passes session.auto_commit_sha (= None) to CheckpointV2(commit_sha=None)
CheckpointV2.validate_sha() returns None for None input — checkpoint created successfully without a SHA

2. No test coverage for auto_commit_sha propagation (test_checkpoint_handler.py) — Fixed correctly. Two tests added in TestAutoCommitShaInCheckpoint:

test_auto_commit_sha_propagated_to_checkpoint: sets session.auto_commit_sha = "abc1234", verifies checkpoint.commit_sha == "abc1234"
test_none_auto_commit_sha_leaves_checkpoint_commit_sha_none: verifies default None propagates correctly

The existing rev-parse failure test was updated: test_rev_parse_failure_returns_unknown → test_rev_parse_failure_returns_none, asserting None instead of "unknown".

Observations

No new issues found. The change is minimal and targeted — one line of production code (the fallback value), one renamed/updated test, and two new test cases. No regressions introduced.

One minor note: when rev-parse HEAD fails after a successful commit, the log message at line 272 will contain commit_sha=None. This is fine — the log still captures the event with all other metadata (worktree path, container ID, role, files), and the SHA being None in the log correctly reflects that it couldn't be determined. No action needed.

Verdict

All feedback from the previous review has been properly addressed. No blocking or non-blocking issues remain.

— Authored by egg

james-in-a-box

Contract Verification — Incremental Re-review

Previous review commit: e7926df17fe007a4f86bcec0d7f4ce9d57e32b98
Current HEAD: 4815a657bb365cda89d7fd66a2f04c5bfc621814

Delta Since Last Review

One commit (4815a65): "Fix unknown SHA fallback that breaks checkpoint creation"

Files changed:

gateway/post_agent_commit.py — Changed "unknown" to None for rev-parse failure fallback
gateway/tests/test_checkpoint_handler.py — Added TestAutoCommitShaInCheckpoint test class (2 tests)
gateway/tests/test_post_agent_commit.py — Updated test to expect None instead of "unknown"

Analysis

This is a correctness fix. CheckpointV2.commit_sha has a Pydantic pattern validator (^[a-f0-9]{7,40}$) that rejects non-hex strings. The old fallback "unknown" would fail this validation, causing the entire session-end checkpoint to be lost. Returning None is correct since the field is Optional[str].

The fix strengthens task-4-2 compliance ("auto-commit SHA in checkpoint metadata, failure does not prevent cleanup") — previously, a rev-parse failure would cascade into checkpoint creation failure, violating the "failure does not prevent cleanup" criterion.

Full Contract Verification Status

All tasks verified against current codebase:

Task	Description	Status
task-1-1	Session `assigned_branch` field	VERIFIED — Field exists on `Session` (line 243), set for pipeline sessions (line 506), persisted via `to_dict_for_persistence` (line 284-285), restored via `from_dict` (line 311)
task-1-2	`is_branch_switch()` + branch lock	VERIFIED — `is_branch_switch()` and `is_branch_switching_operation()` in `git_client.py`, wired into `gateway.py` (lines 1107, 1144), conservative heuristic for checkout vs file restore, tests in `test_branch_switch.py` and `test_git_client.py`
task-1-3	Commit-time staged-file validation	VERIFIED — Staged file check in `gateway.py` (lines 1165-1200) using `check_phase_file_restrictions`, 10s timeout with fail-open, non-pipeline commits unaffected, tests in `test_pipeline_enforcement.py`
task-1-4	Agent instructions updated	VERIFIED — `mission.md` has branch lock guidance (line 74), `environment.md` has readonly directory guidance (line 68), auto-commit explanation (line 70), `git checkout -b` retained only for initial branch creation (not switching)
task-2-1	Agent-role restrictions	VERIFIED — `check_agent_restrictions()` called in `gateway.py` (line 689) when session has agent_role, `EGG_AGENT_RESTRICTIONS_ENFORCE` flag (lines 686-694, default false/warn-only), tests in `test_agent_restrictions_enforce.py` covering warn, enforce, unknown role, no-role cases
task-3-1	`phase_readonly_mounts()`	VERIFIED — Returns 4 readonly `MountSpec` for implement phase (`_IMPLEMENT_READONLY_DIRS` includes drafts, contracts, pipelines, reviews), empty for non-implement/None, skips missing dirs, tests in `test_phase_mounts.py`
task-3-2	`ensure_egg_state_dirs()` + `EGG_PHASE`	VERIFIED — Idempotent dir creation using shared `_IMPLEMENT_READONLY_DIRS` constant, called before mounts in `container_spawner.py` (line 290), `EGG_PHASE` set (line 355), tests in `test_marker_files.py`
task-3-3	Readonly mount wiring + marker files	VERIFIED — `phase_readonly_mounts()` called after repo mounts (line 291), `.egg-readonly` marker files written with phase and restriction explanation (lines 167-179)
task-4-1	`auto_commit_worktree()`	VERIFIED — Reuses `check_phase_file_restrictions` (line 185-193), blocked files restored via git checkout (line 215-226), commits with egg author (line 256-257), pushes via gateway API (lines 286-301), audit logging, no-op when clean, errors caught, tests in `test_post_agent_commit.py` and `test_post_agent_commit_extended.py`
task-4-2	Auto-commit integration ordering	VERIFIED — Auto-commit runs before checkpoint in `_capture_and_cleanup_session` (lines 84-123 then 125+), session alive during auto-commit, `auto_commit_sha` propagated to checkpoint (line 605 in `checkpoint_handler.py`), failure caught without blocking cleanup (lines 118-123), new tests verify SHA propagation
task-5-1	Tests pass	VERIFIED — 5385 passed, 83 skipped, 4 warnings (implement-results.json)
task-5-2	Lint clean	VERIFIED — ruff check passed, ruff format clean (378 files), mypy clean (118 files)

New Commit Verification

The new commit (4815a65) does not regress any previously verified criteria. It fixes a bug that would have caused task-4-2 criterion violation when git rev-parse HEAD fails after auto-commit. Tests are properly updated.

Verdict

All contract tasks and acceptance criteria are verified. No regressions from the new commit.

— Authored by egg

james-in-a-box · 2026-02-17T03:21:14Z

egg is addressing review feedback...

james-in-a-box · 2026-02-17T03:23:40Z

egg feedback addressed. View run logs

18 previous review(s) hidden.

Update the 'What's Enforced' table in README.md to reflect the defense-in-depth enforcement added in PR #800: - Readonly filesystem mounts on .egg-state/ dirs during implement phase - Commit-time validation for staged restricted files - Push-time validation (existing, now framed as one of three layers) Triggered by: #800 Authored-by: egg

…er] (#804) * docs: Update enforcement table for phase file restrictions [doc-updater] Update the 'What's Enforced' table in README.md to reflect the defense-in-depth enforcement added in PR #800: - Readonly filesystem mounts on .egg-state/ dirs during implement phase - Commit-time validation for staged restricted files - Push-time validation (existing, now framed as one of three layers) Triggered by: #800 Authored-by: egg * docs: Use 'subdirs' for precision per review feedback --------- Co-authored-by: jwbron <8340608+jwbron@users.noreply.github.com> Co-authored-by: egg-reviewer[bot] <261018737+egg-reviewer[bot]@users.noreply.github.com>

egg-orchestrator added 24 commits February 15, 2026 23:19

Initialize SDLC contract for issue #644

f4ea7a9

Add analysis draft for issue #644: phase file restriction enforcement

c674c92

Add analysis for issue #644: phase file restriction enforcement

177022a

Add agent-design review verdict for issue #644 refine phase

505ef6f

Add refine review verdict for issue #644

aeb70f9

Persist statefiles after refine phase

8027e9a

Add architect analysis for issue #644 plan phase

80094b3

Add tests for pipeline isolation features (branch lock, readonly moun…

87d387a

…ts, commit validation)

Fix lint errors: remove unused imports in test files

ced8e22

Fix unused import lint errors in test files

72bd798

Update implement phase check results

77a39b8

Remove unused import of release_pipeline_state_lock

1037e5b

Update implement phase check results

dd841df

Update implement phase check results

76a8a43

Add contract review verdict for issue #644 implement phase (cycle 2)

eaff338

Add code review verdict for issue #644 implement phase (cycle 2)

46f71b8

Persist statefiles after implement phase

e60da4d

egg-reviewer bot approved these changes Feb 17, 2026

View reviewed changes