Skip to content

Enforce phase file restrictions via readonly mounts and commit-time validation#800

Merged
jwbron merged 27 commits intomainfrom
egg/issue-644
Feb 17, 2026
Merged

Enforce phase file restrictions via readonly mounts and commit-time validation#800
jwbron merged 27 commits intomainfrom
egg/issue-644

Conversation

@james-in-a-box
Copy link
Contributor

Enforce phase file restrictions via readonly mounts and commit-time validation

Closes #644.

Agents can currently bypass phase file restrictions by modifying protected files
locally without pushing, and can switch branches freely — breaking the invariant
that each agent works on exactly one branch. This PR adds defense-in-depth
enforcement at three layers: readonly filesystem mounts, commit-time gateway
validation, and the existing push-time validation.

Branch lock: Pipeline sessions now store an assigned_branch on the gateway
session. When set, git checkout and git switch branch-changing operations are
blocked with a clear error message. File-targeting checkout (git checkout -- file)
remains allowed. This ensures deterministic post-agent commit and push.

Readonly mounts: During the implement phase, .egg-state/contracts/,
.egg-state/drafts/, .egg-state/pipelines/, and .egg-state/reviews/ are
bind-mounted as readonly into the agent container. Agents get an OS-level
"Read-only file system" error immediately, before spending tokens on git
operations. Marker files (.egg-readonly) in each directory explain the
restriction and suggest alternatives (e.g., egg-contract CLI).

Commit-time validation: The gateway now intercepts git commit and validates
staged files against PhaseFileRestriction.blocked_patterns. Violations are
rejected with actionable errors listing the blocked files and how to unstage them.
This catches fine-grained restrictions that readonly mounts cannot enforce.

Post-agent auto-commit: A new post_agent_commit.py module automatically
commits and pushes any uncommitted allowed changes after agent completion. Blocked
files are restored via git checkout, ensuring no phase-violating changes persist
in the worktree. This prevents silent work loss.

Agent instructions: Updated sandbox rules to inform agents about branch
locking, phase restrictions, and auto-commit behavior so they don't waste tokens
debugging expected restrictions.

Issue: #644

Test plan:

  • pytest gateway/tests/test_branch_switch.py — branch lock enforcement
  • pytest gateway/tests/test_assigned_branch.py — assigned branch session field
  • pytest gateway/tests/test_agent_restrictions_enforce.py — enforce mode restrictions
  • pytest gateway/tests/test_agent_restrictions_patterns.py — restriction pattern matching
  • pytest gateway/tests/test_phase_filter_restrictions.py — phase filter integration
  • pytest gateway/tests/test_pipeline_enforcement.py — end-to-end pipeline enforcement
  • pytest gateway/tests/test_post_agent_commit.py — auto-commit logic
  • pytest gateway/tests/test_post_agent_commit_extended.py — extended auto-commit scenarios
  • pytest tests/shared/egg_container/test_phase_mounts.py — readonly mount generation
  • pytest tests/shared/egg_container/test_marker_files.py — marker file creation

Authored-by: egg

egg-orchestrator added 24 commits February 15, 2026 23:19
…ts, auto-commit

Implements four layers of pipeline container isolation:

1. Branch lock: Pipeline sessions record their assigned worktree branch in
   the Session dataclass. The gateway blocks checkout/switch operations
   that would change branches, while still allowing file-level operations
   like `git checkout -- file.txt`.

2. Commit-time staged-file validation: When agents commit during a pipeline
   session, the gateway validates staged files against phase restrictions
   before allowing the commit. This catches violations early rather than
   only at push time.

3. Phase-based readonly mounts: During the implement phase, .egg-state/
   subdirectories (drafts/, contracts/, reviews/) are mounted readonly in
   the container to prevent direct filesystem modifications to plan/contract
   artifacts.

4. Post-agent auto-commit: When a pipeline container exits, any uncommitted
   work is automatically committed as a WIP to prevent work loss from
   timeouts, crashes, or agents forgetting to commit.

Also adds warn-only agent-role file restrictions at push time (check_agent_restrictions)
and updates agent instructions to document the new branch lock behavior.

Issue: #644

Authored-by: egg
… mounts, auto-commit

Update documentation across gateway README, shared README, architecture docs,
SDLC pipeline guide, and STRUCTURE.md to cover the new enforcement layers
added in issue #644:

- Branch lock for pipeline sessions (gateway blocks branch switching)
- Commit-time validation of staged files against phase restrictions
- Phase-based readonly mounts for .egg-state/ directories during implement
- Post-agent auto-commit for uncommitted worktree changes
- Defense-in-depth enforcement model (OS → commit → push)

Authored-by: egg
Fix three acceptance criteria gaps identified in contract review:

1. Task 4-1 (Critical): post_agent_commit.py now uses
   check_phase_file_restrictions to filter files, restores blocked
   files via git checkout, stages only allowed files (no git add -A),
   and pushes via gateway API when session credentials are available.

2. Task 3-3 (Moderate): ensure_egg_state_dirs now places .egg-readonly
   marker files in readonly directories during implement phase,
   explaining the restriction and current phase to agents.

3. Task 2-1 (Minor): Agent-role restrictions now support
   EGG_AGENT_RESTRICTIONS_ENFORCE env var (default false). When true,
   violations return 403 instead of just logging warnings.

Additionally fixes Task 3-1: adds 'pipelines' to _IMPLEMENT_READONLY_DIRS
to match the 4 blocked patterns in phase-permissions.json.

Authored-by: egg
Update documentation across gateway README, architecture docs, SDLC
pipeline guide, sandbox agent rules, and shared library README to
accurately describe the pipeline isolation features:

- Post-agent auto-commit: document phase filtering via
  check_phase_file_restrictions(), selective file staging, and push
  via gateway API (not git add -A)
- Readonly mounts: add missing pipelines/ directory to tables,
  document .egg-readonly marker files
- Agent restrictions: document EGG_AGENT_RESTRICTIONS_ENFORCE flag
  (default warn-only, set to true to block)
- Sandbox environment: add readonly directory guidance and
  post-agent auto-commit explanation for agents
- Add defense-in-depth inline comment in post_agent_commit.py

Authored-by: egg
Tests cover agent_restrictions pattern matching (all 14 roles, path
normalization, traversal prevention, blocked-over-allowed precedence),
phase_filter restrictions (per-phase file rules, unknown phase fail-closed,
operation filtering, role-based blocking), post_agent_commit extended cases
(checkout failure handling, push-via-gateway integration, commit message
format), and marker file content validation (.egg-readonly format, phase
gating, multi-repo support).

Authored-by: egg
- Apply ruff formatting to 24 files
- Fix 82 mypy type errors: add type annotations, casts, type: ignore
  comments for import-not-found, fix incompatible types and missing
  type parameters across gateway/, shared/, sandbox/, and config/
- Fix 8 test failures in test_start_pipeline.py by mocking
  get_pipeline_state_lock and configuring mock_store.load_pipeline
  to return the actual pipeline object

Authored-by: egg
Resolve three merge conflicts:
- gateway/gateway.py: Keep both is_branch_switch and is_branch_switching_operation imports (both functions exist and are used)
- gateway/session_manager.py: Preserve assigned_branch locking for pipeline sessions (PR feature)
- shared/egg_contracts/checkpoint_cli.py: Include --repo filter from main alongside --limit
Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No agent-mode design concerns. This PR is well-aligned with the principles in docs/guides/agent-mode-design.md.

The design consistently follows the right pattern: sandbox enforcement + informational agent instructions + discoverable signals. Specifically:

  • Constraints are sandbox-enforced, not prompt-enforced. Readonly bind mounts produce kernel-level EROFS errors. Branch lock and commit-time validation are gateway-enforced. The agent cannot bypass any of these regardless of prompt content. This is the correct approach per guideline #5 ("let the sandbox be the constraint").

  • Agent instructions describe, they don't prescribe. The additions to sandbox/.claude/rules/environment.md and mission.md are informational ("branch switching is blocked", "directories may be readonly") rather than procedural ("you must not switch branches"). This aligns with guideline #4 (prefer what over how).

  • Marker files are a discoverable signal. The .egg-readonly files let agents discover why a write failed and what alternatives exist, rather than requiring them to memorize restrictions from prompts. This supports guideline #5 (let the agent explore and use judgment).

  • No pre-fetching, no structured output, no post-processing. The auto-commit module is transparent to the agent — it runs at session cleanup and requires no agent interaction. No new output format requirements are imposed.

  • Fail-open design at the right layers. Commit-time validation fails open (push-time is authoritative). Agent-role restrictions default to warn-only. This is a mature rollout approach that avoids over-constraining agents during initial deployment.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Enforce phase file restrictions via readonly mounts and commit-time validation

This PR implements a four-layer defense-in-depth system for phase file restrictions in the SDLC pipeline. The architecture is sound and the implementation is thorough. I have one blocking concern and several non-blocking suggestions.

Summary

The PR adds:

  • L0 (Branch lock): is_branch_switch() heuristic in git_client.py, wired into git_execute() via Session.assigned_branch
  • L1 (Readonly mounts): ensure_egg_state_dirs() + phase_readonly_mounts() in shared/egg_container/__init__.py, wired into container_spawner.py
  • L2 (Commit-time validation): Staged-file check in git_execute() using check_phase_file_restrictions()
  • L4 (Post-agent auto-commit): New gateway/post_agent_commit.py module with auto_commit_worktree()
  • Agent-role restriction wiring: check_agent_restrictions() now called in git_push() with EGG_AGENT_RESTRICTIONS_ENFORCE flag
  • Extensive type hint improvements, documentation updates, and test coverage across ~50 files

Blocking

1. is_branch_switch() has a dead-code block for flags already handled (git_client.py)

In is_branch_switch(), the flags -b, -B, and --orphan are in branch_creating_flags and trigger an early return True on line ~1246. However, the "flags that consume the next arg" block on line ~1253 also checks for -b, -B, --orphan — but this code is unreachable because the earlier check already returned. This is confusing and suggests the author intended different behavior (perhaps the early return should only mark a flag rather than return, so the value-consuming logic runs). The current behavior is correct by accident (early return is the right answer), but the dead code is misleading. Remove the duplicate check on lines ~1253-1255 to avoid confusion.

# This block is unreachable — -b/-B/--orphan already returned True above
if arg in ("-b", "-B", "--orphan"):
    seen_flag_with_value = True
    continue

Non-blocking

2. _push_via_gateway() passes repo_path as the worktree path (post_agent_commit.py)

The push payload sends "repo_path": worktree_path to /api/v1/git/push. The gateway's git_push() handler uses repo_path to locate the repository. This should work since the worktree is a valid git working directory, but it's worth confirming that the push handler correctly resolves remotes and branches from a worktree path (not the main .git dir). If git_push() ever validates that repo_path matches a known repo root, this would break. Consider adding a comment clarifying this is intentional.

3. session.session_token is passed to auto_commit_worktree() (session_manager.py)

The Session dataclass stores the raw session_token for push-via-gateway. The token is passed from _capture_and_cleanup_session() to auto_commit_worktree() and then to _push_via_gateway() which sends it as a Bearer token over HTTP. This is fine since it's localhost traffic, but worth noting that the session token is the original raw token (not the hash). Ensure the session isn't deleted from the session manager before the auto-commit push completes — the current ordering (auto-commit before checkpoint capture, which is before session deletion) looks correct.

4. Commit-time validation fail-open logging is at debug level (gateway.py)

When the commit-time staged-file check fails (exception in the try/except), it logs at debug level. Since this is a security-relevant control silently failing open, consider logging at warning level instead, similar to how the push-time check handles errors. A silent failure here could mask configuration issues.

# Currently:
logger.debug("Staged-file check skipped due to error", ...)
# Suggest:
logger.warning("Staged-file check skipped due to error", ...)

5. _parse_changed_files() doesn't handle filenames with spaces correctly (post_agent_commit.py)

The porcelain format for renamed files uses -> as a separator. If a filename itself contains ->, the split would be incorrect. This is an extreme edge case but worth documenting. Additionally, path_part.strip() on the result could trim intentional trailing spaces in filenames (git porcelain uses quoting for such paths). For practical purposes this is fine, but a comment noting the limitation would help.

6. is_branch_switch() treats git checkout HEAD -- file.txt as a branch switch (git_client.py)

The test test_checkout_head_double_dash_file confirms that checkout HEAD -- file.txt is classified as a branch switch because HEAD is a positional arg before --. In practice, this is a file-restore operation (restore file.txt from HEAD). The branch lock only fires for pipeline sessions, and agents would typically use git checkout -- file.txt instead, so the practical impact is low. But this is a known false positive that could cause confusion. The architecture plan's "conservative alternative" heuristic acknowledged this trade-off, so the current behavior is intentional. Consider adding a comment in the code noting this specific case.

7. Readonly mounts only implemented for implement phase (egg_container/init.py)

phase_readonly_mounts() returns empty lists for all phases except implement. The architecture plan mentions refine/plan phases should have readonly source code. This is deferred (TODO comments exist), but worth tracking as a follow-up since it means refine/plan agents can still modify source code locally.

8. ensure_egg_state_dirs() uses os.makedirs + os.chown without atomicity (egg_container/init.py)

Between makedirs and chown, there's a brief window where directories exist with the wrong ownership. In practice this is fine since it runs before the container starts, but it's worth noting the ordering dependency.

9. Type hint improvements are mechanical and correct

The extensive cast() additions in config/repo_config.py, shared/egg_contracts/, shared/egg_config/, and sandbox/egg_lib/ are correct and improve type safety. The dict[str, Any] annotations, list[str] casts, and Path | None type hints are all appropriate. format_timestamp() fix introducing parsed variable is correct. transcript_extractor.py adding or datetime.now(UTC) fallback is a reasonable safety net.

Testing

Test coverage is comprehensive:

  • test_branch_switch.py — 18 cases covering all heuristic branches
  • test_assigned_branch.py — serialization roundtrip, pipeline vs non-pipeline
  • test_post_agent_commit.py — clean/dirty worktree, file parsing, phase filtering, push handling, errors
  • test_post_agent_commit_extended.py — checkout failure, multiple blocked files, HTTP push
  • test_agent_restrictions_enforce.py — warn/enforce modes, flag values, unknown/no role
  • test_agent_restrictions_patterns.py — 14 role permission matrices, path traversal, pattern matching
  • test_phase_filter_restrictions.py — per-phase restrictions, unknown phase fail-closed, path escape
  • test_phase_mounts.py — mount spec generation, missing dirs, phase filtering
  • test_marker_files.py — content validation, ownership, idempotency

One gap: no integration test for EROFS (readonly mount at OS level). The integrator output notes this was acknowledged as CI-only. Not a blocker but should be tracked.

Architecture

The defense-in-depth layering is well-designed:

  • L1 (mounts) prevents modification at OS level — zero token waste
  • L2 (commit-time) catches what mounts can't express (filename patterns) — low token waste
  • L3 (push-time, existing) remains the authoritative gate — fail-closed
  • L4 (auto-commit) preserves work on exit — fail-safe

The fail-open/fail-closed decisions are appropriate: commit-time fails open (push-time catches it), push-time fails closed, branch lock fails closed, auto-commit errors don't block cleanup.

Verdict

The blocking issue (#1) is a code clarity problem in is_branch_switch() that should be cleaned up to avoid future misunderstanding. The remaining items are suggestions and observations. Overall this is a solid, well-tested implementation of a critical security feature.


— Authored by egg

Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contract Verification Report for Issue #644

Comprehensive review of all 12 tasks across 5 phases. I ran all new test files and verified implementations against acceptance criteria.

Verdict: Request Changes

One blocking issue found (broken test file), plus two minor AC gaps.


Blocking Issue

gateway/tests/test_agent_restrictions_patterns.py fails to collect — references AgentRole.REVIEWER_UNIFIED which does not exist in gateway/agent_restrictions.py. This enum value appears 5 times (lines 329, 344, 359, 374, 389). The AgentRole class only defines: CODER, TESTER, DOCUMENTER, INTEGRATOR, ARCHITECT, TASK_PLANNER, RISK_ANALYST, REFINER, REVIEWER_CODE, REVIEWER_CONTRACT, REVIEWER_AGENT_DESIGN, REVIEWER_REFINE, REVIEWER_PLAN. No REVIEWER_UNIFIED exists anywhere in the codebase.

This is a new file introduced by this PR. The collection error prevents all tests in this file from running, which means the "comprehensive unit tests" for agent restriction pattern matching claimed in the integrator output are not actually executing.

Fix: Either add REVIEWER_UNIFIED to AgentRole in gateway/agent_restrictions.py, or replace the 5 references with an existing role (e.g., AgentRole.REVIEWER_CODE).


Task-by-Task Verification

Task Status Notes
1-1 assigned_branch field PASS Field at session_manager.py:241, set during register_session (lines 496-500), persisted in to_dict_for_persistence() (lines 281-282), restored in from_persistence() (lines 305-306). Tests in test_assigned_branch.py.
1-2 is_branch_switch() + enforcement PASS Helper at git_client.py:1213-1275, wired into git_execute() at gateway.py:1107-1128. Conservative heuristic: blocks -b/-B/--orphan/switch, allows -- separator and -p. Only enforced when assigned_branch is set. Tests in test_branch_switch.py (13 test classes).
1-3 Commit-time staged-file validation PASS Validation at gateway.py:1165-1215, 10-second timeout (line 1180), fail-open on exception (lines 1208-1215), non-pipeline commits unaffected (line 1171). Tests in test_pipeline_enforcement.py:TestCommitTimePhaseValidation.
1-4 Agent documentation updates PASS (minor gap) Branch lock section added at mission.md:74 and environment.md:45-46. Phase restriction and auto-commit guidance present in environment.md:68-70. However, git checkout -b references remain at mission.md:52,72 and environment.md:44 — these are for non-pipeline mode where branch switching is legitimate, so the AC is arguably met for pipeline agents.
2-1 Agent restrictions wiring PASS check_agent_restrictions() called in git_push() at gateway.py:688-718. EGG_AGENT_RESTRICTIONS_ENFORCE env var at lines 691-695, accepts "true"/"1"/"yes". Default warn-only. Tests in test_agent_restrictions_enforce.py (7 test classes).
3-1 phase_readonly_mounts() PASS Function at egg_container/__init__.py:182-220. _IMPLEMENT_READONLY_DIRS = ("drafts", "contracts", "pipelines", "reviews") — exactly 4 dirs (line 132). Empty for unknown/None phases. Tests in test_phase_mounts.py.
3-2 ensure_egg_state_dirs() PASS Function at egg_container/__init__.py:135-180. Creates dirs idempotently. Called before mounts at container_spawner.py:289. EGG_PHASE set at container_spawner.py:355. Tests in test_phase_mounts.py.
3-3 Readonly mounts + marker files PASS Mounts added after repo and git shadow mounts at container_spawner.py:291. Marker files generated at egg_container/__init__.py:168-179 with phase, directory, explanation content. Tests in test_marker_files.py. No Docker EROFS integration test (acknowledged as CI-only).
4-1 post_agent_commit.py PASS Reuses check_phase_file_restrictions (line 179, not reimplemented). Blocked files restored via git checkout -- (lines 196-206). Selective staging with git add -- *allowed_files (line 220). Push via gateway API (_push_via_gateway() lines 68-112). Egg author (line 246). Audit logging. No-op when clean. Errors return None. Tests: 32 + 15 = 47 tests.
4-2 Auto-commit cleanup integration PASS (minor gap) Correct ordering: auto-commit (lines 84-121) → checkpoint (lines 123-156) in session_manager.py. Session alive during auto-commit. Failure doesn't block cleanup. Gap: return value from auto_commit_worktree() is NOT captured (line 102), so auto-commit SHA does NOT appear in checkpoint metadata. The AC explicitly requires "auto-commit SHA in checkpoint metadata."
5-1 Tests pass FAIL 220/220 tests pass across 9 test files. However, test_agent_restrictions_patterns.py fails at collection with AttributeError: AgentRole has no attribute REVIEWER_UNIFIED. This is a new file from this PR.
5-2 Linter passes PASS ruff check, ruff format, mypy all pass per implement-results.json.

Test Results (Local Run)

9 test files: 220 passed in 1.05s
1 test file: COLLECTION ERROR (test_agent_restrictions_patterns.py)

Summary

  • 11 of 12 tasks fully meet acceptance criteria
  • 1 blocking issue: New test file test_agent_restrictions_patterns.py cannot collect due to non-existent AgentRole.REVIEWER_UNIFIED
  • 2 minor gaps (non-blocking):
    • Task 4-2: auto-commit SHA not captured in checkpoint metadata
    • Task 1-4: git checkout -b references remain in docs (for non-pipeline mode — arguably intentional)

— Authored by egg

@james-in-a-box

This comment has been minimized.

@james-in-a-box

This comment has been minimized.

@james-in-a-box
Copy link
Contributor Author

Review feedback addressed (e7926df)

Addressed the non-blocking finding from the re-review:

auto_commit_sha propagated to checkpoint metadata: capture_session_end_checkpoint() in checkpoint_handler.py now sets commit_sha=session.auto_commit_sha on the CheckpointV2 record. Session-end checkpoints from auto-commit sessions are now queryable by commit SHA via the checkpoint index.

Overlapping branch-switch checks: Acknowledged as a pre-existing maintenance item for a separate follow-up. The current layering is functionally correct.

All 1452 gateway tests pass. No regressions.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No agent-mode design concerns. The delta since 2002e84 is a single commit (e7926df1) that propagates auto_commit_sha to session-end checkpoint metadata — addressing the gap noted in my previous review. The remaining changes are cosmetic line-wrapping. No new agent-facing patterns, pre-fetching, structured output requirements, or prompt-level constraints introduced. Original approval stands.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contract Re-Verification Report for Issue #644

Incremental review of changes since commit 2002e84b (previous verification). One new commit: e7926df ("Propagate auto_commit_sha to session-end checkpoint metadata").

Verdict: Approve

The single outstanding gap from the previous review has been closed. No regressions found.


Delta Review (1 commit since last review)

Commit e7926df addresses the non-blocking observation from the previous code review:

Previous Finding Resolution Status
auto_commit_sha not propagated to checkpoint metadata — SHA was stored on Session but never written to CheckpointV2 record checkpoint_handler.py:605 now passes commit_sha=session.auto_commit_sha to CheckpointV2 constructor in capture_session_end_checkpoint() Fixed

Verification details:

  • CheckpointV2.commit_sha is an existing optional field (str | None, default None, with SHA regex validation) — no schema change needed
  • Session.auto_commit_sha is populated by auto_commit_worktree() return value before checkpoint capture (session_manager.py:102-112)
  • Persistence roundtrip confirmed: to_dict_for_persistence() writes and from_persistence() reads the field
  • Sessions without auto-commits pass None, which is the field's default — no regression for non-pipeline sessions

The other two hunks in the diff are formatting-only changes (line re-wrapping by ruff) with no functional impact.

Regression Check

No previously verified criteria are affected:

  • The change adds data to an existing optional field on an existing checkpoint type
  • No control flow changes, no new error paths
  • Non-pipeline sessions and session-end checkpoints without auto-commits continue to pass None

Task Verification Summary (All Re-confirmed)

Task AC Met? Notes
1-1 assigned_branch field Yes Unchanged since last review
1-2 is_branch_switch() Yes Unchanged since last review
1-3 Commit-time validation Yes Unchanged since last review
1-4 Agent doc updates Yes Unchanged since last review
2-1 Agent restrictions wiring Yes Unchanged since last review
3-1 phase_readonly_mounts() Yes Unchanged since last review
3-2 ensure_egg_state_dirs() Yes Unchanged since last review
3-3 Readonly mounts + markers Yes Unchanged since last review
4-1 post_agent_commit.py Yes Unchanged since last review
4-2 Auto-commit integration Yes Previously gap — auto_commit_sha now propagated to checkpoint metadata, closing the loop
5-1 Tests pass Yes Unchanged since last review
5-2 Linter passes Yes Formatting-only changes are ruff-compliant

All 12 tasks across 5 contract phases have their acceptance criteria fully met. No contract violations found.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review: Propagate auto_commit_sha to checkpoint metadata (commit e7926df)

Reviewed the delta since 2002e84b (one commit: e7926df1). The change is small — only checkpoint_handler.py modified, with one functional line and two formatting adjustments.

Previous Issues — All Verified Fixed

All blocking and non-blocking items from the previous two reviews remain properly addressed in the current HEAD. No regressions.

New Issue

1. "unknown" SHA from auto_commit_worktree() will break checkpoint creation (non-blocking)

The new line commit_sha=session.auto_commit_sha at checkpoint_handler.py:605 propagates the auto-commit SHA into CheckpointV2. However, auto_commit_worktree() in post_agent_commit.py:270 has:

sha = rev.stdout.strip() if rev.returncode == 0 else "unknown"

If git rev-parse HEAD fails after a successful commit, sha is set to "unknown", which passes the truthy check at session_manager.py:111 and gets stored on session.auto_commit_sha. When this reaches CheckpointV2(commit_sha="unknown"), Pydantic's field constraint pattern=r"^[a-f0-9]{7,40}$" rejects it with a ValidationError.

This ValidationError propagates up to the try/except Exception at checkpoint_handler.py:635, causing the entire session-end checkpoint to be lost — not just the SHA field. The checkpoint is the primary audit artifact for agent sessions.

The "unknown" fallback pre-dates this PR, but before this commit it was harmless (only logged and session-persisted, never validated against the hex pattern). This commit creates the path where it causes damage.

Fix options (either one):

  • In post_agent_commit.py:270: return None instead of "unknown" when rev-parse fails, so the checkpoint is created without a commit_sha rather than failing entirely.
  • In checkpoint_handler.py:605: guard with a pattern check before passing (less clean).

The rev-parse HEAD failure after a successful commit is unlikely but not impossible (e.g., filesystem issue, corrupted .git). The impact when it does happen is disproportionate — losing the entire checkpoint for a minor metadata field.

2. No test coverage for the new code path (non-blocking)

The _make_test_session() helper in test_checkpoint_handler.py does not set auto_commit_sha. No test verifies that capture_session_end_checkpoint correctly propagates session.auto_commit_sha to CheckpointV2.commit_sha. A test with session.auto_commit_sha = "abc1234" confirming the checkpoint's commit_sha field would close this gap.

Summary

The change is conceptually correct — it closes the gap identified in the previous review where auto-commit SHAs were not appearing in checkpoint metadata. The "unknown" fallback issue is low-probability but high-impact when triggered, and straightforward to fix. Both items are non-blocking.

— Authored by egg

@james-in-a-box

This comment has been minimized.

@james-in-a-box

This comment has been minimized.

When git rev-parse HEAD fails after a successful auto-commit,
auto_commit_worktree() returned "unknown" which would fail the
CheckpointV2 Pydantic pattern validation (^[a-f0-9]{7,40}$),
causing the entire session-end checkpoint to be lost. Return None
instead so the checkpoint is created without a commit_sha rather
than failing entirely.

Add test coverage for auto_commit_sha propagation to checkpoint
metadata.
@james-in-a-box
Copy link
Contributor Author

Review feedback addressed (4815a65)

Addressed both findings from the re-review of commit e7926df:

Fix: "unknown" SHA breaks checkpoint creation

post_agent_commit.py:270: Changed the rev-parse HEAD failure fallback from "unknown" to None. Previously, the string "unknown" would pass the truthy check in session_manager.py:111 and get stored on session.auto_commit_sha, then fail CheckpointV2's Pydantic pattern validation (^[a-f0-9]{7,40}$), causing the entire session-end checkpoint to be lost. Now None is returned, which means:

  • The truthy check in session_manager.py:111 prevents it from being stored on the session
  • CheckpointV2.commit_sha receives None (the field default), so the checkpoint is created successfully without a commit SHA

Updated the existing test test_rev_parse_failure_returns_unknowntest_rev_parse_failure_returns_none to assert None.

Test coverage for auto_commit_sha propagation

Added TestAutoCommitShaInCheckpoint with two tests that exercise CheckpointHandler.capture_session_end_checkpoint directly (not mocked):

  • test_auto_commit_sha_propagated_to_checkpoint — verifies session.auto_commit_sha = "abc1234" results in checkpoint.commit_sha == "abc1234"
  • test_none_auto_commit_sha_leaves_checkpoint_commit_sha_none — verifies default None propagates correctly

All 1454 gateway tests pass. No regressions.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No agent-mode design concerns in the update. The delta since e7926df is a single commit (4815a65) that fixes the "unknown" SHA fallback to return None instead — preventing a CheckpointV2 Pydantic validation failure that could silently lose session-end checkpoints. The commit also adds test coverage for auto_commit_sha propagation. None of these changes introduce new agent-facing patterns, pre-fetching, structured output requirements, or prompt-level constraints. Original approval stands.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review: Fix unknown SHA fallback (commit 4815a65)

Reviewed the delta since e7926df1 (one commit: 4815a657 — Fix unknown SHA fallback that breaks checkpoint creation). Both findings from the previous review are properly addressed.

Previous Issues — Verified Fixed

1. "unknown" SHA breaks checkpoint creation (post_agent_commit.py:270) — Fixed correctly. The rev-parse HEAD failure fallback changed from "unknown" to None. This prevents the invalid string from reaching CheckpointV2's Pydantic pattern validator (^[a-f0-9]{7,40}$), which would have caused a ValidationError and lost the entire session-end checkpoint.

The data flow is correct:

  • auto_commit_worktree() returns None on rev-parse failure (line 270)
  • session_manager.py:111 guards with if auto_commit_sha:None is falsy, so session.auto_commit_sha stays at its default None
  • checkpoint_handler.py:605 passes session.auto_commit_sha (= None) to CheckpointV2(commit_sha=None)
  • CheckpointV2.validate_sha() returns None for None input — checkpoint created successfully without a SHA

2. No test coverage for auto_commit_sha propagation (test_checkpoint_handler.py) — Fixed correctly. Two tests added in TestAutoCommitShaInCheckpoint:

  • test_auto_commit_sha_propagated_to_checkpoint: sets session.auto_commit_sha = "abc1234", verifies checkpoint.commit_sha == "abc1234"
  • test_none_auto_commit_sha_leaves_checkpoint_commit_sha_none: verifies default None propagates correctly

The existing rev-parse failure test was updated: test_rev_parse_failure_returns_unknowntest_rev_parse_failure_returns_none, asserting None instead of "unknown".

Observations

No new issues found. The change is minimal and targeted — one line of production code (the fallback value), one renamed/updated test, and two new test cases. No regressions introduced.

One minor note: when rev-parse HEAD fails after a successful commit, the log message at line 272 will contain commit_sha=None. This is fine — the log still captures the event with all other metadata (worktree path, container ID, role, files), and the SHA being None in the log correctly reflects that it couldn't be determined. No action needed.

Verdict

All feedback from the previous review has been properly addressed. No blocking or non-blocking issues remain.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contract Verification — Incremental Re-review

Previous review commit: e7926df17fe007a4f86bcec0d7f4ce9d57e32b98
Current HEAD: 4815a657bb365cda89d7fd66a2f04c5bfc621814

Delta Since Last Review

One commit (4815a65): "Fix unknown SHA fallback that breaks checkpoint creation"

Files changed:

  • gateway/post_agent_commit.py — Changed "unknown" to None for rev-parse failure fallback
  • gateway/tests/test_checkpoint_handler.py — Added TestAutoCommitShaInCheckpoint test class (2 tests)
  • gateway/tests/test_post_agent_commit.py — Updated test to expect None instead of "unknown"

Analysis

This is a correctness fix. CheckpointV2.commit_sha has a Pydantic pattern validator (^[a-f0-9]{7,40}$) that rejects non-hex strings. The old fallback "unknown" would fail this validation, causing the entire session-end checkpoint to be lost. Returning None is correct since the field is Optional[str].

The fix strengthens task-4-2 compliance ("auto-commit SHA in checkpoint metadata, failure does not prevent cleanup") — previously, a rev-parse failure would cascade into checkpoint creation failure, violating the "failure does not prevent cleanup" criterion.

Full Contract Verification Status

All tasks verified against current codebase:

Task Description Status
task-1-1 Session assigned_branch field VERIFIED — Field exists on Session (line 243), set for pipeline sessions (line 506), persisted via to_dict_for_persistence (line 284-285), restored via from_dict (line 311)
task-1-2 is_branch_switch() + branch lock VERIFIEDis_branch_switch() and is_branch_switching_operation() in git_client.py, wired into gateway.py (lines 1107, 1144), conservative heuristic for checkout vs file restore, tests in test_branch_switch.py and test_git_client.py
task-1-3 Commit-time staged-file validation VERIFIED — Staged file check in gateway.py (lines 1165-1200) using check_phase_file_restrictions, 10s timeout with fail-open, non-pipeline commits unaffected, tests in test_pipeline_enforcement.py
task-1-4 Agent instructions updated VERIFIEDmission.md has branch lock guidance (line 74), environment.md has readonly directory guidance (line 68), auto-commit explanation (line 70), git checkout -b retained only for initial branch creation (not switching)
task-2-1 Agent-role restrictions VERIFIEDcheck_agent_restrictions() called in gateway.py (line 689) when session has agent_role, EGG_AGENT_RESTRICTIONS_ENFORCE flag (lines 686-694, default false/warn-only), tests in test_agent_restrictions_enforce.py covering warn, enforce, unknown role, no-role cases
task-3-1 phase_readonly_mounts() VERIFIED — Returns 4 readonly MountSpec for implement phase (_IMPLEMENT_READONLY_DIRS includes drafts, contracts, pipelines, reviews), empty for non-implement/None, skips missing dirs, tests in test_phase_mounts.py
task-3-2 ensure_egg_state_dirs() + EGG_PHASE VERIFIED — Idempotent dir creation using shared _IMPLEMENT_READONLY_DIRS constant, called before mounts in container_spawner.py (line 290), EGG_PHASE set (line 355), tests in test_marker_files.py
task-3-3 Readonly mount wiring + marker files VERIFIEDphase_readonly_mounts() called after repo mounts (line 291), .egg-readonly marker files written with phase and restriction explanation (lines 167-179)
task-4-1 auto_commit_worktree() VERIFIED — Reuses check_phase_file_restrictions (line 185-193), blocked files restored via git checkout (line 215-226), commits with egg author (line 256-257), pushes via gateway API (lines 286-301), audit logging, no-op when clean, errors caught, tests in test_post_agent_commit.py and test_post_agent_commit_extended.py
task-4-2 Auto-commit integration ordering VERIFIED — Auto-commit runs before checkpoint in _capture_and_cleanup_session (lines 84-123 then 125+), session alive during auto-commit, auto_commit_sha propagated to checkpoint (line 605 in checkpoint_handler.py), failure caught without blocking cleanup (lines 118-123), new tests verify SHA propagation
task-5-1 Tests pass VERIFIED — 5385 passed, 83 skipped, 4 warnings (implement-results.json)
task-5-2 Lint clean VERIFIED — ruff check passed, ruff format clean (378 files), mypy clean (118 files)

New Commit Verification

The new commit (4815a65) does not regress any previously verified criteria. It fixes a bug that would have caused task-4-2 criterion violation when git rev-parse HEAD fails after auto-commit. Tests are properly updated.

Verdict

All contract tasks and acceptance criteria are verified. No regressions from the new commit.

— Authored by egg

@james-in-a-box

This comment has been minimized.

@james-in-a-box
Copy link
Contributor Author

egg is addressing review feedback...

@james-in-a-box
Copy link
Contributor Author

egg feedback addressed. View run logs

18 previous review(s) hidden.

@jwbron jwbron merged commit 42a8d2f into main Feb 17, 2026
25 of 26 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 17, 2026
Update the 'What's Enforced' table in README.md to reflect the
defense-in-depth enforcement added in PR #800:
- Readonly filesystem mounts on .egg-state/ dirs during implement phase
- Commit-time validation for staged restricted files
- Push-time validation (existing, now framed as one of three layers)

Triggered by: #800

Authored-by: egg
jwbron added a commit that referenced this pull request Feb 17, 2026
…er] (#804)

* docs: Update enforcement table for phase file restrictions [doc-updater]

Update the 'What's Enforced' table in README.md to reflect the
defense-in-depth enforcement added in PR #800:
- Readonly filesystem mounts on .egg-state/ dirs during implement phase
- Commit-time validation for staged restricted files
- Push-time validation (existing, now framed as one of three layers)

Triggered by: #800

Authored-by: egg

* docs: Use 'subdirs' for precision per review feedback

---------

Co-authored-by: jwbron <8340608+jwbron@users.noreply.github.com>
Co-authored-by: egg-reviewer[bot] <261018737+egg-reviewer[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enforce phase file restrictions via readonly mounts and commit-time validation

1 participant