Closed
Conversation
Implement Phase 1 of the egg extraction plan: - Create repository structure with gateway/, sandbox/, cli/, shared/, tests/ - Add pyproject.toml with uv for dependency management - Add GitHub Actions workflows (lint.yml, test.yml) - Add ./dev wrapper script for CI parity (runs act locally) - Add pre-commit configuration - Add Dependabot configuration - Write README.md with architecture overview - Write CONTRIBUTING.md with development guidelines - Add egg.yaml.example and secrets.yaml.example - Create module structure with __init__.py files - Add initial unit tests for module imports - Add docs/testing.md and docs/architecture.md This establishes the infrastructure before extracting gateway code in Phase 2. Authored-by: jib
Extract and create documentation for the egg repository: - docs/security.md: Core security model and threat mitigations - docs/adr/git-isolation-architecture.md: Git worktree isolation ADR - docs/adr/credential-injection.md: Credential injection ADR - docs/configuration.md: Configuration reference - docs/setup.md: Setup guide - docs/api.md: Gateway API reference - docs/troubleshooting.md: Common issues and solutions Security documentation extracted from james-in-a-box ADRs: - ADR-Gateway-Credential-Injection.md - ADR-Git-Isolation-Architecture.md - ADR-Anthropic-API-Credential-Injection.md All docs regenerated for egg (no james-specific references). Authored-by: jib
Add core shared modules: - shared/egg_config/loader.py: YAML config loading with env var expansion - shared/egg_config/validators.py: Config validation utilities - shared/egg_logging/logger.py: Structured JSON/console logging Add first gateway module: - gateway/policy.py: Policy engine for branch ownership and access control - Parameterized for egg (configurable bot_name, branch_prefix) - Removed james-in-a-box specific references - Uses EGG_TRUSTED_USERS instead of GATEWAY_TRUSTED_USERS Remaining Phase 2 tasks: - Port session_manager, git_client, github_client, worktree_manager - Port remaining supporting modules - Port gateway.py Flask app - Port test files Authored-by: jib
Port gateway sidecar modules from james-in-a-box: - git_client.py: Git CLI wrapper with path/arg validation - github_client.py: gh CLI wrapper with token management - session_manager.py: Thread-safe session storage with disk persistence - policy.py: Branch ownership and PR policy enforcement - rate_limiter.py: Sliding window rate limiting - worktree_manager.py: Git worktree lifecycle management - token_refresher.py: GitHub App token refresh - repo_parser.py: URL/path parsing for repo identification - repo_visibility.py: GitHub API visibility queries with caching - private_repo_policy.py: Per-session private/public mode enforcement - fork_policy.py: Fork operation restrictions - error_messages.py: User-friendly error messages - config_validator.py: Startup configuration validation - proxy_monitor.py: Squid proxy traffic monitoring - repo_config.py: Per-repo authentication mode configuration - gateway.py: Main Flask REST API server Also adds: - scripts/parse-git-mounts.py for repository mount parsing - pyproject.toml hatch build config for packages
This adds the sandbox and gateway container definitions: Sandbox container (container/sandbox/): - Dockerfile: Ubuntu 22.04 base with development tools, Python 3.11 - entrypoint.py: Container initialization and user setup - scripts/git: Git wrapper routing through gateway sidecar - scripts/gh: GitHub CLI wrapper routing through gateway - bin/git-credential-github-token: Git credential helper Gateway container (container/gateway/): - Dockerfile: Python 3.11 slim base with Squid proxy - entrypoint.sh: Starts Squid proxy and gateway API server - squid.conf: Restrictive config for private mode (api.anthropic.com only) - squid-allow-all.conf: Permissive config for public mode - allowed_domains.txt: Domain allowlist for private mode Key parameterization changes from james-in-a-box: - /home/jib -> /home/sandbox - jib-gateway -> egg-gateway - jib user -> sandbox user - JIB_* env vars -> EGG_* env vars - /opt/jib-runtime -> /opt/egg-runtime Authored-by: jib
df4e5a4 to
3fcfb7d
Compare
This was referenced Feb 3, 2026
Contributor
Author
Review Finding: Superseded by PR #48This PR has been analyzed against PR #48 (comprehensive implementation). All functionality from this PR is included in PR #48 with improvements. Comparison
RecommendationSafe to close. PR #48 includes all container infrastructure from this PR with the following improvements:
— Authored by jib |
This was referenced Feb 4, 2026
james-in-a-box bot
pushed a commit
that referenced
this pull request
Feb 6, 2026
Gap #4 (cmd[2:2] fragility): Add LIFECYCLE_FLAGS_INDEX constant to make the implicit contract explicit. Callers now reference the module constant instead of hardcoding the magic number. Added a test that verifies the constant value, so changes to the builder's structure will fail tests. Gap #5 (container_ip): run_claude_structured() now allocates a static IP for test containers via _allocate_test_container_ip(). This matches production where sessions are bound to specific container IPs for request verification. Gap #3 (public mode E2E): Created issue #168 to track adding E2E test coverage for public mode container launches. Authored-by: egg
jwbron
added a commit
that referenced
this pull request
Feb 6, 2026
* Extract shared build_sandbox_docker_cmd() for container launches Container-launch command construction was duplicated across three call sites (run_claude, exec_in_new_container, E2E conftest), causing divergence bugs like missing --add-host and wrong env var names. The test harness was also missing PRIVATE_MODE, DNS lockdown, proxy config, RUNTIME_UID/GID, CONTAINER_ID, and --security-opt. This extracts a single build_sandbox_docker_cmd() into a new shared/egg_container module that all three sites now call, with 26 unit tests covering private/public mode, optional params, and the caller insertion conventions. * Fix ruff format issues * Address review feedback: add ANTHROPIC_AUTH_METHOD and document coupling Addresses two gaps from the audit review: 1. Tests now set ANTHROPIC_AUTH_METHOD=oauth to match production behavior. Production callers always set this env var (api_key or oauth), but the test was only passing ANTHROPIC_OAUTH_TOKEN. This ensures sandbox startup code branches the same way in tests as in production. 2. Added docstring to run_claude_structured() documenting the network config coupling. The test constructs ContainerNetworkConfig manually from EggStack values rather than calling _get_container_network_config(). If that function changes, the test must be updated to match. Authored-by: egg * Address review gaps #4 and #5, create issue for #3 Gap #4 (cmd[2:2] fragility): Add LIFECYCLE_FLAGS_INDEX constant to make the implicit contract explicit. Callers now reference the module constant instead of hardcoding the magic number. Added a test that verifies the constant value, so changes to the builder's structure will fail tests. Gap #5 (container_ip): run_claude_structured() now allocates a static IP for test containers via _allocate_test_container_ip(). This matches production where sessions are bound to specific container IPs for request verification. Gap #3 (public mode E2E): Created issue #168 to track adding E2E test coverage for public mode container launches. Authored-by: egg --------- Co-authored-by: egg <egg@example.com> Co-authored-by: jwbron <8340608+jwbron@users.noreply.github.com>
This was referenced Feb 7, 2026
james-in-a-box bot
pushed a commit
that referenced
this pull request
Feb 7, 2026
Fix critical issues from review:
- Task ID convention mismatch: Parser now generates task-{phase}-{number}
format matching CLI expectations (Issue #3)
- Integer parsing vulnerability: Added parse_task_id() and parse_phase_id()
helpers with proper validation and error handling (Issue #1)
- URL parameter injection: Use urllib.parse.urlencode for query params (Issue #2)
- Placeholder task numbering: Use task_number=1 instead of 0 (Issue #5)
- Silent failure in shell script: gh_api_safe now sets GH_API_FAILED flag
and returns empty JSON for graceful fallback (Issue #6)
Additional improvements:
- Updated Task model pattern to accept task-P-N format
- Added comprehensive test coverage for error paths
- Added edge case tests for plan parser
- All 73 tests pass, ruff linter clean
Authored-by: egg
jwbron
added a commit
that referenced
this pull request
Feb 7, 2026
* Implement Phase 3: Agent CLI and prompt integration Add the agent-facing components for the SDLC pipeline: - Contract CLI (egg-contract) for agents to interact with contract state: - show: Display current contract state - add-commit: Link commits to tasks - update-notes: Add implementation notes - mark-task/mark-phase: Update status (reviewer role) - add-decision: Create HITL decision points - Agent rules file (contract.md) with CLI usage guidance - Document templates for refine and plan phases: - analysis.md: Problem analysis template - plan.md: Implementation plan template with task ID format - SDLC prompt builder that provides phase-specific context and instructions to the agent - Plan parser for extracting tasks from plan documents: - Parses [TASK-X-Y] patterns from markdown - Supports YAML front matter for structured task definitions - Generates placeholder tasks for empty phases - Produces warnings for parse issues - Unit tests for plan parser and contract CLI Issue: #133 Authored-by: egg * Fix bash syntax error in build-sdlc-prompt.sh Change the inner heredoc delimiter in the PR format example from 'EOF' to 'BODY' to avoid conflicting with the outer heredoc delimiter. The 'EOF' on line 363 was being interpreted as the end of the outer heredoc (starting at line 322), causing a syntax error on the closing ')'. Authored-by: egg * Fix lint and type errors in Phase 3 contracts code - Fix import sorting in shared/egg_contracts/__init__.py (I001) - Fix import sorting in tests/sandbox/test_contract_cli.py (I001) - Fix unused variable in tests/sandbox/test_contract_cli.py (F841) - Fix import sorting and remove unused import in tests/shared/egg_contracts/test_plan_parser.py (I001, F401) - Fix mypy type annotation in shared/egg_contracts/plan_parser.py - Fix mypy no-any-return errors in sandbox/egg_lib/contract_cli.py - Apply ruff formatting fixes Authored-by: egg * Address code review feedback for Phase 3 implementation Fix critical issues from review: - Task ID convention mismatch: Parser now generates task-{phase}-{number} format matching CLI expectations (Issue #3) - Integer parsing vulnerability: Added parse_task_id() and parse_phase_id() helpers with proper validation and error handling (Issue #1) - URL parameter injection: Use urllib.parse.urlencode for query params (Issue #2) - Placeholder task numbering: Use task_number=1 instead of 0 (Issue #5) - Silent failure in shell script: gh_api_safe now sets GH_API_FAILED flag and returns empty JSON for graceful fallback (Issue #6) Additional improvements: - Updated Task model pattern to accept task-P-N format - Added comprehensive test coverage for error paths - Added edge case tests for plan parser - All 73 tests pass, ruff linter clean Authored-by: egg * Remove unused GH_API_FAILED variable from build-sdlc-prompt.sh The GH_API_FAILED variable was declared and set but never read, triggering shellcheck SC2034 warning. The variable has been removed as the functionality was never implemented. Authored-by: egg * Address remaining PR review feedback - Fix add-decision RBAC: Add decisions.* to FIELD_OWNERSHIP with IMPLEMENTER role, allowing implementers to create decisions while only humans can resolve them. - Fix array append in validator: Handle appending to arrays when idx == len(array) by appending instead of raising IndexError. Also handle the case in apply_mutation where old_value doesn't exist yet for append operations. - Document race condition in add-decision: Added comments explaining the TOCTOU race condition between reading decision count and submitting mutation. Gateway should handle conflicts. - Fix mark-task/mark-phase role documentation: Updated docstrings and help text to clarify these operations require REVIEWER role. - Add commit SHA validation: New validate_commit_sha() function validates 7-40 character hex strings for git SHAs. - Fix short question ellipsis: Only append "..." if question is longer than 50 characters. - Add comprehensive tests for new functionality including commit SHA validation and array append operations. Authored-by: egg * Fix ruff format issue in contract_cli.py Collapse multi-line ValueError string to single line to satisfy ruff format requirements. Authored-by: egg * Address low-severity review feedback - Add integration tests verifying plan parser task IDs are compatible with the CLI's parse_task_id() function - Add RETURN trap in gh_api_safe() to clean up temp files on interruption - Refactor MockGatewayHandler to use a factory function that creates handler classes with isolated responses, ensuring thread safety when running tests in parallel - Generate unique HEREDOC delimiter using random suffix to prevent injection if issue body contains the delimiter string Authored-by: egg --------- Co-authored-by: jwbron <8340608+jwbron@users.noreply.github.com> Co-authored-by: james-in-a-box[bot] <246424927+james-in-a-box[bot]@users.noreply.github.com>
This was referenced Feb 17, 2026
Merged
This was referenced Feb 22, 2026
james-in-a-box bot
pushed a commit
that referenced
this pull request
Feb 22, 2026
- Fix #1: Resolve GitHub token from source_repo query param when repo_path is the scratch repo (no git remote). New _resolve_checkpoint_token helper uses get_token_for_repo(source_repo) as fallback, and all checkpoint endpoints pass the token explicitly. - Fix #2: Add threading lock to _ensure_checkpoint_scratch_repo to prevent TOCTOU race when concurrent requests create the scratch repo. - Fix #3: Add os.path.isdir check for session.last_repo_path so nonexistent sandbox paths fall through to the scratch repo. - Fix #4: Move scratch dir from /tmp/ to /home/egg/.egg-worktrees/ which is within ALLOWED_REPO_PATHS. - Fix #5: Refactor _get_checkpoint_repo_from_args to return (checkpoint_repo, source_repo) tuple, eliminating duplicate git remote calls in _add_checkpoint_resolution_params. - Fix #6: Add 11 gateway-side unit tests covering source_repo resolution, scratch repo creation, repo path fallthrough, and token resolution fallback. - Fix #7: Handle trailing slashes in _get_source_repo regex.
jwbron
added a commit
that referenced
this pull request
Feb 22, 2026
* Fix checkpoint CLI returning empty results in sandbox The checkpoint CLI failed to resolve checkpoint_repo inside sandbox containers because repositories.yaml is only mounted on the gateway. When the CLI couldn't resolve checkpoint_repo locally, it sent HTTP requests without it. The gateway then couldn't auto-detect it either because the sandbox's repo path doesn't exist on the gateway filesystem. Changes: - CLI: Extract _get_source_repo() helper and pass source_repo param to gateway when checkpoint_repo can't be resolved locally - Gateway: _resolve_checkpoint_repo() uses source_repo param as fallback for config lookup when path-based auto-detection fails - Gateway: _resolve_repo_path_for_checkpoints() falls through to session/env/scratch-repo fallbacks when sandbox path doesn't exist locally instead of hard-returning None Fixes #884 * Address review feedback for checkpoint sandbox fix - Fix #1: Resolve GitHub token from source_repo query param when repo_path is the scratch repo (no git remote). New _resolve_checkpoint_token helper uses get_token_for_repo(source_repo) as fallback, and all checkpoint endpoints pass the token explicitly. - Fix #2: Add threading lock to _ensure_checkpoint_scratch_repo to prevent TOCTOU race when concurrent requests create the scratch repo. - Fix #3: Add os.path.isdir check for session.last_repo_path so nonexistent sandbox paths fall through to the scratch repo. - Fix #4: Move scratch dir from /tmp/ to /home/egg/.egg-worktrees/ which is within ALLOWED_REPO_PATHS. - Fix #5: Refactor _get_checkpoint_repo_from_args to return (checkpoint_repo, source_repo) tuple, eliminating duplicate git remote calls in _add_checkpoint_resolution_params. - Fix #6: Add 11 gateway-side unit tests covering source_repo resolution, scratch repo creation, repo path fallthrough, and token resolution fallback. - Fix #7: Handle trailing slashes in _get_source_repo regex. * Address non-blocking review feedback on checkpoint CLI sandbox fix - Fix trailing slash regex in checkpoint_handler.py (3 instances) to match the CLI fix already applied — handles URLs like https://github.com/org/repo/ (review item A) - Add missing assertion for source_repo in test_returns_none_on_unexpected_error (review item B) - Tighten owner/repo regex in gateway.py to reject names starting with dots or hyphens, matching GitHub's actual naming rules (review item C) --------- Co-authored-by: egg <egg@localhost> Co-authored-by: egg-reviewer[bot] <261018737+egg-reviewer[bot]@users.noreply.github.com> Co-authored-by: jwbron <8340608+jwbron@users.noreply.github.com>
This was referenced Feb 23, 2026
james-in-a-box bot
pushed a commit
that referenced
this pull request
Feb 25, 2026
- Fix free-text command interception: use /q and /s prefixes instead of bare q/s to avoid swallowing legitimate single-letter answers (issue #4) - Add TOCTOU race condition comment on _resolve_contract_decision documenting the known limitation and future fix path (issue #1) - Add clarifying comment on pipeline_id dual role at call site (issue #5) - Add unit tests for _handle_contract_questions: option-based selection, free-text input, skip (/s), quit (/q), EOFError, KeyboardInterrupt, resolve failure, and literal q/s as valid answers (issues #3/#7) - Add integration test for full [q] → answer → approve flow through the phase gate (issue #7)
jwbron
pushed a commit
that referenced
this pull request
Feb 25, 2026
* Bridge contract HITL decisions to CLI phase gate Contract decisions created by agents via `egg-contract add-decision` were invisible in local mode because the CLI only polled orchestrator decisions. This adds a CLI-level bridge that reads pending decisions directly from the contract JSON file and surfaces them as an interactive [q] option in the phase gate menu. Approving with unanswered questions triggers a warning. Issue: #908 * Address review feedback on contract decision bridge - Fix free-text command interception: use /q and /s prefixes instead of bare q/s to avoid swallowing legitimate single-letter answers (issue #4) - Add TOCTOU race condition comment on _resolve_contract_decision documenting the known limitation and future fix path (issue #1) - Add clarifying comment on pipeline_id dual role at call site (issue #5) - Add unit tests for _handle_contract_questions: option-based selection, free-text input, skip (/s), quit (/q), EOFError, KeyboardInterrupt, resolve failure, and literal q/s as valid answers (issues #3/#7) - Add integration test for full [q] → answer → approve flow through the phase gate (issue #7) * Handle EOF in option-based contract questions, add multi-question test Address non-blocking suggestions from re-review: - Handle _prompt_choice returning "c" on EOF/KeyboardInterrupt in the option-based path of _handle_contract_questions. Previously this would fall through to int("c") and raise ValueError. Now treats "c" as quit. - Add test for EOF during option-based questions to prevent regression. - Add multi-question test (answer one, skip another) to exercise the iteration logic with 2+ pending decisions. --------- Co-authored-by: egg <egg@localhost> Co-authored-by: egg-reviewer[bot] <261018737+egg-reviewer[bot]@users.noreply.github.com>
This was referenced Feb 26, 2026
james-in-a-box bot
pushed a commit
that referenced
this pull request
Feb 26, 2026
- Fix operator precedence bug in test assertion (issue #1) - Restore auto_create_pr field as deprecated for backwards compat (issue #2) - Remove dead local-mode PR phase prompt code (issue #3) - Set work_started_at on phase_execution in auto-PR path (issue #4) - Remove unused mock handler variables (issue #5) - Document create_pr error contract difference from other temp-session methods (issue #7) - Upgrade pre-PR push failure log level from WARNING to ERROR (issue #8)
jwbron
pushed a commit
that referenced
this pull request
Feb 26, 2026
* Auto-create PR in orchestrator, skip agent spawn for PR phase * Fix checks: apply automated formatting fixes * Remove auto_create_pr opt-out, always auto-create PR in orchestrator * Fix lint: remove unused PipelineConfig import * Address review feedback on auto-PR creation - Fix operator precedence bug in test assertion (issue #1) - Restore auto_create_pr field as deprecated for backwards compat (issue #2) - Remove dead local-mode PR phase prompt code (issue #3) - Set work_started_at on phase_execution in auto-PR path (issue #4) - Remove unused mock handler variables (issue #5) - Document create_pr error contract difference from other temp-session methods (issue #7) - Upgrade pre-PR push failure log level from WARNING to ERROR (issue #8) --------- Co-authored-by: egg <egg@localhost> Co-authored-by: james-in-a-box[bot] <246424927+james-in-a-box[bot]@users.noreply.github.com> Co-authored-by: egg-reviewer[bot] <261018737+egg-reviewer[bot]@users.noreply.github.com>
james-in-a-box bot
pushed a commit
that referenced
this pull request
Feb 26, 2026
- Fix [v] hint missing from prompt strings in feedback handler's free-text and structured review loops (Bug #1) - Move [v] display in free-text feedback to after empty-input check where _prompt_choice is used, not before _prompt_text (UX #2) - Add pipeline API fallback to resolve_phase_draft() so the CLI watch loop and handle_hitl_checkpoint use the same phase resolution logic, preventing silent pager skips for unknown-phase decisions (#3) - Refactor handle_hitl_checkpoint to use resolve_phase_draft() internally, eliminating duplicate draft resolution (#4) - Add watch_pipeline integration tests covering draft pager deduplication, phase_gate exclusion, no-content case, and client passthrough (#5)
jwbron
pushed a commit
that referenced
this pull request
Mar 2, 2026
* Show analysis/plan draft before HITL choice/feedback decisions When a pipeline has multiple pending decisions (agent questions + phase_gate), the user sees choice/feedback questions before the phase_gate. Previously, only phase_gate decisions showed the analysis/plan document, leaving the user without context when answering agent questions. Changes: - Add resolve_phase_draft() public helper to extract draft resolution logic - Add [v] View full document option to _handle_choice() and _handle_feedback() - Pass draft_content from handle_hitl_checkpoint to choice/feedback handlers - In sdlc_cli.py watch_pipeline(), show draft in pager once before the first non-phase_gate decision so the human has context for agent questions - Add 150->150+19 new tests covering all new behavior * Fix checks: apply automated formatting fixes * Address review feedback: fix prompt hints, UX, and draft resolution - Fix [v] hint missing from prompt strings in feedback handler's free-text and structured review loops (Bug #1) - Move [v] display in free-text feedback to after empty-input check where _prompt_choice is used, not before _prompt_text (UX #2) - Add pipeline API fallback to resolve_phase_draft() so the CLI watch loop and handle_hitl_checkpoint use the same phase resolution logic, preventing silent pager skips for unknown-phase decisions (#3) - Refactor handle_hitl_checkpoint to use resolve_phase_draft() internally, eliminating duplicate draft resolution (#4) - Add watch_pipeline integration tests covering draft pager deduplication, phase_gate exclusion, no-content case, and client passthrough (#5) * Remove dead test helpers _make_sse_events and _make_stream --------- Co-authored-by: egg <egg@localhost> Co-authored-by: egg-reviewer[bot] <261018737+egg-reviewer[bot]@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase 3 of the egg extraction: extract container Dockerfiles and scripts.
Changes
Gateway container (
container/gateway/):Sandbox container (
container/sandbox/):Parameterization
/home/jib/home/sandboxjib-gatewayegg-gatewayJIB_*env varsEGG_*env varsjib-isolatednetworkegg-isolatednetworkValidation Plan
Authored-by: jib