Add egg-orchestrator for local SDLC pipeline execution by jwbron · Pull Request #524 · jwbron/egg

jwbron · 2026-02-11T04:29:24Z

Summary

Introduces egg-orchestrator, a new Docker container service that manages the
full SDLC pipeline lifecycle locally, replacing GitHub Actions orchestration.
The orchestrator stores state in git (consistent with contracts), manages
sandbox container lifecycle via Docker API, supports multi-agent parallel
execution, and integrates HITL decisions through Claude Code sessions.

This latest commit adds 14 integration tests (all passing) and fixes three
sandbox container spawning bugs that caused immediate failure.

Bug fixes

The orchestrator's _run_pipeline() was spawning sandbox containers that
immediately failed for three reasons:

Missing repo volume mount — Docker socket operates on the host daemon,
so volume mounts need host paths, not container paths. Added
EGG_HOST_REPOS_DIR env var to pass the host-side repo path.
Missing GATEWAY_URL — Sandbox entrypoint health check defaulted to
http://egg-gateway:9848 (DNS name) which doesn't resolve from a container
on the isolated network. Now passes the IP-based URL from the orchestrator.
No command passed — Without explicit args, the sandbox entrypoint
launched interactive Claude Code (no TTY attached → immediate error). Now
builds a claude --print command with phase-specific prompt matching the
GHA SDLC workflow flags (--verbose, --output-format stream-json,
--max-turns 200).

Additionally: container logs are now captured before cleanup, so failure
messages include the last 10 lines of sandbox output for debugging.

Integration tests (14 passing)

Test	Verifies
`test_create_local_pipeline`	POST creates pipeline with correct mode/prompt/status
`test_start_local_pipeline_completes`	Full lifecycle: create → start → poll → complete
`test_local_pipeline_no_pr_phase`	Local = 3 phases (refine/plan/implement), no PR
`test_local_pipeline_container_failure`	FORCE_FAIL prompt → exit 1, pipeline fails
`test_failure_error_includes_container_logs`	Error message contains sandbox log output
`test_sandbox_receives_pipeline_env`	Pipeline env vars passed (exit 2 if missing)
`test_sandbox_receives_gateway_url`	GATEWAY_URL passed (exit 3 if missing)
`test_sandbox_receives_repo_volume`	Repo volume mounted (exit 4 if missing)
`test_start_running_pipeline_returns_409`	Cannot start already-running pipeline
`test_start_completed_pipeline_returns_409`	Cannot start completed pipeline
`test_issue_pipeline_includes_pr_phase`	Issue mode runs 4 phases including PR
`test_pipeline_crud`	Full CRUD lifecycle
`test_gateway_local_mode_blocks_push`	Local session push → 403
`test_failed_pipeline_returns_409`	Failed pipeline → 409 on restart

Known issues (analysis, not yet fixed)

1. Missing `EGG_SESSION_TOKEN` for local-mode containers

Symptom: The sandbox agent hits ERROR: EGG_SESSION_TOKEN not set when
trying to use git operations inside the container.

Root cause: container_spawner.py:200 has if mode != "local" and repos:
which explicitly skips gateway session registration for local pipelines. The
original rationale was "local pipelines don't need gateway access," but:

The sandbox's git/gh wrapper scripts always require EGG_SESSION_TOKEN
The gateway enforces local-mode restrictions (push blocking) at the
session level — without a session, there's no enforcement
Even for local mode, the sandbox needs the session for worktree setup

Fix: Remove the mode != "local" guard and always register sessions.
For local pipelines without an explicit repo, detect the repo from the
mounted volume or pass a default. The gateway already handles mode="local"
sessions correctly (blocks push, allows read).

2. OOM kill on memory-constrained systems (exit code 137)

Symptom: Sandbox container exits with code 137, empty logs, container
already gone before log capture.

Root cause: Exit code 137 = SIGKILL (128 + 9), typically from the kernel
OOM killer. Spawning a full sandbox container (~400MB for Claude Code) on a
system that's already under memory pressure triggers OOM. Observed on a 32GB
system with ~1GB free RAM and swap nearly exhausted (Firefox was consuming
~15GB).

Analysis: The sandbox entrypoint runs 11 setup phases before launching
Claude Code. Most are necessary even for local mode (worktree isolation
prevents stomping on the user's working tree). Reducing peak memory without
skipping critical phases requires investigation into:

OOM detection: Check Docker's OOMKilled flag after exit code 137 for
a clear diagnostic message instead of mysterious "Container exited with
code 137"
Pre-flight memory check: Before spawning, verify available memory and
fail early with actionable guidance
Entrypoint optimization: Identify which setup phases can be deferred or
made lighter without compromising isolation guarantees

Closes: #496

james-in-a-box

No agent-mode design concerns.

This PR introduces the egg-orchestrator service which manages SDLC pipeline lifecycle locally. From an agent-mode design perspective, the architecture aligns well with the principles in docs/guides/agent-mode-design.md:

Good patterns observed:

Agent autonomy: Sandboxes are spawned with context (pipeline ID, issue number, agent role) but agents fetch their own data through tools rather than receiving pre-baked diffs or logs
No structured output requirements: The orchestrator communicates with agents via simple signals (completion, failure) rather than requiring parsed JSON output
Direct action: Agents post reviews, create PRs, and commit code directly via gateway—no post-processing pipelines parsing agent output
Objectives over procedures: The /sdlc skill provides the "what" (start a pipeline for issue N) and lets the orchestrator handle orchestration while agents handle execution

Architecture notes:

The SandboxConfig passes lightweight metadata (session tokens, role, issue number) which helps orient agents without constraining them
Container spawning provides environment setup (proxy, gateway URLs) rather than pre-fetched content
The decision queue for HITL provides objectives ("approve this draft") rather than procedural instructions

The design keeps orchestration logic in the orchestrator (where it belongs) and lets agents operate autonomously within their sandboxes.

— Authored by egg

james-in-a-box

Code Review: PR #524 - Egg Orchestrator

This is a comprehensive review of the orchestrator implementation. I've analyzed all files systematically for security vulnerabilities, correctness issues, race conditions, and design problems.

🔴 Critical Security Issues

1. Container ID Injection Vulnerability in `docker_client.py`

Location: orchestrator/docker_client.py - logs(), stop(), remove(), and other methods

The container ID is passed directly to Docker API calls without validation. An attacker who can control the container ID could potentially manipulate Docker operations.

def logs(self, container_id: str, ...) -> str:
    container = self.client.containers.get(container_id)  # No validation

Fix: Add container ID format validation before API calls:

def _validate_container_id(self, container_id: str) -> None:
    if not re.match(r'^[a-zA-Z0-9][a-zA-Z0-9_.-]*$', container_id):
        raise ValueError(f"Invalid container ID format: {container_id}")

2. Path Traversal Risk in `state_store.py`

Location: orchestrator/state_store.py - _pipeline_path() method

Pipeline IDs are used directly to construct file paths without proper sanitization:

def _pipeline_path(self, pipeline_id: str) -> Path:
    return self.pipelines_dir / f"{pipeline_id}.json"

If pipeline_id contains ../, it could write files outside the intended directory.

Fix: Validate and sanitize pipeline IDs:

def _pipeline_path(self, pipeline_id: str) -> Path:
    if not re.match(r'^issue-[0-9]+$', pipeline_id):
        raise ValueError(f"Invalid pipeline ID format: {pipeline_id}")
    path = self.pipelines_dir / f"{pipeline_id}.json"
    # Ensure resolved path is still under pipelines_dir
    if not path.resolve().is_relative_to(self.pipelines_dir.resolve()):
        raise ValueError("Path traversal detected")
    return path

3. Missing Authentication on API Endpoints

Location: All files in orchestrator/routes/

None of the API endpoints implement authentication or authorization. Any process with network access to the orchestrator can:

Create/delete pipelines
Spawn containers
Resolve HITL decisions
Modify pipeline state

Fix: Add authentication middleware using gateway session tokens:

def require_auth(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        token = request.headers.get('X-Session-Token')
        if not token or not validate_session_token(token):
            return jsonify({"error": "Unauthorized"}), 401
        return f(*args, **kwargs)
    return decorated

4. Command Injection in Webhook Handler

Location: orchestrator/webhooks.py - Line ~13450

The issue event handler directly uses issue_number from the webhook payload without validation:

pipeline_id=f"issue-{issue_number}"

While not immediately exploitable, a malformed issue number from a spoofed webhook could cause issues.

Fix: Validate issue_number is an integer before use.

🟠 Correctness Issues

5. Race Condition in Container Spawning

Location: orchestrator/container_spawner.py - spawn_container() method

The method reads state, creates container, then updates state without any locking mechanism. Concurrent spawn requests could result in:

Duplicate containers
Lost state updates
Inconsistent pipeline state

Fix: Add a locking mechanism:

_spawn_locks: dict[str, threading.Lock] = {}

def spawn_container(self, ...):
    lock_key = f"{pipeline_id}:{agent_role}"
    if lock_key not in _spawn_locks:
        _spawn_locks[lock_key] = threading.Lock()
    
    with _spawn_locks[lock_key]:
        # existing spawn logic

6. Race Condition in Phase Transitions

Location: orchestrator/routes/phases.py - advance_phase() endpoint

Multiple concurrent requests to advance phases could result in skipped phases or invalid transitions:

@phases_bp.route("/<pipeline_id>/phases/advance", methods=["POST"])
def advance_phase(pipeline_id: str):
    pipeline = store.load_pipeline(pipeline_id)
    # No lock between load and save
    pipeline.current_phase = next_phase
    store.save_pipeline(pipeline)

Fix: Implement optimistic locking using a version field on the Pipeline model.

7. Missing Error Handling for Docker Connection Failures

Location: orchestrator/docker_client.py - __init__()

The DockerClient constructor doesn't handle connection failures gracefully:

def __init__(self, ...):
    self.client = docker.from_env()  # Can raise DockerException

Fix: Wrap in try/except and provide meaningful error message.

8. Histogram Bucket Counting Bug

Location: orchestrator/metrics.py - Histogram.observe() method

The bucket counting logic appears to count values into ALL buckets where value <= bucket rather than just the appropriate bucket:

for bucket in self.buckets:
    if value <= bucket:
        self._bucket_counts[bucket] += 1  # Increments multiple buckets

This is actually the correct cumulative behavior for Prometheus histograms, but the get_bucket_counts() returns cumulative counts which may confuse users expecting per-bucket counts. Consider adding documentation to clarify.

9. Git Commit Can Fail Silently

Location: orchestrator/state_store.py - _commit()

If git operations fail, the error is logged but not raised, which could lead to data loss:

def _commit(self, message: str) -> None:
    try:
        self._run_git(["add", ...])
        self._run_git(["commit", "-m", message])
    except Exception as e:
        logger.error("Git commit failed", error=str(e))
        # Error is swallowed!

Fix: Re-raise the exception or implement a retry mechanism with eventual consistency guarantees.

🟡 Robustness Issues

10. Missing Request Validation in All Routes

Location: All routes in orchestrator/routes/

Most endpoints don't validate required fields:

@pipelines_bp.route("/", methods=["POST"])
def create_pipeline():
    data = request.get_json()
    issue_number = data.get("issue_number")  # Could be None
    repo = data.get("repo")  # Could be None

Fix: Add Pydantic request validation:

class CreatePipelineRequest(BaseModel):
    issue_number: int
    repo: str
    branch: str

11. Timeout Not Enforced on Container Operations

Location: orchestrator/docker_client.py - multiple methods

Docker operations can hang indefinitely if a container becomes unresponsive:

def logs(self, container_id: str, tail: int = 100) -> str:
    container = self.client.containers.get(container_id)
    return container.logs(tail=tail).decode()  # No timeout

Fix: Add timeout parameters to Docker API calls.

12. Circuit Breaker Has No Persistence

Location: orchestrator/resilience.py - CircuitBreaker class

Circuit breaker state is stored in-memory. If the orchestrator restarts, all circuits reset to CLOSED state, potentially causing cascading failures.

Fix: Persist circuit breaker state to Redis or the state store.

13. HITL Decision Timeout Not Actively Checked

Location: orchestrator/routes/decisions.py

The timeout is stored on decisions but there's no background task to check for and handle timeouts:

@decisions_bp.route("/<pipeline_id>/decisions/check-timeouts", methods=["POST"])
def check_timeouts(pipeline_id: str):

This endpoint exists but requires manual invocation.

Fix: Add a background scheduler (APScheduler or similar) to check timeouts periodically.

14. Multi-Agent Executor Missing Error Recovery

Location: orchestrator/multi_agent.py - MultiAgentExecutor

If one agent in a wave fails, there's no clear recovery strategy:

async def execute_wave(self, wave: AgentWave) -> list[AgentExecution]:
    results = await asyncio.gather(*tasks, return_exceptions=True)

Exceptions are captured but not properly handled - the wave continues without addressing failures.

Fix: Implement proper failure handling with configurable strategies (fail-fast, continue, retry).

🔵 Design Issues

15. Inconsistent Error Response Format

Location: All routes

Some endpoints return {"error": "message"} while others return {"success": false, "message": "..."}:

# containers.py
return jsonify({"error": "Container not found"}), 404

# pipelines.py  
return jsonify({"success": False, "message": "Pipeline not found"}), 404

Fix: Standardize on a single error response format across all endpoints.

16. Circular Import Risk

Location: orchestrator/webhooks.py

The module imports from events inside handler functions to avoid circular imports:

def _handle_issue_event(payload):
    from events import EventType, emit_event  # Late import

This is a code smell indicating the module structure may need refactoring.

Fix: Consider a more modular event system with proper dependency injection.

17. Hardcoded Network Configuration

Location: orchestrator/gateway_client.py and orchestrator/sandbox_template.py

Network CIDRs and IPs are hardcoded:

SANDBOX_NETWORK = "172.32.0.0/24"

Fix: Move to configuration file or environment variables.

18. Missing Structured Logging Context

Location: Multiple files

Many log statements don't include relevant context:

logger.info("Container spawned")  # Missing container_id, pipeline_id

Fix: Always include relevant identifiers in log statements.

📝 Test Coverage Gaps

19. No Integration Tests

The tests are all unit tests with mocked dependencies. There are no integration tests that verify:

End-to-end pipeline execution
Docker container lifecycle
Gateway integration
State persistence under concurrent access

20. No Load/Stress Tests

No tests verify behavior under concurrent load or when resources are constrained.

21. Tests Don't Cover Error Paths

Many error scenarios are untested:

Docker daemon unavailable
Git repository corrupted
Network failures to gateway
Disk full conditions

✅ Positive Notes

The Pydantic models are well-structured with proper validation
The event system design is clean and extensible
Circuit breaker pattern is correctly implemented
Webhook signature verification is properly done with constant-time comparison
Test fixtures are well organized

Summary

Severity	Count
Critical Security	4
Correctness	5
Robustness	5
Design	4
Test Gaps	3

Recommendation: Request changes. The security issues (container ID validation, path traversal, missing auth) must be addressed before merge. The race conditions should also be fixed as they could cause data corruption in production.

— Authored by egg

jwbron · 2026-02-11T19:47:17Z

@james-in-a-box address this feedback: #524 (review)

james-in-a-box · 2026-02-11T19:55:21Z

Addressed Review Feedback

I've pushed changes (7671581) addressing the critical security issues and race conditions from the review:

✅ Fixed: Critical Security Issues

1. Container ID Injection Vulnerability (docker_client.py)

Added _validate_container_id() function with regex pattern validation
Pattern allows: 12-64 char hex IDs or valid container names (alphanumeric, underscore, hyphen, period)
Added validation to all methods accepting container_id: start_container, stop_container, remove_container, get_container_info, get_container_logs, wait_for_container
New InvalidContainerIdError exception for invalid formats

2. Path Traversal Risk (state_store.py)

Added _validate_pipeline_id() function enforcing issue-{number} format
Added secondary check ensuring resolved path stays within pipelines_dir
New InvalidPipelineIdError exception for invalid formats or traversal attempts

✅ Fixed: Race Conditions

5. Race Condition in Phase Transitions (routes/phases.py)

Added version field to Pipeline model for optimistic locking
Modified save_pipeline() to accept expected_version parameter
Added VersionConflictError exception (returns 409 Conflict)
Updated all phase mutation endpoints (advance_phase, start_phase, complete_phase, fail_phase) to use optimistic locking

✅ Already Addressed: Docker Connection Failures

7. Docker Connection Error Handling (docker_client.py)

Verified: The __init__ method already wraps docker.from_env() in try/except and raises DockerClientError with a meaningful message (line 79-85)

Route Updates

All route files updated with proper exception handling for the new error types:

routes/containers.py: Handles InvalidContainerIdError (400)
routes/pipelines.py: Handles InvalidPipelineIdError (400)
routes/phases.py: Handles InvalidPipelineIdError (400) and VersionConflictError (409)
routes/decisions.py: Handles InvalidPipelineIdError (400)

Remaining Items (not addressed in this commit)

The following items from the review are noted but not yet implemented:

Phase 1.5: Documentation extraction #3 Missing Authentication: Requires architectural decision on auth approach
Phase 2: Gateway module extraction #4 Container Spawner Race Condition: Similar locking needed
ci: bump hadolint/hadolint-action from 3.1.0 to 3.3.0 #9 Silent Git Commit Failures: Needs retry mechanism design
ci: bump actions/checkout from 4 to 6 #10-18 Request Validation, Error Standardization, etc.: Design consistency items

I recommend addressing the authentication requirement (#3) as a follow-up, since it requires decisions on the auth mechanism and integration with the gateway session tokens.

— Authored by egg

jwbron · 2026-02-11T20:03:35Z

@james-in-a-box pull main into this branch

james-in-a-box · 2026-02-11T20:07:07Z

Merged main into this branch

Successfully pulled main into egg/issue-496 (commit 9301879).

Changes merged from main:

Fix reviewer workflow validation failure (Fix reviewer bot: workflow validation + gateway wiring #532)
Added separate reviewer account for bot reviews (Add separate reviewer account for bot reviews #515)
Various workflow and gateway updates

No merge conflicts occurred. The branch is now up to date with main.

— Authored by egg

Superseded by new review

james-in-a-box

No agent-mode design concerns with the delta since my last review.

The changes since commit a1c45ddd address security and race condition issues identified in the base code review, and they do so in a way that aligns well with agent-mode design principles:

Good pattern: Infrastructure-level constraints

Container ID validation and pipeline ID validation are enforced in docker_client.py and state_store.py, not via prompt instructions
Optimistic locking with version conflicts is handled at the state store level, returning proper 409 errors
Path traversal protection uses resolved path validation, not trust in agent behavior

This follows the core principle from the design guide: "The sandbox is the constraint." These security fixes extend the infrastructure boundary rather than relying on prompt-level restrictions that agents could ignore.

No changes to agent prompts, pre-fetching patterns, or output format requirements were introduced in this delta.

— Authored by egg

james-in-a-box

Re-Review: PR #524 - Egg Orchestrator

This re-review examines changes made since commit a1c45ddd.

✅ Issues Properly Addressed

1. Container ID Injection Vulnerability (docker_client.py)

Fixed correctly. The _validate_container_id() function validates container IDs using an appropriate regex pattern that allows:

12-64 character hex IDs (short/full Docker IDs)
Valid container names (alphanumeric, underscore, hyphen, period - not starting with hyphen/period)

The validation is applied to all relevant methods: start_container, stop_container, remove_container, get_container_info, get_container_logs, wait_for_container.

2. Path Traversal Risk (state_store.py)

Fixed correctly. The implementation adds:

_validate_pipeline_id() with strict ^issue-[0-9]+$ pattern enforcement
Secondary is_relative_to() check to catch any bypass attempts
Appropriate InvalidPipelineIdError exception

3. Optimistic Locking for Phase Transitions (routes/phases.py)

Fixed correctly. All phase mutation endpoints (advance_phase, start_phase, complete_phase, fail_phase) now:

Capture original_version before modifications
Pass expected_version to save_pipeline()
Handle VersionConflictError with 409 Conflict response

4. Version Field on Pipeline Model (models.py)

Fixed correctly. The version field is added with default=1, ge=1 constraint and proper documentation.

🟠 Remaining Issues from Previous Review (Not Addressed)

These were noted as "not addressed in this commit" in the previous response, which is acceptable, but they should still be tracked:

5. Container Spawner Race Condition (container_spawner.py)

Not addressed. The spawn_agent_container method still has no locking mechanism. Concurrent calls for the same pipeline/role could create duplicate containers. This is a lower priority than the fixed issues, but should be addressed before high-concurrency use.

6. Silent Git Commit Failures (state_store.py:224)

Partially addressed. The _commit_state method now raises GitOperationError if git commands fail, which addresses the concern. However, _run_git catches CalledProcessError and re-raises as GitOperationError, which is appropriate.

🔴 New Issue Found

7. Missing Pipeline ID Validation in `routes/signals.py`

The signal routes do not validate pipeline IDs before using them. While get_state_store eventually validates in _get_pipeline_path, the error handling doesn't catch InvalidPipelineIdError:

# signals.py:169 - only catches PipelineNotFoundError, not InvalidPipelineIdError
try:
    store = get_state_store(repo_path)
    pipeline = store.load_pipeline(pipeline_id)
    ...
except PipelineNotFoundError:
    return make_error_response(...)

Fix: Add InvalidPipelineIdError to the exception handlers in handle_complete_signal, handle_error_signal, and the batch handler.

🟡 Design Observations

8. Inconsistent Error Response Handling

Routes containers.py, phases.py, pipelines.py, and decisions.py now properly handle InvalidPipelineIdError. However, signals.py was missed. The pattern should be consistently applied across all routes.

9. Route files `containers.py`, `pipelines.py`, `phases.py`, `decisions.py`

All properly import and handle the new exception types. Good work on consistency.

Summary

Category	Status
Container ID Validation	✅ Fixed
Path Traversal	✅ Fixed
Optimistic Locking	✅ Fixed
Pipeline Model Version	✅ Fixed
Container Spawner Race	⚠️ Not addressed (acceptable deferral)
Signals.py Error Handling	🔴 New issue - must fix

Recommendation: Request changes. The signals.py error handling gap needs to be addressed since it's a simple fix (adding one import and one exception handler to each signal handler) and follows the same pattern already applied to other routes.

— Authored by egg

Superseded by new review

james-in-a-box

No agent-mode design concerns with the delta since my last review.

The changes since commit 930187936f02a8f123bc5f63ab782be3fd7f9f1e include a merge from main and security fixes. From an agent-mode design perspective, these changes actually strengthen alignment with the design guidelines:

Good pattern: Infrastructure-level constraints
The security fixes follow the core principle "The sandbox is the constraint":

Container ID validation in docker_client.py prevents injection at the API layer
Pipeline ID validation in state_store.py prevents path traversal at the persistence layer
Optimistic locking with version field handles concurrency at the state management layer

These are infrastructure-level constraints, not prompt-level restrictions the agent could ignore.

No changes to agent prompts or context patterns
The orchestrator continues to pass only lightweight metadata to spawned containers:

Pipeline ID, issue number, agent role (orienting context)
Session tokens, proxy URLs (infrastructure credentials)
No pre-fetched diffs, logs, or file contents

The agents remain free to fetch what they need through their tools.

— Authored by egg

james-in-a-box

Re-review: Security and Race Condition Fixes (Delta since `9301879`)

I've reviewed the changes since commit 9301879. The commit f1254c1 addresses several issues from my previous review.

✅ Issues Properly Addressed

1. Container ID Validation (docker_client.py)

The validation is well-implemented:

Pattern ^[a-fA-F0-9]{12,64}$|^[a-zA-Z0-9][a-zA-Z0-9_.-]*$ correctly validates both hex IDs (12-64 chars) and container names
Validation is applied to start_container, stop_container, remove_container, get_container_info, get_container_logs, and wait_for_container
New InvalidContainerIdError exception properly exposed

2. Path Traversal Prevention (state_store.py)

Properly implemented:

Pattern ^issue-[0-9]+$ restricts pipeline IDs to expected format
Secondary is_relative_to() check ensures resolved path stays within pipelines_dir
New InvalidPipelineIdError exception

3. Optimistic Locking (state_store.py, routes/phases.py)

Implemented correctly:

version field added to Pipeline model with default=1
save_pipeline() accepts expected_version parameter
VersionConflictError raised on version mismatch
All phase mutation endpoints (advance_phase, start_phase, complete_phase, fail_phase) use optimistic locking

4. Route Error Handling

New exceptions properly caught in routes:

routes/containers.py: Handles InvalidContainerIdError (400)
routes/pipelines.py: Handles InvalidPipelineIdError (400)
routes/phases.py: Handles InvalidPipelineIdError (400) and VersionConflictError (409)
routes/decisions.py: Handles InvalidPipelineIdError (400)

🟠 Remaining Issues

1. Missing `InvalidContainerIdError` Handling in Health Check Endpoint

Location: orchestrator/routes/containers.py:426-447

The /health endpoint calls monitor.check_container_health(container_id), which in turn calls docker_client.get_container_info(container_id). This now raises InvalidContainerIdError, but the route doesn't catch it:

@containers_bp.route("/<pipeline_id>/containers/<container_id>/health", methods=["GET"])
def check_container_health(pipeline_id: str, container_id: str) -> tuple[Response, int]:
    monitor = get_container_monitor()
    health = monitor.check_container_health(container_id)  # Can raise InvalidContainerIdError
    return make_success_response("Health checked", data=health)

Fix: Add the same InvalidContainerIdError handler as in other endpoints, or catch it in ContainerMonitor.check_container_health().

2. No Tests for Security Validation Functions

The new validation functions and exceptions (_validate_container_id, _validate_pipeline_id, InvalidContainerIdError, InvalidPipelineIdError, VersionConflictError) have no corresponding tests. The test files (test_docker_client.py, test_state_store.py) don't test:

Invalid container ID formats being rejected
Path traversal attempts being blocked
Optimistic locking version conflicts
Edge cases in the regex patterns

This is a test coverage gap for security-critical code.

3. Container Spawner Race Condition Not Addressed

From my original review (issue #5): The race condition in container spawning (container_spawner.py) was noted but not fixed. The response indicated this would be addressed separately, but it's still a concern for concurrent spawn requests.

📋 Summary

Item	Status
Container ID injection fix	✅ Fixed
Path traversal fix	✅ Fixed
Optimistic locking	✅ Fixed
Route exception handling	⚠️ Missing in health check
Test coverage for security fixes	❌ Missing
Container spawner race condition	❌ Not addressed (noted for follow-up)

Recommendation: The health check endpoint bug should be fixed before merge. The test coverage gap is concerning for security code but can be addressed in a follow-up. The container spawner race condition was acknowledged as a separate item.

— Authored by egg

Add implementation plan for issue #496 introducing egg-orchestrator, a dedicated service for local SDLC pipeline execution. Authored-by: egg

james-in-a-box

Contract Verification Re-Review

This is an incremental contract verification for changes since commit a841c27f8b0269e8fa14777df389b7ccfba3b396 (the merge from main).

Delta Since Last Review

One commit has been added:

0945d0c — "Guard start_pipeline against AWAITING_HUMAN and fix worktree cleanup safety"

This commit makes three targeted improvements:

orchestrator/routes/pipelines.py: Added guard against starting pipelines in AWAITING_HUMAN state, returning 409 Conflict. This prevents race conditions where the orchestrator might try to start a pipeline while waiting for human approval at a HITL gate.
sandbox/entrypoint.py: Added working tree validation logic that detects empty worktrees (containing only .git) and re-populates them via git checkout HEAD -- .. This fixes a startup failure mode where containers could start with empty workspaces.
gateway/gateway.py: Added safety net to query Docker containers directly (get_active_docker_containers()) during startup cleanup, ensuring orphaned worktrees are cleaned even when session state is stale.

Contract Compliance Verification

No contract violations found in the delta. The changes are defensive bug fixes that improve resilience without changing the contract scope.

Change	Impact on Contract
AWAITING_HUMAN guard	Improves ac-12 (phase transition validation)
Worktree re-population	Improves sandbox reliability (no AC change)
Docker container safety net	Improves ac-8 (orphaned container cleanup)

Acceptance Criteria Status Summary

Based on previous contract reviews and my verification:

Status	Count	Notes
✅ Verified	23	Core orchestrator functionality implemented
❌ NOT VERIFIED	10	Missing files/features documented below

NOT VERIFIED Criteria (unchanged from previous review)

These acceptance criteria remain unverified due to missing implementations:

ac-17 (HITL sandbox mode): File sandbox/egg_lib/hitl_mode.py does not exist
ac-21 (Gateway orchestrator callback): File gateway/orchestrator_api.py does not exist
ac-24 (Gateway health includes orchestrator status): No gateway health modification detected
ac-25 (egg-deploy status shows both services): Partial — needs verification
ac-27 (sandbox orchestrator_client.py): File sandbox/egg_lib/orchestrator_client.py does not exist
ac-28 (Sandbox reports completion to orchestrator): Not implemented in sandbox entrypoint
ac-29 (egg_orchestrator package in containers): Partial — needs installation verification
ac-32 (Remote orchestrator mode handling): Partial — needs verification
ac-33 (Architecture docs for deployment modes): Documentation missing

Recommendation

Comment — The delta since last review contains only defensive bug fixes with no contract violations. The same 10 acceptance criteria that were NOT VERIFIED in the previous review remain unverified.

The implementation of core orchestrator functionality is substantial and working (23/33 criteria met). The missing items are primarily:

Sandbox-side orchestrator integration (hitl_mode.py, orchestrator_client.py)
Gateway-orchestrator callback endpoint
Documentation for deployment modes

These could be addressed as follow-up work if desired.

— Authored by egg

james-in-a-box · 2026-02-12T07:12:56Z

egg contract-verification completed. View run logs

21 previous review(s) hidden.

james-in-a-box

Re-Review: PR #524 - Egg Orchestrator

This re-review examines the delta since commit a841c27f8b0269e8fa14777df389b7ccfba3b396.

✅ New Changes Reviewed

1. Guard `start_pipeline` Against `AWAITING_HUMAN` Status

Location: orchestrator/routes/pipelines.py:1931-1935

if pipeline.status == PipelineStatus.AWAITING_HUMAN:
    return make_error_response(
        f"Pipeline {pipeline_id} is awaiting human approval",
        status_code=409,
    )

Correct implementation. This prevents the race condition where an external caller could try to restart a pipeline while it's blocked waiting for human approval at a HITL gate. The pipeline remains in AWAITING_HUMAN status until the decision is resolved, then transitions back to RUNNING internally (lines 1860-1861).

2. Worktree Cleanup Safety

The worktree cleanup safety is properly ensured by validate_identifier() in gateway/worktree_manager.py:72-91:

def validate_identifier(value: str, name: str) -> None:
    if not value:
        raise ValueError(f"Invalid {name}: cannot be empty")
    if ".." in value:
        raise ValueError(f"Invalid {name}: path traversal not allowed")
    if not re.match(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*$", value):
        raise ValueError(f"Invalid {name}: must be alphanumeric with ._- allowed")

This validation is called in remove_worktree() (lines 430-434) and create_worktree() (lines 144-148) before any filesystem operations, preventing path traversal attacks via container_id or repo_name.

✅ Previously Identified Issues: Verified Fixed

All blocking issues from prior reviews remain properly addressed:

Item	Status	Location
Container ID validation	✅ Fixed	`docker_client.py:97-121`
Pipeline ID validation	✅ Fixed	`state_store.py:67-89`
Optimistic locking	✅ Fixed	`state_store.py`, `routes/phases.py`
Exception handling in routes	✅ Fixed	All route files
`.env` file permissions	✅ Fixed	`compose.py:159-160`
Gateway API endpoint paths	✅ Fixed	`gateway_client.py`
Container spawner environment	✅ Fixed	Pre-registers session at lines 202-231
`delete_session_by_container`	✅ Fixed	Uses `DELETE /api/v1/sessions/by-container/{id}`

⚠️ Minor Advisory (Not Blocking)

Test Coverage for AWAITING_HUMAN Guard

The new AWAITING_HUMAN guard at line 1931 does not have an explicit integration test. While the existing TestHITLGate tests cover the HITL gate flow and TestPipelineStartIdempotency covers re-starting running/completed pipelines, there's no test that specifically verifies:

Creating a pipeline
Starting it
Letting it reach AWAITING_HUMAN status
Attempting to start it again via API
Asserting 409 is returned

This would be a straightforward addition to the existing integration test suite but is not blocking.

Summary

Item	Status
`start_pipeline` AWAITING_HUMAN guard	✅ Correctly implemented
Worktree cleanup identifier validation	✅ Correctly implemented
Previous blocking issues	✅ All fixed
Test coverage for AWAITING_HUMAN guard	⚠️ Minor gap (not blocking)

Recommendation: Approve. The new changes correctly guard against starting pipelines in AWAITING_HUMAN state and the worktree cleanup safety is properly enforced via identifier validation. All blocking issues from previous reviews remain properly fixed.

— Authored by egg

james-in-a-box · 2026-02-12T07:13:41Z

egg review completed. View run logs

21 previous review(s) hidden.

Docker creates the container bind mount before the gateway creates the git worktree. Docker's tmpfs mount over .git creates a .git DIRECTORY on the host. When create_worktree() later runs git worktree add, it fails because .git is a directory (git worktrees need .git to be a FILE containing a gitdir: pointer). After the existing shutil.rmtree (which silently fails on active mount points), explicitly check for and remove a leftover .git directory so git worktree add can create its .git file.

james-in-a-box

No agent-mode design concerns.

Re-review summary: The branch was rebased since my last review at commit a1c45dd. The new commit dba968c adds a worktree creation fix (removing pre-existing .git directories that Docker creates before the gateway can set up the git worktree).

This change is purely infrastructure-level—it fixes container mount timing, not agent interaction patterns. My previous assessment still applies: the orchestrator passes lightweight metadata (pipeline ID, issue number, agent role) to orient agents without constraining them. Agents fetch their own data through tools rather than receiving pre-baked content.

— Authored by egg

james-in-a-box · 2026-02-12T07:25:54Z

egg agent-mode-design completed. View run logs

21 previous review(s) hidden.

Verify that POST /start returns 409 when a pipeline is in awaiting_human status (paused at a HITL gate). The test creates a pipeline with hitl_gates=True, waits for it to reach awaiting_human, attempts a second start (expects 409), then resolves the gates so the pipeline completes cleanly.

james-in-a-box

Re-Review: PR #524 - Egg Orchestrator

This re-review examines changes since commit 0945d0cc9421685f578be4fc91629014700262de.

✅ All Previously Identified Issues Resolved

1. Container Spawner Gateway Environment Bug

Fixed correctly. The spawn_agent_container() method now registers the gateway session and merges gateway environment variables BEFORE container creation (lines 202-225 in container_spawner.py):

# Lines 215-225: Gateway env added BEFORE container creation
gateway_env = self.gateway.get_container_env(
    session_token=session_info.session_token,
    ...
)
# Add gateway environment to container env BEFORE creation
env.update(gateway_env)

The container is then created at line 243 with the complete environment including EGG_SESSION_TOKEN, GATEWAY_URL, and proxy configuration.

2. signals.py InvalidPipelineIdError Handling

Fixed correctly. InvalidPipelineIdError is imported at line 37 and properly handled in both handle_complete_signal (lines 194-198) and handle_error_signal (lines 301-305).

3. containers.py Health Check InvalidContainerIdError Handling

Fixed correctly. The check_container_health endpoint (lines 426-454) now catches InvalidContainerIdError and returns 400.

4. compose.py .env File Permissions

Fixed correctly. Line 160 sets 0o600 permissions on the generated .env file:

env_file.chmod(0o600)  # Owner read/write only

🔵 New Code Review: `gateway/worktree_manager.py`

This commit adds a worktree manager that handles the Docker/.git directory race condition. The implementation is well-structured:

Security

Path traversal prevention is properly implemented:

validate_identifier() function (lines 79-90) validates container_id and repo_name
Rejects empty values, path traversal (../), and special characters
Pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]*$
Applied consistently before path construction

UID/GID validation:

Lines 157-160 validate uid/gid are non-negative integers before use in subprocess
Prevents injection via the chown command

Correctness

The Docker .git directory race condition fix:

Lines 195-221 detect and handle invalid worktree directories
If a .git directory exists (from Docker's tmpfs), it's removed before creating the actual .git file
This correctly addresses the issue described in the commit message

Test Coverage

Comprehensive tests in test_worktree_manager.py:

TestValidateIdentifier: Path traversal, empty values, special characters
TestWorktreeManagerDockerGitDir: Specifically tests the Docker pre-creation scenario
Cleanup behavior tests

Summary

Issue	Status
Container spawner gateway env	✅ Fixed
signals.py InvalidPipelineIdError	✅ Fixed
containers.py health check	✅ Fixed
compose.py .env permissions	✅ Fixed
worktree_manager.py security	✅ Properly implemented
worktree_manager.py tests	✅ Adequate coverage

All previously identified blocking issues have been properly addressed. The new worktree manager code is well-implemented with appropriate security validation and test coverage.

Recommendation: Approve.

— Authored by egg

james-in-a-box

Contract Verification Re-Review

This is an incremental contract verification for changes since commit 0945d0cc9421685f578be4fc91629014700262de.

Delta Since Last Review

One commit has been added:

dba968c — "Fix worktree creation when Docker pre-creates .git directory"

This commit addresses an edge case where Docker creates the container bind mount (including a .git directory via tmpfs) before the gateway creates the git worktree. Since git worktree add expects .git to be a file (containing a gitdir: pointer), not a directory, the worktree creation was failing.

Changes:

gateway/worktree_manager.py: +7 lines — After the existing shutil.rmtree() (which silently fails on active mount points), explicitly check for and remove a leftover .git directory
gateway/tests/test_worktree_manager.py: +68 lines — Two new tests covering the bug fix and normal case scenarios

Contract Compliance Verification

No contract violations found. The change is a defensive bug fix that improves worktree creation reliability without changing any contract scope.

Change	Impact on Contract
`.git` directory cleanup	Improves reliability for ac-7 (sandbox configuration)
Test coverage	Regression tests for worktree edge case

Code Review Notes

The fix is correct and targeted:

Only runs when worktree_path.exists() after rmtree (indicating bind mount blocked full removal)
Only removes .git if it exists AND is a directory (won't interfere with valid worktrees)
Uses ignore_errors=True for safety
Tests verify both the fix scenario and normal operation

Acceptance Criteria Status Summary

Status	Count	Notes
✅ Verified	23	Core orchestrator functionality (unchanged)
❌ NOT VERIFIED	10	Missing files/features (unchanged from previous review)

NOT VERIFIED Criteria (unchanged)

These acceptance criteria remain unverified due to missing implementations:

ac-17 — HITL sandbox mode: sandbox/egg_lib/hitl_mode.py does not exist
ac-21 — Gateway orchestrator callback: gateway/orchestrator_api.py does not exist
ac-24 — Gateway health includes orchestrator status: Not implemented
ac-25 — egg-deploy status shows both services: Partial
ac-27 — sandbox orchestrator_client.py: File does not exist
ac-28 — Sandbox reports completion to orchestrator: Not implemented
ac-29 — egg_orchestrator package in containers: Needs verification
ac-32 — Remote orchestrator mode handling: Partial
ac-33 — Architecture docs for deployment modes: Documentation missing

Recommendation

Comment — The delta contains only a targeted bug fix with proper test coverage and no contract violations. The same 10 acceptance criteria that were NOT VERIFIED in the previous review remain unverified. The core orchestrator implementation (23/33 criteria) is working as designed.

— Authored by egg

james-in-a-box · 2026-02-12T07:27:13Z

egg review completed. View run logs

21 previous review(s) hidden.

james-in-a-box · 2026-02-12T07:27:25Z

egg contract-verification completed. View run logs

21 previous review(s) hidden.

james-in-a-box

No agent-mode design concerns with the delta since my last review.

The changes since commit 885e674 introduce several new features:

Multi-reviewer architecture: The pipeline now spawns typed reviewer containers (unified, agent-design, code, contract) for each phase. This aligns well with agent-mode principles — each reviewer gets a focused objective and posts verdicts directly to .egg-state/reviews/.
Review prompts: The _build_review_prompt() function provides context (phase, pipeline ID, reviewer type) without pre-fetching large diffs. Reviewers are told to read draft files themselves, which aligns with guideline #1 (avoid unnecessary pre-fetching).
Checker/autofixer agents: The implement phase now spawns checker and autofixer agents. The autofixer receives a lightweight failure summary (~3-4 lines) but is instructed to read the full results file itself. This is appropriate orienting context, not excessive pre-fetching.
Local pipelines include PR phase: Local-mode pipelines now run all 4 phases (including PR). The gateway mode is switched to public for the PR phase only, enabling push access. This is enforced at the session level in the gateway, not via prompt instructions — correctly following guideline #5 (sandbox enforcement over prompt-level security).
JSON verdict format for reviewers: Reviewers write a small JSON verdict file ({verdict, summary, feedback}). This is appropriate because the verdict is consumed by the orchestrator (machine-readable), not directly by humans. The actual review comments should still be posted in natural language to the PR.
HITL gates: The hitl_gates config option pauses for human approval after refine and plan phases. Decisions are surfaced via the status endpoint, keeping humans in the loop without constraining agent exploration.

The integration tests demonstrate the architecture working end-to-end with mock sandbox containers that write appropriate artifacts.

— Authored by egg

james-in-a-box · 2026-02-12T07:30:32Z

egg agent-mode-design completed. View run logs

21 previous review(s) hidden.

james-in-a-box

Incremental Contract Verification Review

Delta reviewed: dba968c..35147ac (1 commit)

New Commit Analysis

35147ac Add integration test for AWAITING_HUMAN start_pipeline guard

The new commit adds a comprehensive integration test that verifies the AWAITING_HUMAN guard in start_pipeline(). The test:

Creates a pipeline with hitl_gates=True
Starts the pipeline and waits for awaiting_human status
Attempts to start again and verifies 409 response
Verifies error message contains "awaiting human"
Resolves gates and waits for clean completion

This test strengthens contract compliance for acceptance criteria related to HITL integration (ac-16 through ac-20), specifically validating the interaction between HITL gates and pipeline state management.

Contract Verification Status

The 33 acceptance criteria remain unverified in the contract file. This is expected — my role as REVIEWER is to verify criteria and mark them verified using egg-contract verify-criterion. However, the egg-contract CLI has an import error in this environment:

ImportError: attempted relative import with no known parent package

Implementation Assessment

Based on code review of the delta:

No regressions identified — The new test is additive and doesn't modify any existing logic
Test quality is high — Uses proper fixtures, handles cleanup, validates both status codes and error messages
No security concerns — The test file contains no vulnerabilities

Recommendation

The incremental change is correct and improves test coverage. Once the egg-contract CLI is functional, the following criteria should be verified based on the existing implementation:

Criterion	Status	Evidence
ac-16	Ready to verify	`routes/decisions.py` implements POST/GET endpoints
ac-18	Ready to verify	`routes/decisions.py` implements resolve endpoint
ac-4	Ready to verify	`routes/health.py`, `routes/pipelines.py` exist

The PR continues to make solid progress toward full contract compliance.

— Authored by egg

james-in-a-box · 2026-02-12T07:31:09Z

egg contract-verification completed. View run logs

21 previous review(s) hidden.

Update documentation to reflect egg-orchestrator component added in d1d4440: - Add orchestrator/ directory to STRUCTURE.md with detailed file listing - Add orchestrator to architecture component table - Update deployment guide network topology diagram - Add orchestrator to component documentation index - Include new local_pipeline integration tests Triggered by: PR #524 (d1d4440) Authored-by: egg

* docs: Add orchestrator to project structure docs Update documentation to reflect egg-orchestrator component added in d1d4440: - Add orchestrator/ directory to STRUCTURE.md with detailed file listing - Add orchestrator to architecture component table - Update deployment guide network topology diagram - Add orchestrator to component documentation index - Include new local_pipeline integration tests Triggered by: PR #524 (d1d4440) Authored-by: egg * docs: Address review feedback on orchestrator documentation - Add missing requirements.txt to orchestrator structure listing - Update component docs header text for link consistency Authored-by: egg --------- Co-authored-by: jwbron <8340608+jwbron@users.noreply.github.com> Co-authored-by: james-in-a-box[bot] <246424927+james-in-a-box[bot]@users.noreply.github.com>

The SDLC pipeline has been fully migrated to the local distributed orchestrator (PR #524). This removes the now-superseded GitHub Actions implementation: 8 workflow files, 3 supporting scripts, the checks/ directory, 14 action prompt builders, and associated tests. All documentation is updated to reference the orchestrator equivalents. Issue: #545 Authored-by: egg

* Add analysis for issue #545: Remove GitHub Actions SDLC architecture * Add implementation plan for issue #545: Remove GitHub Actions SDLC architecture * Remove GitHub Actions SDLC workflows and supporting scripts The SDLC pipeline has been fully migrated to the local distributed orchestrator (PR #524). This removes the now-superseded GitHub Actions implementation: 8 workflow files, 3 supporting scripts, the checks/ directory, 14 action prompt builders, and associated tests. All documentation is updated to reference the orchestrator equivalents. Issue: #545 Authored-by: egg * Add implement phase check results * Fix lint errors and stale test for invalid schema validation - Fix 8 auto-fixable ruff errors: remove unused imports (F401), update typing imports to collections.abc (UP035), sort import block (I001), remove unused variable (F841) - Fix 6 E741 ambiguous variable names: rename `l` to `line` in list comprehensions across SSE test files - Fix test_load_invalid_schema_fails: Pipeline model fields became optional in #554, so {"id": "issue-9998"} is now valid. Updated test to use an invalid enum value for status instead. Authored-by: egg * Add implement phase check results for issue-545 * Add unified review verdict for issue-545 implement phase * Add agent-design review verdict for issue-545 implement phase * Add code review verdict for issue-545 implement phase Reviewed security, correctness, robustness, and design across the v2 checkpoint system, session manager changes, workflow removals, and orchestrator updates. No critical issues introduced by this diff. Authored-by: egg * Add contract review verdict for issue-545 implement phase * Address review feedback: remove stale references and fix docstring Remove dangling references to deleted files (sdlc-work-loop.yml, build-sdlc-prompt.sh) from comments and docstrings. Fix redundant phrasing in _populate_contract_from_plan docstring. PR description updated separately to correct inaccurate claims about action/ and config/ directory removal. * Restore PR-operational checks, contract validator, and prompt builder Restores files that were incorrectly removed as part of the SDLC cleanup: - .github/scripts/checks/ directory (check_fixer, lint_check, test_check, merge_conflict_check, draft_validation_check, plan_yaml_check, base, run_check) - .github/workflows/on-pull-request-contract-verify.yml - action/build-contract-verification-prompt.sh - tests/scripts/test_checks.py Removes deployment_check from run_check.py registry (SDLC-only, correctly deleted). Updates docs, README, and test-action.yml shellcheck to include restored files. * Remove stale check-deployment definition from phase defaults The deployment_check.py script was deleted during merge conflict resolution (it depended on deleted .github/scripts/checks/ base infrastructure from the GHA SDLC removal). The CheckDefinition referencing it remained as a dangling reference from PR #653. Remove it to avoid a potential ValueError if run_check.py tries to dispatch an unknown 'deployment' check. * Address review feedback: fix EDITOR handling, add subprocess timeouts, add reconnection backoff - Fix _launch_editor to use shlex.split for multi-word $EDITOR values (e.g. "code --wait", "vim +10") - Add timeout=30 to all subprocess.run calls in _commit_statefiles_to_worktree to prevent indefinite hangs on git lock contention - Add exponential backoff (1s-30s) and max retry limit (20) to watch_pipeline reconnection loop to prevent resource exhaustion - Use consistent word-boundary regex matching in _detect_phase for all phase keywords instead of mixing substring and regex strategies - Narrow _find_repo_path exception handler from bare Exception to specific subprocess/OS error types Authored-by: egg --------- Co-authored-by: egg <egg@localhost> Co-authored-by: egg-reviewer[bot] <261018737+egg-reviewer[bot]@users.noreply.github.com> Co-authored-by: jwbron <8340608+jwbron@users.noreply.github.com>

This comment has been minimized.

Sign in to view

james-in-a-box bot approved these changes Feb 11, 2026

View reviewed changes

This comment has been minimized.

Sign in to view

james-in-a-box bot previously requested changes Feb 11, 2026

View reviewed changes

This comment has been minimized.

Sign in to view

james-in-a-box bot approved these changes Feb 11, 2026

View reviewed changes

james-in-a-box bot previously requested changes Feb 11, 2026

View reviewed changes

This comment has been minimized.

Sign in to view

jwbron force-pushed the egg/issue-496 branch from 9301879 to f1254c1 Compare February 11, 2026 20:14

This comment has been minimized.

Sign in to view

james-in-a-box bot approved these changes Feb 11, 2026

View reviewed changes

james-in-a-box bot previously requested changes Feb 11, 2026

View reviewed changes

This comment has been minimized.

Sign in to view

james-in-a-box[bot] added 5 commits February 11, 2026 12:25

Initialize SDLC contract for issue #496

07720f4

Draft analysis for issue #496

9172af5

Update refine review state (cycle 1)

fc86887

Approve and advance to plan phase for issue #496

33e71da

Draft plan for egg-orchestrator implementation

b5f5772

Add implementation plan for issue #496 introducing egg-orchestrator, a dedicated service for local SDLC pipeline execution. Authored-by: egg

james-in-a-box bot reviewed Feb 12, 2026

View reviewed changes

james-in-a-box bot previously approved these changes Feb 12, 2026

View reviewed changes

jwbron dismissed james-in-a-box[bot]’s stale review via dba968c February 12, 2026 07:21

james-in-a-box bot previously approved these changes Feb 12, 2026

View reviewed changes

jwbron dismissed james-in-a-box[bot]’s stale review via 35147ac February 12, 2026 07:26

james-in-a-box bot approved these changes Feb 12, 2026

View reviewed changes

james-in-a-box bot reviewed Feb 12, 2026

View reviewed changes

jwbron merged commit d1d4440 into main Feb 12, 2026
33 checks passed

james-in-a-box bot deleted the egg/issue-496 branch February 12, 2026 07:30

james-in-a-box bot approved these changes Feb 12, 2026

View reviewed changes

james-in-a-box bot reviewed Feb 12, 2026

View reviewed changes

james-in-a-box bot mentioned this pull request Feb 12, 2026

docs: Add orchestrator component [doc-updater] #547

Merged

james-in-a-box bot mentioned this pull request Feb 14, 2026

Remove GitHub Actions SDLC orchestration architecture #658

Merged

Conversation

jwbron commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Bug fixes

Integration tests (14 passing)

Known issues (analysis, not yet fixed)

1. Missing EGG_SESSION_TOKEN for local-mode containers

2. OOM kill on memory-constrained systems (exit code 137)

Uh oh!

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

james-in-a-box bot left a comment

Choose a reason for hiding this comment

Uh oh!

This comment has been minimized.

james-in-a-box bot left a comment

Choose a reason for hiding this comment

Code Review: PR #524 - Egg Orchestrator

🔴 Critical Security Issues

1. Container ID Injection Vulnerability in docker_client.py

2. Path Traversal Risk in state_store.py

3. Missing Authentication on API Endpoints

4. Command Injection in Webhook Handler

🟠 Correctness Issues

5. Race Condition in Container Spawning

6. Race Condition in Phase Transitions

7. Missing Error Handling for Docker Connection Failures

8. Histogram Bucket Counting Bug

9. Git Commit Can Fail Silently

🟡 Robustness Issues

10. Missing Request Validation in All Routes

11. Timeout Not Enforced on Container Operations

12. Circuit Breaker Has No Persistence

13. HITL Decision Timeout Not Actively Checked

14. Multi-Agent Executor Missing Error Recovery

🔵 Design Issues

15. Inconsistent Error Response Format

16. Circular Import Risk

17. Hardcoded Network Configuration

18. Missing Structured Logging Context

📝 Test Coverage Gaps

19. No Integration Tests

20. No Load/Stress Tests

21. Tests Don't Cover Error Paths

✅ Positive Notes

Summary

Uh oh!

This comment has been minimized.

jwbron commented Feb 11, 2026

Uh oh!

james-in-a-box bot commented Feb 11, 2026

Addressed Review Feedback

✅ Fixed: Critical Security Issues

✅ Fixed: Race Conditions

✅ Already Addressed: Docker Connection Failures

Route Updates

Remaining Items (not addressed in this commit)

Uh oh!

jwbron commented Feb 11, 2026

Uh oh!

james-in-a-box bot commented Feb 11, 2026

Merged main into this branch

Changes merged from main:

Uh oh!

james-in-a-box bot left a comment

Choose a reason for hiding this comment

Uh oh!

james-in-a-box bot left a comment

Choose a reason for hiding this comment

Re-Review: PR #524 - Egg Orchestrator

✅ Issues Properly Addressed

1. Container ID Injection Vulnerability (docker_client.py)

2. Path Traversal Risk (state_store.py)

3. Optimistic Locking for Phase Transitions (routes/phases.py)

4. Version Field on Pipeline Model (models.py)

🟠 Remaining Issues from Previous Review (Not Addressed)

5. Container Spawner Race Condition (container_spawner.py)

jwbron commented Feb 11, 2026 •

edited

Loading

1. Missing `EGG_SESSION_TOKEN` for local-mode containers

1. Container ID Injection Vulnerability in `docker_client.py`

2. Path Traversal Risk in `state_store.py`

7. Missing Pipeline ID Validation in `routes/signals.py`

9. Route files `containers.py`, `pipelines.py`, `phases.py`, `decisions.py`

Re-review: Security and Race Condition Fixes (Delta since `9301879`)

1. Missing `InvalidContainerIdError` Handling in Health Check Endpoint

1. Guard `start_pipeline` Against `AWAITING_HUMAN` Status