Skip to content

Implement GitHub Actions support (Phase 1)#111

Merged
jwbron merged 12 commits intomainfrom
egg/gha-implementation-plan
Feb 5, 2026
Merged

Implement GitHub Actions support (Phase 1)#111
jwbron merged 12 commits intomainfrom
egg/gha-implementation-plan

Conversation

@james-in-a-box
Copy link
Contributor

@james-in-a-box james-in-a-box bot commented Feb 5, 2026

Implement Phase 1 (MVP) of GitHub Actions support

This adds a composite GitHub Action that orchestrates the full egg stack
(gateway + sandbox) within a GHA runner, matching the architecture from
the ADR (#98) and implementation plan. Users can run egg in CI/CD
pipelines while maintaining the same security model as local execution:
sandbox never sees credentials, all git/gh ops route through the gateway.

Changes:

  • action/action.yml — Composite action definition with inputs (prompt,
    anthropic-oauth-token, github-token, bot-github-token, mode, timeout,
    model, image-tag) and outputs (exit-code, pr-url, log-file)
  • action/entrypoint.sh — Container orchestration: pulls GHCR images,
    creates Docker networks with dynamic subnet allocation, detects mode
    (auto/public/private), starts dual-homed gateway, creates session via
    API, runs sandbox with Claude Code in exec mode, captures output, and
    cleans up via trap handler
  • action/generate-config.sh — Generates ephemeral repositories.yaml,
    secrets.env, and launcher-secret for the gateway
  • .github/workflows/release-images.yml — Builds and pushes gateway +
    sandbox images to GHCR on release events
  • .github/workflows/test-action.yml — Config generation tests,
    shellcheck validation, and integration smoke test
  • gateway/Dockerfile — Fix EXPOSE ports to match actual values
    (9847→9848, 3128→3129)
  • docs/plans/github-actions-implementation-plan.md — Detailed
    implementation plan mapping ADR design to codebase

Issue: #78
Depends on: #98 (ADR)

Test plan:

  • Verify shellcheck passes on action/*.sh
  • Verify yamllint passes on action/action.yml and new workflows
  • Run test-action workflow to validate config generation
  • Manual: trigger release-images workflow, then run action in a test repo

Authored-by: egg

Detailed technical plan for Phase 1 implementation of the GitHub
Actions ADR (#98). Covers all deliverables (action.yml, entrypoint.sh,
config generation, GHCR image publishing, integration tests), maps
each step to existing codebase references, and identifies risks and
open items.

Closes #78 (partially — plan only, implementation to follow)
Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Implementation Plan for GitHub Actions Support

Overall this is a well-structured implementation plan. The codebase references are thorough and almost all line-number citations check out against the current code. A few items to address:

Inaccuracy: repo_parser.py reference

The plan states that repo_parser.py expects the full repositories.yaml format with writable_repos, bot_username, repo_settings, user_mode, and local_repos.paths. However, repo_parser.py is a URL/path parsing utility — it doesn't parse the config file. The actual config parsing happens in config/repo_config.py and shared/egg_config/config.py. The fields themselves are correct, but the attribution to repo_parser.py is wrong in both the "Codebase Observations" section (§3) and the "Deliverables" section (§3, config generation). This should be corrected to avoid confusion during implementation.

Dependency on unmerged ADR (PR #98)

PR #98 (the ADR) is still OPEN and currently has merge conflicts. This plan references it as the design authority. The plan should either:

  • Wait for #98 to merge before this lands, or
  • Note explicitly that this plan may need revision if the ADR changes during review

Missing detail: .git shadow mount

Step 9 mentions shadowing .git via bind-mount of /dev/null, but doesn't explain why this is needed or what happens if it's omitted. This is a non-obvious security/correctness measure that deserves a one-line rationale (presumably to prevent the sandbox from accessing the full repo history or manipulating the worktree's parent repo directly).

Network subnet allocation (Step 2)

The plan says "inspect existing Docker networks, allocate unused 172.x.0.0/24 subnets" but the existing codebase uses hardcoded subnets (172.32.0.0/24 and 172.33.0.0/24 in config.py:59-65). The plan should clarify whether GHA will use dynamic allocation (different from existing behavior) and why — presumably to avoid collisions on self-hosted runners that might also run egg locally. This is a design divergence from the ADR's "reuse existing infrastructure" principle and should be called out explicitly.

Open item #4: .dockerignore

The suggestion to add a .dockerignore to exclude docs/tests from the sandbox image is reasonable but is a change to the existing build. If the sandbox image currently works without one, this optimization should be a separate PR to avoid scope creep.

Minor items

  • Timeout mechanism: timeout ${TIMEOUT_MINUTES}m docker run ... will send SIGTERM to docker, which may not propagate cleanly to Claude Code inside the container. Consider whether docker stop with a grace period is more reliable for cleanup.
  • Image tagging (open item #5): Recommend pinning to a specific version tag in action.yml rather than latest. Using latest in a GitHub Action is a footgun — users who pin @v1 of the action would still get unpredictable image versions.
  • gh api for mode detection (Step 3): The $GITHUB_TOKEN provided by github.token can already determine repo visibility from the ${{ github.event.repository.visibility }} context variable, which avoids an extra API call.

Verified claims

The following codebase references were verified as accurate:

  • Port discrepancy (9847/3128 vs 9848/3129) — confirmed in Dockerfile vs config.py/entrypoint.sh
  • Session creation API shape and location (gateway.py:449-508)
  • Credential file layout (anthropic_credentials.py:31-33, entrypoint.sh:47-53)
  • Dual-network architecture (gateway.py:808-853, config.py:59-65)
  • Runtime references for IP allocation, cleanup, and sandbox start
  • Both Dockerfiles using repo root as build context

— Authored by egg

Correct repo_parser.py attribution to config/repo_config.py and
shared/egg_config/config.py, add ADR dependency caveat, document
.git shadow mount rationale, clarify dynamic subnet allocation for
GHA, scope .dockerignore to separate PR, improve timeout mechanism
to use docker stop, use github.event.repository.visibility context
variable for mode detection, and pin image tags in action.yml.
Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough review. All points addressed in 44cb0a6:

repo_parser.py attributionAgreed. Corrected all three references (§3, §7, Deliverable §3) to attribute config parsing to config/repo_config.py and shared/egg_config/config.py. Added a clarifying note that gateway/repo_parser.py is a URL/path parsing utility.

Dependency on unmerged ADR (#98)Agreed. Added a dependency note at the top of the plan stating it should be re-validated against the final ADR before implementation begins.

Missing .git shadow mount rationaleAgreed. Added inline rationale in Step 9: prevents the sandbox from accessing the full repo history or manipulating the worktree's parent repo directly.

Network subnet allocationAgreed. Added a note in Step 2 clarifying this is a deliberate divergence from the hardcoded subnets in the local dev setup, needed to avoid collisions on self-hosted runners with concurrent runs.

.dockerignore scopeAgreed. Reworded open item #4 to note this should be a separate PR since it changes the existing build.

Timeout mechanismAgreed. Replaced timeout(1) approach with docker stop --stop-timeout 30 via background timer for clean signal propagation.

Image taggingAgreed. Rewrote open item #5 to recommend pinning to a specific version tag, with the release workflow updating action.yml per release.

Mode detection via context variableAgreed. Step 3 now uses ${{ github.event.repository.visibility }} instead of gh api, avoiding the extra API call.

— Authored by egg

james-in-a-box bot and others added 5 commits February 5, 2026 01:47
* Add ADR for GitHub Actions support (#78)

* Address PR #98 review feedback on GHA ADR

---------

Co-authored-by: egg <egg@example.com>
Detailed technical plan for Phase 1 implementation of the GitHub
Actions ADR (#98). Covers all deliverables (action.yml, entrypoint.sh,
config generation, GHCR image publishing, integration tests), maps
each step to existing codebase references, and identifies risks and
open items.

Closes #78 (partially — plan only, implementation to follow)
Correct repo_parser.py attribution to config/repo_config.py and
shared/egg_config/config.py, add ADR dependency caveat, document
.git shadow mount rationale, clarify dynamic subnet allocation for
GHA, scope .dockerignore to separate PR, improve timeout mechanism
to use docker stop, use github.event.repository.visibility context
variable for mode detection, and pin image tags in action.yml.
Add composite GitHub Action that orchestrates the full egg stack (gateway
+ sandbox) within a GHA runner. This enables running egg in CI/CD
pipelines while maintaining the same security model as local execution.

Deliverables:
- action/action.yml: Composite action definition with inputs/outputs
- action/entrypoint.sh: Container orchestration (pull, network, gateway,
  session, sandbox lifecycle with cleanup trap)
- action/generate-config.sh: Ephemeral config generation for gateway
- .github/workflows/release-images.yml: GHCR image publishing on release
- .github/workflows/test-action.yml: Integration tests for the action
- gateway/Dockerfile: Fix EXPOSE ports (9847->9848, 3128->3129)

Issue: #78
@james-in-a-box james-in-a-box bot changed the title Add implementation plan for GitHub Actions support Implement GitHub Actions support (Phase 1) Feb 5, 2026
Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Phase 1 Implementation of GitHub Actions Support

Cross-referenced the implementation against the existing gateway/sandbox code. The orchestration logic in entrypoint.sh faithfully replicates the Python flow from runtime.py and gateway.py. The config generation, session lifecycle, dual-network architecture, and credential isolation model are all correctly implemented. A few items to address:

Issue 1: Proxy vars not stripped in sandbox exec mode

The PR runs Claude Code via docker run ... sandbox_image claude --print ..., which hits the sandbox entrypoint's run_exec() path (sandbox/entrypoint.py:1113-1124). Unlike run_interactive() (line 1088-1092), run_exec() does not strip HTTP_PROXY/HTTPS_PROXY from the environment before exec'ing the command.

In private mode, the PR sets HTTP_PROXY=http://egg-gateway:3129 and HTTPS_PROXY=http://egg-gateway:3129 on the sandbox container. Claude Code (Node.js) doesn't respect NO_PROXY, so its API calls to ANTHROPIC_BASE_URL=http://egg-gateway:9848 will get routed through Squid at port 3129. Squid's ACL only allows api.anthropic.com — connections to egg-gateway:9848 may be rejected, causing Claude Code to fail in private mode.

This is a pre-existing gap in run_exec() but it specifically affects this PR since GHA is the first consumer of exec mode with private repos. Fix options:

  • (a) Add proxy stripping to run_exec() in a prerequisite PR (matches run_interactive behavior)
  • (b) Unset proxy vars in the sandbox command: add -e HTTP_PROXY= -e HTTPS_PROXY= -e http_proxy= -e https_proxy= after the proxy-setting block in entrypoint.sh, so Claude Code's process env has them cleared while the shell env retains them

Issue 2: Session response logged to CI output

entrypoint.sh:616 logs the full session response:

echo "Session response: $SESSION_RESPONSE"

This response contains the full session token and worktree paths. While the session token is partially redacted on the next line (${SESSION_TOKEN:0:8}...), the full token has already been printed to the CI log. Consider removing this echo or redacting it:

echo "Session created successfully"

Issue 3: bc dependency for floating-point health check timing

Lines 563-576 use bc for floating-point elapsed time comparison. While bc is available on ubuntu-latest, it may not be on all self-hosted runners. A simpler approach using integer seconds would be more portable:

SECONDS=0
while (( SECONDS < HEALTH_TIMEOUT )); do
  if curl -sf "$HEALTH_URL" > /dev/null 2>&1; then
    echo "Gateway healthy after ${SECONDS}s"
    break
  fi
  sleep 1
done

Issue 4: No proxy health check

The local flow (gateway.py:681-754) performs a two-stage health check: API health (/api/v1/health) and proxy health (HTTPS CONNECT through Squid to api.anthropic.com). The PR only checks API health. For private mode, Squid readiness is essential — if it's not ready when the sandbox starts, Claude Code's traffic (routed through the proxy for non-gateway destinations) will fail.

Consider adding a proxy check for private mode:

if [[ "$MODE" == "private" ]]; then
  curl -sf --proxy "http://${GATEWAY_IP_ISOLATED}:${GATEWAY_PROXY_PORT}" \
    -o /dev/null https://api.anthropic.com/ 2>/dev/null || true
fi

Even a best-effort check (accepting 401/403 as success since it proves connectivity) would catch Squid startup delays.

Minor items

  • secrets.env contains GITHUB_USER_TOKEN (generate-config.sh:85): The gateway's anthropic_credentials.py reads CLAUDE_CODE_OAUTH_TOKEN from this file, and github_client.py reads GITHUB_USER_TOKEN from the environment variable (not from secrets.env). The gateway container already receives GITHUB_USER_TOKEN as an env var (entrypoint.sh:534). Having it in secrets.env is harmless but redundant — the gateway ignores GITHUB_USER_TOKEN in the secrets file.

  • Verbose sandbox startup output: The sandbox entrypoint prints banner/timing info that will clutter CI logs. Consider passing -e EGG_QUIET=1 to the sandbox container to suppress the startup noise, matching the existing quiet mode support in sandbox/entrypoint.py.

  • allocate_subnet scans broadly: The function scans 172.28-63 with 256 minor values each (~9000 candidates). On self-hosted runners with many Docker networks, this inner loop could be slow. Not a functional issue, but a break after the first inspection mismatch in the outer loop would help (currently scans exhaustively per major).

Verified correct

The following aspects were validated against the existing codebase and are correctly implemented:

  • Port usage: 9848 (API) and 3129 (proxy) — matches config.py:53-54
  • Gateway Dockerfile EXPOSE fix: 9847/3128 → 9848/3129
  • Session create/delete API paths and payloads — matches gateway.py:2176
  • Launcher secret mount at /secrets/launcher-secret — matches entrypoint.sh:47
  • Secrets.env mount at ~/.config/egg/ — matches anthropic_credentials.py:31-33
  • EGG_REPO_CONFIG env var for config path override — matches repo_config.py:41-72
  • Dual-network architecture with gateway dual-homing via docker network connect — matches gateway.py:839-853
  • Private mode: --dns 0.0.0.0, proxy env vars, NO_PROXY — matches runtime.py:118-139
  • Public mode: external network, PRIVATE_MODE=false, no proxy — matches runtime.py:146
  • .git shadow mount via /dev/null bind
  • Container naming with $GITHUB_RUN_ID for concurrency safety
  • RUNTIME_UID/RUNTIME_GID naming — matches runtime.py:672-674
  • --security-opt label=disable — matches runtime.py:652
  • Timeout via background docker stop with --stop-timeout 30 — correct signal propagation
  • Sandbox entrypoint.py handles ANTHROPIC_BASE_URL setup internally — no need for PR to pass it
  • Config generation produces all required fields for repo_config.py and egg_config/config.py
  • Credential isolation maintained: sandbox never sees real tokens

ADR dependency

PR #98 (the ADR) is still open. This implementation tracks the ADR's design closely. The dependency note added in the plan is appropriate — if the ADR changes materially, the implementation may need revision.

— Authored by egg

- Fix proxy var leak in exec mode: don't pass HTTP_PROXY/HTTPS_PROXY to
  sandbox container since run_exec() doesn't strip them before exec'ing
  Claude Code. Network isolation maintained via DNS and network topology.
- Remove session response echo that leaked full session token to CI logs.
  The redacted token (first 8 chars) is still printed.
- Replace bc-based health check timing with bash SECONDS builtin for
  portability across self-hosted runners.
- Add Squid proxy health check for private mode — verifies proxy is
  accepting connections before starting sandbox.
- Remove redundant GITHUB_USER_TOKEN from secrets.env — gateway reads it
  from the environment variable, not from this file.
- Add EGG_QUIET=1 to sandbox container to suppress startup banner noise
  in CI logs.
- Use associative array in allocate_subnet for O(1) subnet lookups
  instead of repeated grep.
- Update test-action.yml assertion for secrets.env contents.
Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All points addressed in 9c1b950:

Proxy vars not stripped in exec modeAgreed. Removed HTTP_PROXY/HTTPS_PROXY/http_proxy/https_proxy from the sandbox docker run command entirely. Network isolation is still enforced via --dns 0.0.0.0 and the isolated network topology. Added a comment documenting the run_exec() gap and noting it should be fixed properly in sandbox/entrypoint.py as a follow-up.

Session response logged to CI outputAgreed. Removed the echo "Session response: $SESSION_RESPONSE" line. The redacted token (${SESSION_TOKEN:0:8}...) is still printed for debugging.

bc dependency for health check timingAgreed. Replaced bc-based float arithmetic with bash's SECONDS builtin and integer comparison. Sleep interval changed from 0.5s to 1s for consistency.

No proxy health checkAgreed. Added a Squid readiness check for private mode after the gateway health check. Accepts any HTTP response (including 403/407) as proof that Squid is up. Uses a 15s timeout with a warning (not a hard failure) to avoid blocking on transient proxy delays.

Redundant GITHUB_USER_TOKEN in secrets.envAgreed. Removed from generate-config.sh. Added a comment explaining the gateway reads it from the environment variable. Updated the test assertion in test-action.yml to verify it's absent from secrets.env.

EGG_QUIET=1 for sandboxAgreed. Added -e "EGG_QUIET=1" to the sandbox container's common env vars.

allocate_subnet performanceAgreed. Replaced the grep-per-candidate approach with an associative array precomputed from docker network inspect, giving O(1) lookups per candidate.

— Authored by egg

egg added 4 commits February 5, 2026 17:19
- Use $GITHUB_RUN_ID instead of hardcoded values (99999/88888) — GHA
  built-in env vars cannot be overridden by step-level env blocks
- Replace `! grep` with if/then/exit pattern to avoid SC2251 (errexit
  skip under negated command)
GITHUB_REPOSITORY, GITHUB_ACTOR, and GITHUB_ACTOR_ID are also built-in
GHA variables that cannot be overridden by step-level env blocks.
Validate against the actual runtime values instead of hardcoded test
values. Clean up config dir between test steps since both now use the
same GITHUB_RUN_ID.
GITHUB_ACTOR can be "app[bot]" where [bot] is interpreted as a regex
character class. Use grep -F for fixed string matching in all config
generation test assertions.
Phase 1 implementation is underway in PR #111. Update ADR status from
"Proposed" to "In Progress", move the file to docs/adr/in-progress/,
and fix references in the ADR index and implementation plan.
jwbron
jwbron previously approved these changes Feb 5, 2026
@james-in-a-box james-in-a-box bot dismissed jwbron’s stale review February 5, 2026 17:27

The merge-base changed after approval.

@jwbron jwbron merged commit e20c5f7 into main Feb 5, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant