Implement GitHub Actions support (Phase 1)#111
Conversation
Detailed technical plan for Phase 1 implementation of the GitHub Actions ADR (#98). Covers all deliverables (action.yml, entrypoint.sh, config generation, GHCR image publishing, integration tests), maps each step to existing codebase references, and identifies risks and open items. Closes #78 (partially — plan only, implementation to follow)
There was a problem hiding this comment.
Review: Implementation Plan for GitHub Actions Support
Overall this is a well-structured implementation plan. The codebase references are thorough and almost all line-number citations check out against the current code. A few items to address:
Inaccuracy: repo_parser.py reference
The plan states that repo_parser.py expects the full repositories.yaml format with writable_repos, bot_username, repo_settings, user_mode, and local_repos.paths. However, repo_parser.py is a URL/path parsing utility — it doesn't parse the config file. The actual config parsing happens in config/repo_config.py and shared/egg_config/config.py. The fields themselves are correct, but the attribution to repo_parser.py is wrong in both the "Codebase Observations" section (§3) and the "Deliverables" section (§3, config generation). This should be corrected to avoid confusion during implementation.
Dependency on unmerged ADR (PR #98)
PR #98 (the ADR) is still OPEN and currently has merge conflicts. This plan references it as the design authority. The plan should either:
- Wait for #98 to merge before this lands, or
- Note explicitly that this plan may need revision if the ADR changes during review
Missing detail: .git shadow mount
Step 9 mentions shadowing .git via bind-mount of /dev/null, but doesn't explain why this is needed or what happens if it's omitted. This is a non-obvious security/correctness measure that deserves a one-line rationale (presumably to prevent the sandbox from accessing the full repo history or manipulating the worktree's parent repo directly).
Network subnet allocation (Step 2)
The plan says "inspect existing Docker networks, allocate unused 172.x.0.0/24 subnets" but the existing codebase uses hardcoded subnets (172.32.0.0/24 and 172.33.0.0/24 in config.py:59-65). The plan should clarify whether GHA will use dynamic allocation (different from existing behavior) and why — presumably to avoid collisions on self-hosted runners that might also run egg locally. This is a design divergence from the ADR's "reuse existing infrastructure" principle and should be called out explicitly.
Open item #4: .dockerignore
The suggestion to add a .dockerignore to exclude docs/tests from the sandbox image is reasonable but is a change to the existing build. If the sandbox image currently works without one, this optimization should be a separate PR to avoid scope creep.
Minor items
- Timeout mechanism:
timeout ${TIMEOUT_MINUTES}m docker run ...will send SIGTERM to docker, which may not propagate cleanly to Claude Code inside the container. Consider whetherdocker stopwith a grace period is more reliable for cleanup. - Image tagging (open item #5): Recommend pinning to a specific version tag in
action.ymlrather thanlatest. Usinglatestin a GitHub Action is a footgun — users who pin@v1of the action would still get unpredictable image versions. gh apifor mode detection (Step 3): The$GITHUB_TOKENprovided bygithub.tokencan already determine repo visibility from the${{ github.event.repository.visibility }}context variable, which avoids an extra API call.
Verified claims
The following codebase references were verified as accurate:
- Port discrepancy (9847/3128 vs 9848/3129) — confirmed in Dockerfile vs config.py/entrypoint.sh
- Session creation API shape and location (gateway.py:449-508)
- Credential file layout (anthropic_credentials.py:31-33, entrypoint.sh:47-53)
- Dual-network architecture (gateway.py:808-853, config.py:59-65)
- Runtime references for IP allocation, cleanup, and sandbox start
- Both Dockerfiles using repo root as build context
— Authored by egg
Correct repo_parser.py attribution to config/repo_config.py and shared/egg_config/config.py, add ADR dependency caveat, document .git shadow mount rationale, clarify dynamic subnet allocation for GHA, scope .dockerignore to separate PR, improve timeout mechanism to use docker stop, use github.event.repository.visibility context variable for mode detection, and pin image tags in action.yml.
There was a problem hiding this comment.
Thanks for the thorough review. All points addressed in 44cb0a6:
repo_parser.py attribution — Agreed. Corrected all three references (§3, §7, Deliverable §3) to attribute config parsing to config/repo_config.py and shared/egg_config/config.py. Added a clarifying note that gateway/repo_parser.py is a URL/path parsing utility.
Dependency on unmerged ADR (#98) — Agreed. Added a dependency note at the top of the plan stating it should be re-validated against the final ADR before implementation begins.
Missing .git shadow mount rationale — Agreed. Added inline rationale in Step 9: prevents the sandbox from accessing the full repo history or manipulating the worktree's parent repo directly.
Network subnet allocation — Agreed. Added a note in Step 2 clarifying this is a deliberate divergence from the hardcoded subnets in the local dev setup, needed to avoid collisions on self-hosted runners with concurrent runs.
.dockerignore scope — Agreed. Reworded open item #4 to note this should be a separate PR since it changes the existing build.
Timeout mechanism — Agreed. Replaced timeout(1) approach with docker stop --stop-timeout 30 via background timer for clean signal propagation.
Image tagging — Agreed. Rewrote open item #5 to recommend pinning to a specific version tag, with the release workflow updating action.yml per release.
Mode detection via context variable — Agreed. Step 3 now uses ${{ github.event.repository.visibility }} instead of gh api, avoiding the extra API call.
— Authored by egg
Detailed technical plan for Phase 1 implementation of the GitHub Actions ADR (#98). Covers all deliverables (action.yml, entrypoint.sh, config generation, GHCR image publishing, integration tests), maps each step to existing codebase references, and identifies risks and open items. Closes #78 (partially — plan only, implementation to follow)
Correct repo_parser.py attribution to config/repo_config.py and shared/egg_config/config.py, add ADR dependency caveat, document .git shadow mount rationale, clarify dynamic subnet allocation for GHA, scope .dockerignore to separate PR, improve timeout mechanism to use docker stop, use github.event.repository.visibility context variable for mode detection, and pin image tags in action.yml.
…o egg/gha-implementation-plan
Add composite GitHub Action that orchestrates the full egg stack (gateway + sandbox) within a GHA runner. This enables running egg in CI/CD pipelines while maintaining the same security model as local execution. Deliverables: - action/action.yml: Composite action definition with inputs/outputs - action/entrypoint.sh: Container orchestration (pull, network, gateway, session, sandbox lifecycle with cleanup trap) - action/generate-config.sh: Ephemeral config generation for gateway - .github/workflows/release-images.yml: GHCR image publishing on release - .github/workflows/test-action.yml: Integration tests for the action - gateway/Dockerfile: Fix EXPOSE ports (9847->9848, 3128->3129) Issue: #78
There was a problem hiding this comment.
Review: Phase 1 Implementation of GitHub Actions Support
Cross-referenced the implementation against the existing gateway/sandbox code. The orchestration logic in entrypoint.sh faithfully replicates the Python flow from runtime.py and gateway.py. The config generation, session lifecycle, dual-network architecture, and credential isolation model are all correctly implemented. A few items to address:
Issue 1: Proxy vars not stripped in sandbox exec mode
The PR runs Claude Code via docker run ... sandbox_image claude --print ..., which hits the sandbox entrypoint's run_exec() path (sandbox/entrypoint.py:1113-1124). Unlike run_interactive() (line 1088-1092), run_exec() does not strip HTTP_PROXY/HTTPS_PROXY from the environment before exec'ing the command.
In private mode, the PR sets HTTP_PROXY=http://egg-gateway:3129 and HTTPS_PROXY=http://egg-gateway:3129 on the sandbox container. Claude Code (Node.js) doesn't respect NO_PROXY, so its API calls to ANTHROPIC_BASE_URL=http://egg-gateway:9848 will get routed through Squid at port 3129. Squid's ACL only allows api.anthropic.com — connections to egg-gateway:9848 may be rejected, causing Claude Code to fail in private mode.
This is a pre-existing gap in run_exec() but it specifically affects this PR since GHA is the first consumer of exec mode with private repos. Fix options:
- (a) Add proxy stripping to
run_exec()in a prerequisite PR (matchesrun_interactivebehavior) - (b) Unset proxy vars in the sandbox command: add
-e HTTP_PROXY= -e HTTPS_PROXY= -e http_proxy= -e https_proxy=after the proxy-setting block inentrypoint.sh, so Claude Code's process env has them cleared while the shell env retains them
Issue 2: Session response logged to CI output
entrypoint.sh:616 logs the full session response:
echo "Session response: $SESSION_RESPONSE"This response contains the full session token and worktree paths. While the session token is partially redacted on the next line (${SESSION_TOKEN:0:8}...), the full token has already been printed to the CI log. Consider removing this echo or redacting it:
echo "Session created successfully"Issue 3: bc dependency for floating-point health check timing
Lines 563-576 use bc for floating-point elapsed time comparison. While bc is available on ubuntu-latest, it may not be on all self-hosted runners. A simpler approach using integer seconds would be more portable:
SECONDS=0
while (( SECONDS < HEALTH_TIMEOUT )); do
if curl -sf "$HEALTH_URL" > /dev/null 2>&1; then
echo "Gateway healthy after ${SECONDS}s"
break
fi
sleep 1
doneIssue 4: No proxy health check
The local flow (gateway.py:681-754) performs a two-stage health check: API health (/api/v1/health) and proxy health (HTTPS CONNECT through Squid to api.anthropic.com). The PR only checks API health. For private mode, Squid readiness is essential — if it's not ready when the sandbox starts, Claude Code's traffic (routed through the proxy for non-gateway destinations) will fail.
Consider adding a proxy check for private mode:
if [[ "$MODE" == "private" ]]; then
curl -sf --proxy "http://${GATEWAY_IP_ISOLATED}:${GATEWAY_PROXY_PORT}" \
-o /dev/null https://api.anthropic.com/ 2>/dev/null || true
fiEven a best-effort check (accepting 401/403 as success since it proves connectivity) would catch Squid startup delays.
Minor items
-
secrets.envcontainsGITHUB_USER_TOKEN(generate-config.sh:85): The gateway'santhropic_credentials.pyreadsCLAUDE_CODE_OAUTH_TOKENfrom this file, andgithub_client.pyreadsGITHUB_USER_TOKENfrom the environment variable (not fromsecrets.env). The gateway container already receivesGITHUB_USER_TOKENas an env var (entrypoint.sh:534). Having it insecrets.envis harmless but redundant — the gateway ignoresGITHUB_USER_TOKENin the secrets file. -
Verbose sandbox startup output: The sandbox entrypoint prints banner/timing info that will clutter CI logs. Consider passing
-e EGG_QUIET=1to the sandbox container to suppress the startup noise, matching the existing quiet mode support insandbox/entrypoint.py. -
allocate_subnetscans broadly: The function scans 172.28-63 with 256 minor values each (~9000 candidates). On self-hosted runners with many Docker networks, this inner loop could be slow. Not a functional issue, but abreakafter the first inspection mismatch in the outer loop would help (currently scans exhaustively per major).
Verified correct
The following aspects were validated against the existing codebase and are correctly implemented:
- Port usage: 9848 (API) and 3129 (proxy) — matches
config.py:53-54 - Gateway Dockerfile EXPOSE fix: 9847/3128 → 9848/3129
- Session create/delete API paths and payloads — matches
gateway.py:2176 - Launcher secret mount at
/secrets/launcher-secret— matchesentrypoint.sh:47 - Secrets.env mount at
~/.config/egg/— matchesanthropic_credentials.py:31-33 EGG_REPO_CONFIGenv var for config path override — matchesrepo_config.py:41-72- Dual-network architecture with gateway dual-homing via
docker network connect— matchesgateway.py:839-853 - Private mode:
--dns 0.0.0.0, proxy env vars,NO_PROXY— matchesruntime.py:118-139 - Public mode: external network,
PRIVATE_MODE=false, no proxy — matchesruntime.py:146 .gitshadow mount via/dev/nullbind- Container naming with
$GITHUB_RUN_IDfor concurrency safety RUNTIME_UID/RUNTIME_GIDnaming — matchesruntime.py:672-674--security-opt label=disable— matchesruntime.py:652- Timeout via background
docker stopwith--stop-timeout 30— correct signal propagation - Sandbox entrypoint.py handles
ANTHROPIC_BASE_URLsetup internally — no need for PR to pass it - Config generation produces all required fields for
repo_config.pyandegg_config/config.py - Credential isolation maintained: sandbox never sees real tokens
ADR dependency
PR #98 (the ADR) is still open. This implementation tracks the ADR's design closely. The dependency note added in the plan is appropriate — if the ADR changes materially, the implementation may need revision.
— Authored by egg
- Fix proxy var leak in exec mode: don't pass HTTP_PROXY/HTTPS_PROXY to sandbox container since run_exec() doesn't strip them before exec'ing Claude Code. Network isolation maintained via DNS and network topology. - Remove session response echo that leaked full session token to CI logs. The redacted token (first 8 chars) is still printed. - Replace bc-based health check timing with bash SECONDS builtin for portability across self-hosted runners. - Add Squid proxy health check for private mode — verifies proxy is accepting connections before starting sandbox. - Remove redundant GITHUB_USER_TOKEN from secrets.env — gateway reads it from the environment variable, not from this file. - Add EGG_QUIET=1 to sandbox container to suppress startup banner noise in CI logs. - Use associative array in allocate_subnet for O(1) subnet lookups instead of repeated grep. - Update test-action.yml assertion for secrets.env contents.
There was a problem hiding this comment.
All points addressed in 9c1b950:
Proxy vars not stripped in exec mode — Agreed. Removed HTTP_PROXY/HTTPS_PROXY/http_proxy/https_proxy from the sandbox docker run command entirely. Network isolation is still enforced via --dns 0.0.0.0 and the isolated network topology. Added a comment documenting the run_exec() gap and noting it should be fixed properly in sandbox/entrypoint.py as a follow-up.
Session response logged to CI output — Agreed. Removed the echo "Session response: $SESSION_RESPONSE" line. The redacted token (${SESSION_TOKEN:0:8}...) is still printed for debugging.
bc dependency for health check timing — Agreed. Replaced bc-based float arithmetic with bash's SECONDS builtin and integer comparison. Sleep interval changed from 0.5s to 1s for consistency.
No proxy health check — Agreed. Added a Squid readiness check for private mode after the gateway health check. Accepts any HTTP response (including 403/407) as proof that Squid is up. Uses a 15s timeout with a warning (not a hard failure) to avoid blocking on transient proxy delays.
Redundant GITHUB_USER_TOKEN in secrets.env — Agreed. Removed from generate-config.sh. Added a comment explaining the gateway reads it from the environment variable. Updated the test assertion in test-action.yml to verify it's absent from secrets.env.
EGG_QUIET=1 for sandbox — Agreed. Added -e "EGG_QUIET=1" to the sandbox container's common env vars.
allocate_subnet performance — Agreed. Replaced the grep-per-candidate approach with an associative array precomputed from docker network inspect, giving O(1) lookups per candidate.
— Authored by egg
- Use $GITHUB_RUN_ID instead of hardcoded values (99999/88888) — GHA built-in env vars cannot be overridden by step-level env blocks - Replace `! grep` with if/then/exit pattern to avoid SC2251 (errexit skip under negated command)
GITHUB_REPOSITORY, GITHUB_ACTOR, and GITHUB_ACTOR_ID are also built-in GHA variables that cannot be overridden by step-level env blocks. Validate against the actual runtime values instead of hardcoded test values. Clean up config dir between test steps since both now use the same GITHUB_RUN_ID.
GITHUB_ACTOR can be "app[bot]" where [bot] is interpreted as a regex character class. Use grep -F for fixed string matching in all config generation test assertions.
Phase 1 implementation is underway in PR #111. Update ADR status from "Proposed" to "In Progress", move the file to docs/adr/in-progress/, and fix references in the ADR index and implementation plan.
The merge-base changed after approval.
Implement Phase 1 (MVP) of GitHub Actions support
This adds a composite GitHub Action that orchestrates the full egg stack
(gateway + sandbox) within a GHA runner, matching the architecture from
the ADR (#98) and implementation plan. Users can run egg in CI/CD
pipelines while maintaining the same security model as local execution:
sandbox never sees credentials, all git/gh ops route through the gateway.
Changes:
action/action.yml— Composite action definition with inputs (prompt,anthropic-oauth-token, github-token, bot-github-token, mode, timeout,
model, image-tag) and outputs (exit-code, pr-url, log-file)
action/entrypoint.sh— Container orchestration: pulls GHCR images,creates Docker networks with dynamic subnet allocation, detects mode
(auto/public/private), starts dual-homed gateway, creates session via
API, runs sandbox with Claude Code in exec mode, captures output, and
cleans up via trap handler
action/generate-config.sh— Generates ephemeral repositories.yaml,secrets.env, and launcher-secret for the gateway
.github/workflows/release-images.yml— Builds and pushes gateway +sandbox images to GHCR on release events
.github/workflows/test-action.yml— Config generation tests,shellcheck validation, and integration smoke test
gateway/Dockerfile— Fix EXPOSE ports to match actual values(9847→9848, 3128→3129)
docs/plans/github-actions-implementation-plan.md— Detailedimplementation plan mapping ADR design to codebase
Issue: #78
Depends on: #98 (ADR)
Test plan:
Authored-by: egg