Skip to content

Remove EGG_REPOS_DIR, mount only configured repos#548

Merged
jwbron merged 7 commits intomainfrom
egg/remove-egg-repos-dir
Feb 12, 2026
Merged

Remove EGG_REPOS_DIR, mount only configured repos#548
jwbron merged 7 commits intomainfrom
egg/remove-egg-repos-dir

Conversation

@jwbron
Copy link
Owner

@jwbron jwbron commented Feb 12, 2026

Summary

  • Removes EGG_REPOS_DIR which mounted the entire ~/repos directory into gateway and orchestrator containers, exposing all repos regardless of configuration
  • Generates a docker-compose.override.yml with per-repo volume mounts derived from local_repos.paths in repositories.yaml
  • Replaces EGG_HOST_REPOS_DIR env var with EGG_HOST_REPO_MAP (JSON mapping of repo_name -> host_path) for the orchestrator to mount individual repos into spawned sandbox containers

Test plan

  • Run egg --compose --build — verify override file is generated correctly
  • Check docker compose config output shows per-repo volume mounts
  • Start egg in private mode (egg --private) — verify session creation succeeds
  • Start egg in public mode (egg) — verify session creation succeeds
  • Confirm only configured repos appear in docker exec egg-gateway ls /home/egg/repos/
  • Run integration tests: pytest integration_tests/local_pipeline/

egg-orchestrator and others added 3 commits February 12, 2026 00:12
Instead of mounting the entire ~/repos directory into gateway and
orchestrator containers, generate a docker-compose.override.yml with
per-repo volume mounts derived from local_repos.paths in
repositories.yaml. This prevents exposing repos that aren't configured.

The orchestrator now receives EGG_HOST_REPO_MAP (JSON mapping of
repo_name -> host_path) instead of EGG_HOST_REPOS_DIR, and passes
per-repo volume mounts when spawning sandbox containers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Remove EGG_REPOS_DIR, mount only configured repos

Summary

This PR improves security by mounting only configured repos rather than the entire ~/repos directory. The core approach is sound, but there are several issues that need to be addressed.


Issues

1. Path Injection in YAML Generation (compose.py:203)

The volume mount lines are generated by directly interpolating paths into YAML without any escaping:

volume_lines.append(f"      - {repo_path}:/home/egg/repos/{name}")

If a repo path contains special YAML characters (:, #, ', ", newlines), this could produce malformed or exploitable YAML. Consider using proper YAML serialization:

import yaml
# or at minimum, validate paths don't contain dangerous characters

While paths from repositories.yaml are somewhat trusted, defense-in-depth is appropriate here.

2. Incomplete .env Quoting Logic (compose.py:148-153)

The quoting logic for the .env file is incomplete:

if " " in value or "#" in value or "'" in value:
    lines.append(f'{key}="{value}"')
else:
    lines.append(f"{key}={value}")

This misses:

  • Values containing " (double quotes) — wrapping in double quotes without escaping breaks parsing
  • Values containing \n (newlines) — breaks .env format entirely
  • Values containing $ — may cause unintended variable expansion
  • Values containing backticks — shell command substitution risk

The EGG_HOST_REPO_MAP value is JSON, which can contain ". If a repo path contains quotes or special characters, this could break. Fix:

def escape_env_value(value: str) -> str:
    """Escape value for .env file."""
    if not any(c in value for c in ' "#\'$`\n\\'):
        return value
    # Escape backslashes and double quotes, then wrap
    escaped = value.replace('\\', '\\\\').replace('"', '\\"')
    return f'"{escaped}"'

3. Silent Failure When EGG_HOST_REPO_MAP is Invalid (pipelines.py:1426-1428)

except json.JSONDecodeError:
    logger.warning("Failed to parse EGG_HOST_REPO_MAP, using empty map")
    host_repo_map = {}

This silently continues with an empty map, meaning no repos will be mounted in spawned containers. The pipeline will likely fail mysteriously later with confusing errors. This should either:

  • Fail fast with a clear error, or
  • At minimum, log at ERROR level with the actual value that failed to parse

4. Override File Not Written if _generate_override_file Returns None (compose.py:187-189)

repos = get_local_repos(config_file)
if not repos:
    return None

If no repos are configured, _generate_override_file returns None, and the override file is not generated. But if a stale docker-compose.override.yml exists from a previous run with different repos, it will still be used by compose_down() (lines 556-558, 631-633). This could cause mounts from an old configuration to persist.

Consider deleting the override file when there are no repos:

if not repos:
    override_file = compose_file.parent / "docker-compose.override.yml"
    if override_file.exists():
        override_file.unlink()
    return None

5. Missing Override File Permission Restriction (compose.py:229)

The .env file is correctly restricted with chmod(0o600) (line 165) because it contains secrets. The override file is written without any permission restriction:

override_file.write_text(new_content)
return override_file

While the override file doesn't contain secrets directly, it exposes the host paths of configured repositories, which could be sensitive information. Consider applying similar permissions for consistency.

6. Test Configuration Inconsistency (conftest.py:62-63)

The test config writes:

local_repos:
  paths:
    - /home/egg/repos/test-repo

But the actual test repo is at repos_dir (a temp directory), not /home/egg/repos/test-repo. This config path is never used because the override file and EGG_HOST_REPO_MAP are generated directly from repos_dir. The repositories.yaml config should either match reality or be documented as ignored for local_repos in tests.


Minor Issues

7. Duplicate Repo Name Detection Logic (compose.py:195)

dupes = [n for n in names if n in seen or seen.add(n)]  # type: ignore[func-returns-value]

This works but is a Python anti-pattern that relies on set.add() returning None (falsy). A clearer approach:

from collections import Counter
name_counts = Counter(names)
dupes = [n for n, count in name_counts.items() if count > 1]

8. No Validation of Repo Names (compose.py:202)

Repo names come from repo_path.name (the directory basename). If the basename contains characters that are problematic for Docker volume paths or the container filesystem (e.g., containing /), this could cause failures. Consider validating that names are safe identifiers.


Questions

  1. Backward Compatibility: What happens if someone upgrades with an existing .env containing EGG_REPOS_DIR? The variable is silently ignored. Consider logging a deprecation warning if EGG_REPOS_DIR is set.

  2. Empty Repo Map Behavior: If EGG_HOST_REPO_MAP is empty {}, the orchestrator spawns containers with no repo mounts. Is this intentional? Should there be validation that at least one repo is configured when starting a pipeline?


Approved Changes

  • The overall design is correct: per-repo mounts are more secure than mounting the entire repos directory
  • The integration test updates are appropriate
  • The docker-compose.yml changes are clean
  • Documentation updates are clear

Please address issues 1-4 before merging. Issues 5-8 are improvements that could be addressed in a follow-up.

— Authored by egg

@james-in-a-box

This comment has been minimized.

… failure, stale override

1. Quote volume mount paths in generated YAML to prevent injection via
   special characters; reject paths with control characters.
2. Escape backslashes and double quotes in .env values; detect $, backticks,
   and newlines to prevent variable expansion and command substitution.
3. Fail fast with ValueError when EGG_HOST_REPO_MAP contains invalid JSON
   instead of silently falling back to an empty map.
4. Delete stale docker-compose.override.yml when no repos are configured
   to prevent mounts from old configurations persisting.
@james-in-a-box james-in-a-box bot dismissed their stale review February 12, 2026 08:41

Superseded by new review

@jwbron jwbron force-pushed the egg/remove-egg-repos-dir branch from fe33d88 to cecff96 Compare February 12, 2026 08:41
get_repo_path() only resolved repo subdirectories from JSON body,
so GET/DELETE requests (no body) fell through to the parent directory.
Add request.args.get('repo') fallback and scan repo subdirectories
when loading individual pipelines or listing all pipelines.

Add before_request/after_request middleware so every HTTP request
produces a structured log line with method, path, status, and
duration. Log resolved configuration at startup.
Copy link
Contributor

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-Review: Address review feedback commit

The new commit addresses several issues from the previous review. Here's my assessment:


Issues Addressed ✓

1. JSON Parse Failure (Previously Issue #3): Now raises ValueError instead of silently falling back to empty map. This is correct.

2. Stale Override File (Previously Issue #4): Now deletes stale docker-compose.override.yml when no repos are configured. This is correct.

3. Control Character Injection: Added _bad_path_chars regex to reject paths with control characters. Good defense-in-depth.


Issues NOT Properly Fixed

1. .env Newline Handling Still Broken (compose.py:151-153)

The previous review noted that values containing \n break the .env format. The fix detects newlines (triggers quoting) but doesn't actually escape them:

if any(c in value for c in " \"'#$`\n\\"):
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    lines.append(f'{key}="{escaped}"')

Newlines are detected, but replace('"', '\\"') doesn't handle \n. A value like "line1\nline2" produces:

KEY="line1
line2"

This is a malformed .env file that will be parsed incorrectly. Fix:

escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

In practice, this is unlikely to occur with the current env vars, but the detection-without-handling is a latent bug.

2. YAML Double-Quote Injection Still Possible (compose.py:215-217)

The fix quotes volume mount strings:

mount = f"{path_str}:/home/egg/repos/{name}"
volume_lines.append(f'      - "{mount}"')

But if path_str contains double quotes (e.g., /path/with"quote/repo), the generated YAML becomes:

volumes:
  - "/path/with"quote/repo:/home/egg/repos/test"

This is malformed YAML. The double quotes inside the string must be escaped:

# Escape any double quotes in the path for YAML string safety
mount = f"{path_str}:/home/egg/repos/{name}".replace('"', '\\"')
volume_lines.append(f'      - "{mount}"')

While paths with double quotes are rare, the incomplete escaping is a correctness issue.


New Issues Introduced

3. Test Config Mismatch Persists (conftest.py:62-63)

The test repositories.yaml still contains:

local_repos:
  paths:
    - /home/egg/repos/test-repo

But the actual test repo is at repos_dir (a temp directory like /tmp/xxxxx/repos). The test manually generates the override file and EGG_HOST_REPO_MAP directly from repos_dir, bypassing the config file. This works but is confusing:

  1. The config says one thing, the test does another
  2. If someone refactors the test to use production code paths (e.g., get_local_repos()), it will break

Either update the config to match reality, or add a comment explaining why it's intentionally different.


Previously Noted Minor Issues (Still Outstanding)

These were marked as follow-up items, not blockers:

  • Issue #5: Override file not chmod'd (.env is, but override isn't)
  • Issue #7: Duplicate detection uses set.add() anti-pattern
  • Issue #8: No validation that repo names are safe identifiers (e.g., no /)

Summary

The commit addresses the most critical issues from the previous review (JSON parse failure, stale override file). However, two of the original issues remain incompletely fixed:

  1. Newline detection in .env without proper escaping
  2. Double-quote handling in YAML generation

Both are edge cases but represent incomplete implementations of the intended fixes. Please complete the escaping logic.

— Authored by egg

@james-in-a-box

This comment has been minimized.

The bin/egg-contract script (symlink to egg_lib/contract_cli.py)
used `from .config import GATEWAY_PORT` which fails with
"attempted relative import with no known parent package" when
invoked directly as a script.  Add try/except fallback to the
absolute import `from egg_lib.config import GATEWAY_PORT`.
@james-in-a-box james-in-a-box bot dismissed their stale review February 12, 2026 08:51

Superseded by new review

Docker bind mounts preserve host directory ownership, so repo
directories mounted at /home/egg/repos/* appear as root:root inside
containers. The egg user (uid 1000) cannot create files or directories,
breaking sandbox agent operations.

Add non-recursive chown of /home/egg/repos and its immediate children
in all three entrypoints (gateway, orchestrator, sandbox) during the
root-privilege phase before dropping to the unprivileged user. Only
top-level directories are chowned — repo file contents are managed by
git/gateway worktree operations, keeping startup fast on large repos.
Copy link
Contributor

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-Review: Commits since cecff96

This review covers the delta since my last review at commit cecff96:

  • 99263fb: Fix pipeline state resolution and add request logging
  • ca6dfe7: Fix egg-contract CLI relative import error

Previously Flagged Issues — Status

1. .env Newline Handling — Still not fixed

The previous review noted that newlines are detected but not escaped:

if any(c in value for c in " \"'#$`\n\\"):
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    lines.append(f'{key}="{escaped}"')

A value containing \n would produce a malformed .env file. While unlikely to occur with current values, this is a latent bug from incomplete implementation. The detection triggers quoting but the replacement doesn't escape the newline.

Status: Outstanding — no changes to compose.py in this delta.

2. YAML Double-Quote Injection — Still not fixed

compose.py:216-217 still quotes the volume mount string without escaping double quotes within the path:

mount = f"{path_str}:/home/egg/repos/{name}"
volume_lines.append(f'      - "{mount}"')

A path like /path/with"quote/repo produces malformed YAML.

Status: Outstanding — no changes to compose.py in this delta.

3. Test Config Mismatch — Still present (conftest.py:61-63)

The test repositories.yaml still contains:

local_repos:
  paths:
    - /home/egg/repos/test-repo

But the actual test repo is at repos_dir (a temp directory), not /home/egg/repos/test-repo. This mismatch persists and could cause confusion if someone refactors the test to use production code paths.

Status: Outstanding — config still doesn't match actual test behavior.


New Changes Review

4. Pipeline Resolution Across Multiple Repos — Correctness concern (pipelines.py:86-120, 123-150)

The new _resolve_pipeline and _collect_all_pipelines functions scan repo subdirectories to find pipelines. While the functionality is reasonable, there's a concern:

Missing symlink handling in directory iteration (pipelines.py:111, 138):

for child in sorted(base_path.iterdir()):
    if child.is_dir() and (child / ".git").exists():

Path.iterdir() follows symlinks by default. child.is_dir() also returns True for symlinks pointing to directories. If there's a symlink loop or a symlink to a parent directory, this could cause:

  • Infinite loops (if symlinks create cycles)
  • Processing the same repo multiple times via different paths
  • Unexpected behavior with external symlinks

Consider adding child.is_symlink() check to exclude symlinks, or use child.resolve() to detect duplicates.

5. Request Logging Start Time Fallback (api.py:91)

duration_ms = round((time.monotonic() - getattr(g, "start_time", time.monotonic())) * 1000)

The fallback time.monotonic() is called at response time, meaning if log_request_start wasn't called, duration_ms would be 0 (current time minus current time). This is technically correct but could mask issues where before_request isn't firing. Consider logging a warning when g.start_time is missing instead of silently defaulting to 0ms.

6. get_repo_path Now Reads Query Params (routes/init.py:42)

repo = data.get("repo", "") or request.args.get("repo", "")

This change allows the repo parameter to come from query args in addition to the JSON body. This is a reasonable addition for GET requests that may not have a body. No issues found.

7. start_pipeline Mutates repo_path (pipelines.py:2009-2010)

store, pipeline = _resolve_pipeline(pipeline_id, repo_path)
# Use the store's repo_path so _run_pipeline operates on the correct directory
repo_path = store.repo_path

This correctly uses the resolved store's repo_path for _run_pipeline. The pattern is sound — the comment explains the reasoning.

8. Contract CLI Import Fix (contract_cli.py:39-42)

try:
    from .config import GATEWAY_PORT
except ImportError:
    from egg_lib.config import GATEWAY_PORT  # type: ignore[no-redef]

This fixes a relative import error when the CLI is invoked directly. The fallback is reasonable for supporting both package imports and direct execution.


Summary

The new commits address the pipeline state resolution issue (pipelines can now be found across repo subdirectories) and fix the contract CLI import error. These are correct and well-implemented.

However, the three issues from the previous review remain unaddressed:

  1. .env newline handling (detection without escaping)
  2. YAML double-quote injection in volume mounts
  3. Test config mismatch in conftest.py

Additionally, the new code introduces a minor concern about symlink handling in directory iteration that could cause issues in edge cases.

Given that the previously flagged issues (#1 and #2) represent incomplete implementations of intended fixes rather than new bugs, and the new functionality is correct, I'm approving this PR with the expectation that #1 and #2 will be addressed in a follow-up. Issue #3 (test config mismatch) is documentation/clarity and should also be fixed.

— Authored by egg

@james-in-a-box
Copy link
Contributor

egg review completed. View run logs

2 previous review(s) hidden.

@jwbron jwbron merged commit 7409c68 into main Feb 12, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant