
feat: SSH transport for local sandboxed deployments #104

@howie


Background

OpenAB targets k3s in the cloud, where Kubernetes NetworkPolicy and Pod isolation handle security. For local deployments (a developer laptop, a home server), the only path today is:

[agent]
command = "claude"
args = ["--acp"]  # full host permissions

The Claude subprocess inherits the host's full filesystem and network access. For a Discord bot accepting messages from arbitrary users, this is a meaningful attack surface.

Proposal: SSH as a zero-code-change transport

AcpConnection::spawn() treats the agent as a stdio JSON-RPC process. SSH is a transparent byte pipe over that same stdio — no changes to ACP protocol, SessionPool, or AcpConnection internals.

# Current (local, no isolation)
[agent]
command = "claude"
args = ["--acp"]

# Proposed (SSH to sandbox)
[agent]
command = "ssh"
args = [
  "-T",                                     # No PTY (see below)
  "-o", "BatchMode=yes",                    # Fail-fast, no interactive prompts
  "-o", "ServerAliveInterval=30",           # Keep-alive for long sessions
  "-o", "ServerAliveCountMax=3",
  "-o", "StrictHostKeyChecking=accept-new", # Daemon has no terminal
  "user@sandbox-host",
  "claude", "--acp"
]

That's the entire diff from OpenAB's perspective: zero code changes, zero new dependencies, zero additional maintenance burden. If SSH transport breaks, it's a user configuration issue, not an OpenAB bug.
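The transport-agnostic spawn model is easy to see in miniature. A sketch (a Python stand-in, not OpenAB's Rust internals) where `cat` plays the role of the agent — any command that speaks newline-delimited JSON-RPC over stdio works, whether it is `claude --acp` directly or `ssh -T ... claude --acp`:

```python
import json
import subprocess

# `cat` stands in for the agent process: it echoes each frame back unchanged,
# which is enough to show that the parent only ever sees a stdio byte pipe.
proc = subprocess.Popen(["cat"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

request = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "initialize"})
proc.stdin.write((request + "\n").encode())
proc.stdin.flush()

echoed = json.loads(proc.stdout.readline())  # one frame per line

proc.stdin.close()
proc.wait()
```

Swapping `["cat"]` for the SSH argv above changes nothing on this side of the pipe, which is the whole point of the proposal.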

Current                              Proposed
───────────────────────              ──────────────────────────────
OpenAB                               OpenAB
  │ spawn                              │ spawn
  ▼                                    ▼
claude (host permissions)            ssh -T user@sandbox
  ├─ reads ~/.ssh ✗                    │ encrypted stdio pipe
  ├─ reads ~/Documents ✗               ▼
  └─ unrestricted network ✗          claude (inside sandbox)
                                       ├─ Landlock: /workspace only ✓
                                       ├─ Network: allowlist only ✓
                                       └─ MCP via host proxy ✓

Why -T is critical (experimentally verified)

I tested SSH stdio with an OrbStack VM. Results:

Flag  Behavior                                                  JSON-RPC safe?
────  ────────────────────────────────────────────────────────  ──────────────────────
-T    Clean byte pipe, stderr separated                         ✅ Yes
-t    Warns "PTY not allocated", stderr leaks into stdout       ❌ Corrupts JSON stream
-tt   Forced PTY + piped stdin → hangs (exit 144/SIGKILL)       ❌ Deadlock

A PTY rewrites line endings (\n → \r\n), merges stderr into stdout, and enables echo mode, all of which corrupt the JSON-RPC stream. -T is mandatory, not optional.
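The stderr-merging failure mode is reproducible without SSH at all. A sketch (hypothetical parser, not OpenAB's actual framing code) of strict newline-delimited JSON-RPC parsing, fed a stream where the PTY warning has leaked into stdout by hand:

```python
import json

def parse_frames(raw: bytes) -> list[dict]:
    """Strict newline-delimited framing: every non-empty line must be a
    JSON document, so any leaked warning text kills the whole stream."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]

clean = b'{"jsonrpc":"2.0","id":1,"result":{}}\n'
# With -t, the PTY warning ends up on the merged stdout ahead of the frame
# (warning text as observed with OpenSSH; injected manually here):
dirty = (
    b"Pseudo-terminal will not be allocated because stdin is not a terminal.\r\n"
    + clean
)

frames = parse_frames(clean)  # one valid frame
try:
    parse_frames(dirty)
    corrupted = False
except json.JSONDecodeError:
    corrupted = True          # the warning line is not JSON
```

One stray line is enough: the parser cannot resynchronize, so the whole session is lost, not just one message.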

Sandbox is user's choice

The proposal is about SSH as a transport, not any specific sandbox:

Environment     SSH target              Notes
──────────────  ─────────────────────   ───────────────────────────────────────
Mac (OrbStack)  vm-name@orb             Via ~/.orbstack/ssh/config ProxyCommand
Linux           user@nspawn-container   systemd-nspawn with SSH
Remote machine  user@10.0.0.5           Any Linux server
Docker          (wrapper script)        docker exec instead of SSH
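For the Docker row, a wrapper script isn't strictly needed; the same config pattern works directly (hypothetical container name, assuming claude is installed in the container):

```toml
# Docker variant: `docker exec` provides the stdio pipe instead of SSH.
# -i keeps stdin open for JSON-RPC; note the absence of -t
# (no PTY, for the same reason as ssh -T).
[agent]
command = "docker"
args = ["exec", "-i", "claude-sandbox", "claude", "--acp"]
```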

MCP server access from sandbox (experimentally verified)

Tested from OrbStack VM:

From VM → 127.0.0.1:18765      → FAIL ❌  (VM's localhost ≠ host's localhost)
From VM → host.internal:18765  → 200 OK ✅ (OrbStack DNS alias for the host)

For MCP servers running on the host, the sandbox cannot use localhost. Options vary by sandbox technology:

MCP access patterns:

Option A: Host DNS alias (OrbStack)
  claude (VM) ──http://host.internal:PORT──> MCP server (host)

Option B: SSH port forwarding (universal)
  ssh -L 8080:localhost:8080 user@sandbox
  claude (VM) ──http://localhost:8080──> [tunnel] ──> MCP server (host)

Option C: Network bridge (Docker --network host)
  claude (container) ──http://localhost:PORT──> MCP server (host)

This means web search from inside a sandbox works without opening the sandbox to arbitrary domains:

claude (sandbox) ──host.internal:8080──> MCP search proxy (host) ──HTTPS──> Brave/Tavily
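The host-side proxy can be very small. A sketch (hypothetical, with the upstream Brave/Tavily call stubbed out so the shape is self-contained; a real deployment would bind 0.0.0.0 on the chosen port and hold the API key on the host):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def search_upstream(query: str) -> list[str]:
    # Stand-in for the real HTTPS call to a search API. The key property:
    # only this host process needs outbound network access, not the sandbox.
    return [f"stub result for {query!r}"]

class SearchProxyHandler(BaseHTTPRequestHandler):
    """The sandboxed agent POSTs {"query": ...} to host.internal:PORT;
    this handler answers with JSON results."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        payload = json.dumps({"results": search_upstream(body["query"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the daemon's stdout quiet

# Ephemeral port for the demo; a real deployment would use a fixed port
# (e.g. 18765) and call server.serve_forever().
server = HTTPServer(("127.0.0.1", 0), SearchProxyHandler)
```

The sandbox's network allowlist then only needs to admit the host alias and port, nothing else.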

Known limitations

1. kill_on_drop does not reliably terminate remote processes

Experimentally verified: killing the local SSH client process leaves the remote subprocess running.

kill ssh-client → SSH server receives EOF → sends SIGHUP to remote shell
                                          → but remote claude may survive
                                            (especially with nohup or ControlMaster)

Mitigations:

  • Do not use SSH ControlMaster for agent connections
  • Ensure SSH server has ClientAliveInterval set (detects dead clients)
  • Session pool TTL cleanup should verify remote process health
  • Future: OpenAB could send an explicit ACP session/close before dropping the connection
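The TTL-cleanup mitigation can be sketched as a liveness probe: open a fresh non-interactive SSH connection and run pgrep on the sandbox, treating exit 0 as "agent still running". Hypothetical helper names; the argv builder is separated from execution so the logic is testable without a real sandbox:

```python
import subprocess

def probe_command(target: str, pattern: str = "claude --acp") -> list[str]:
    """Build the health-probe argv: a fresh BatchMode SSH connection
    running pgrep on the sandbox. Exit 0 = at least one matching process."""
    return [
        "ssh", "-T",
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=5",
        target,
        "pgrep", "-f", pattern,
    ]

def process_alive(argv: list[str]) -> bool:
    # Run any probe argv and map its exit code to a boolean.
    return subprocess.run(argv, capture_output=True).returncode == 0

# Pool cleanup sketch: if not process_alive(probe_command("user@sandbox-host")),
# drop the session rather than reusing a half-dead connection.
```

This also catches the inverse failure: a live SSH client whose remote agent has died.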

2. SSH connection startup latency

Each AcpConnection::spawn() incurs SSH handshake overhead (~50-200ms). Negligible for long-lived sessions (pool TTL = 24h), but noticeable if sessions are frequently recycled. ControlMaster could reduce this but conflicts with limitation #1.

3. SSH key auth is required

OpenAB runs as a daemon without a terminal. Interactive password prompts will hang the process. -o BatchMode=yes forces fail-fast behavior. Users must configure SSH key-based auth beforehand.

Scope

Intentionally narrow:

  • ✅ Document SSH as a supported command pattern with config example and SSH flag rationale
  • ✅ Add config.toml.example snippet for the SSH sandbox case
  • ❌ No changes to AcpConnection, SessionPool, or ACP protocol
  • ❌ No sandbox-specific code or dependencies
  • ❌ No changes to cloud/k3s deployment path

If there's interest in a first-class [agent.backend] abstraction later, that can be a separate discussion.

Relation to #99

#99 addresses the input side — how prompts reach OpenAB (Discord vs HTTP).
This issue addresses the execution side — where and with what permissions the agent runs.

The two are orthogonal and can be implemented independently.
