Skip to content

fix(ssh): restart remote terminal sessions after SSH reconnection#1471

Merged
arnestrickmann merged 4 commits intomainfrom
emdash/albert-fix-889
Mar 14, 2026
Merged

fix(ssh): restart remote terminal sessions after SSH reconnection#1471
arnestrickmann merged 4 commits intomainfrom
emdash/albert-fix-889

Conversation

@arnestrickmann
Copy link
Contributor

Summary

  • Add restart() method to TerminalSessionManager that re-establishes the PTY connection for dead remote sessions
  • Emit a custom emdash:ssh-connection-restored event when SSH transitions from disconnected to connected, listened to by TerminalPane to trigger session restart
  • Guard against duplicate PTY connections by deduplicating connectPty() and restart() calls with stored promises
  • Track PTY data/exit listeners separately (ptyListenerDisposables) so they can be cleaned up and re-registered on restart without disposing the full session
  • In SessionRegistry, auto-restart inactive remote sessions on reattach

Test plan

  • Verify sessionRegistry.test.ts passes: inactive remote sessions restart on reattach, active sessions do not
  • Manual test: disconnect SSH, reconnect, confirm terminal sessions resume without requiring a new task
  • Verify no duplicate PTY listeners accumulate after multiple reconnections

@vercel
Copy link

vercel bot commented Mar 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
docs Ready Ready Preview, Comment Mar 14, 2026 4:57am

Request Review

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 14, 2026

Greptile Summary

This PR implements SSH reconnection recovery for remote terminal sessions by wiring together three layers: a custom DOM event (emdash:ssh-connection-restored) dispatched by useRemoteProject on state transitions, consumed by TerminalPane to call restart() on the affected session; a new restart() method on TerminalSessionManager that re-establishes the PTY connection for dead sessions while deduplicating concurrent calls; and an auto-restart guard in SessionRegistry.getOrCreate() for passive reattach paths. The overall design is sound, but there is one correctness issue and one design risk worth addressing before merging.

  • Bug — onPtyStarted listener leaks into disposables across restart cycles: On each successful connectPty() call (both initial and restart), an offStarted listener is pushed to this.disposables. cleanupPtyListeners() only drains ptyListenerDisposables, so if the PTY exits before onPtyStarted fires, the stale listener remains in disposables when the next restart adds a fresh one. When onPtyStarted eventually fires, both stale listeners match and each calls sendSizeIfStarted(), sending redundant resize packets. The fix is a one-line change: push offStarted into ptyListenerDisposables instead of disposables.
  • Design risk — ptyConnectPromise deduplication silently drops hasExistingSession=true: If connectPty(false) is still awaiting when restart() calls connectPty(true), the second call returns the cached promise that was started with resume: false, telling the backend to start a fresh PTY instead of resuming the tmux session. This is a narrow race, but a defensive log warning would surface it clearly.
  • Fragile invariant in useRemoteProject: connectionStateRef and the React connectionState must always be updated together via updateConnectionState. Any future direct setConnectionState call would desync them and could cause spurious or missed restore events.
  • Test coverage gap: sessionRegistry.test.ts covers the single-restart case but misses multi-reconnect cycles and the TerminalPane event-listener trigger path.

Confidence Score: 3/5

  • Safe to merge after addressing the onPtyStarted listener accumulation bug; the remaining issues are low-probability design risks.
  • The core reconnection design is well-structured and the deduplication guards are in place. However, the onPtyStarted listener being added to disposables (rather than ptyListenerDisposables) is a real correctness bug that can cause redundant resize events after multiple reconnections — the exact scenario this PR targets. The narrow ptyConnectPromise + hasExistingSession race and the connectionStateRef invariant are lower-risk but add meaningful review surface.
  • src/renderer/terminal/TerminalSessionManager.ts — the onPtyStarted listener placement at line 1301 needs the fix before this is production-safe across multiple reconnection cycles.

Important Files Changed

Filename Overview
src/renderer/terminal/TerminalSessionManager.ts Adds isPtyActive(), restart(), cleanupPtyListeners(), and deduplicates connectPty() with a stored promise. The onPtyStarted listener is pushed to this.disposables (not ptyListenerDisposables), meaning it is not cleaned up by cleanupPtyListeners() between restart cycles — stale listeners can accumulate if the PTY exits before onPtyStarted fires. The ptyConnectPromise deduplication also silently drops the hasExistingSession=true flag on concurrent calls.
src/renderer/terminal/SessionRegistry.ts Adds auto-restart of inactive remote sessions on reattach via getOrCreate. The restart is fire-and-forget (void existing.restart()), which is safe because restart() has internal deduplication. The setTheme and attach calls still proceed synchronously on the returned session, which is the correct behavior.
src/renderer/components/TerminalPane.tsx Adds a useEffect that listens for SSH_CONNECTION_RESTORED_EVENT and calls sessionRef.current?.restart(). The use of sessionRef (a ref, not a closure value) is correct — the handler always accesses the latest session. The dependency on remote?.connectionId ensures the listener is correctly re-registered when the connection ID changes.
src/renderer/hooks/useRemoteProject.ts Introduces connectionStateRef to track the previous state and detect 'disconnected→connected' transitions inside updateConnectionState, then dispatches dispatchSshConnectionRestored. The ref-based previous-state detection is correct. A fragile invariant exists: connectionStateRef and connectionState must always be updated together through updateConnectionState; any future direct call to setConnectionState would desync them.
src/renderer/lib/sshEvents.ts New file exporting a custom DOM event name, a typed detail interface, and a safe dispatch helper. The typeof window guard makes it safe in SSR/test environments.
src/test/renderer/sessionRegistry.test.ts New test file covering the two primary cases in SessionRegistry.getOrCreate: inactive remote sessions restart on reattach, and active sessions do not. Missing coverage for multiple successive reconnections and the TerminalPane event listener path.

Sequence Diagram

sequenceDiagram
    participant SSH as SSH Monitor (main)
    participant Hook as useRemoteProject
    participant Window as window (DOM events)
    participant Pane as TerminalPane
    participant Registry as SessionRegistry
    participant TSM as TerminalSessionManager

    SSH->>Hook: sshGetState → 'connected'
    Note over Hook: previousState = 'disconnected'
    Hook->>Hook: connectionStateRef.current = 'connected'
    Hook->>Window: dispatchSshConnectionRestored({ connectionId })

    par TerminalPane path
        Window-->>Pane: SSH_CONNECTION_RESTORED_EVENT
        Pane->>TSM: sessionRef.current.restart()
    and SessionRegistry path (on next attach)
        Pane->>Registry: terminalSessionRegistry.attach(options)
        Registry->>TSM: existing.isPtyActive() → false
        Registry->>TSM: existing.restart()
    end

    Note over TSM: restartPromise deduplication<br/>prevents concurrent restarts

    TSM->>TSM: cleanupPtyListeners()
    TSM->>TSM: connectPty(hasExistingSession=true)
    TSM-->>SSH: ptyStartDirect({ resume: true })
    SSH-->>TSM: { ok: true }
    TSM->>TSM: ptyStarted = true
    TSM->>TSM: setupPtyDataListener(id)
    TSM->>Pane: emitReady()
Loading

Last reviewed commit: 10ca3c1

offStarted = undefined;
}
});
if (offStarted) this.disposables.push(offStarted);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onPtyStarted listener accumulates in disposables across restart cycles

offStarted is pushed into this.disposables, but cleanupPtyListeners() — called by restart() — only drains this.ptyListenerDisposables. This means that if the PTY process exits before onPtyStarted has fired (i.e., the listener hasn't had a chance to self-remove from disposables via removeOffStartedFromDisposables()), the stale entry remains in this.disposables when the next restart adds a new offStarted.

When onPtyStarted eventually fires for the new restart, all stale listeners whose closured payload?.id === id will match and:

  1. Each fires sendSizeIfStarted() — sending redundant resize packets per stale entry.
  2. Each calls its own offStarted?.() to unsubscribe (so IPC subscriptions do get cleaned up), but only after all the extra side-effects have already run.

This PR intentionally separated ptyListenerDisposables from disposables for exactly this kind of restart-safe cleanup. The offStarted listener belongs in the same category.

Suggested change
if (offStarted) this.disposables.push(offStarted);
if (offStarted) this.ptyListenerDisposables.push(offStarted);

Comment on lines +1232 to +1234
if (this.ptyConnectPromise) {
return this.ptyConnectPromise;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ptyConnectPromise deduplication silently drops the hasExistingSession=true argument

When restart() calls connectPty(true) and this.ptyConnectPromise is already non-null (i.e., a concurrent connectPty(false) call from the initial start-up path is still in flight), the second call immediately returns the cached promise. That cached promise was created with resume: false in its closure, so the backend receives a "start fresh" instruction instead of "resume existing tmux session".

While this requires a narrow race (the initial connectPty(false) must still be awaiting resolution at exactly the moment the PTY exit event fires and restart() is called), it is observable when the SSH connection is flaky during the first connect. The typical symptom would be the terminal appearing blank after reconnection instead of restoring the previous session.

Consider logging a warning when the cached promise is returned in order to make this easier to diagnose:

if (this.ptyConnectPromise) {
  if (hasExistingSession) {
    log.warn('terminalSession:connectPty deduplication dropped resume flag', { id: this.options.taskId });
  }
  return this.ptyConnectPromise;
}

const [host, setHost] = useState<string | null>(null);
const healthCheckIntervalRef = useRef<NodeJS.Timeout | null>(null);
const isMountedRef = useRef(true);
const connectionStateRef = useRef<ConnectionState>(connectionState);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connectionStateRef not kept in sync with connectionState when connectionState changes outside updateConnectionState

connectionStateRef is initialized once from connectionState and then updated exclusively inside updateConnectionState. However, there is one code path that calls setConnectionState directly without going through updateConnectionState — the initial useState lazy initializer. If the module-level connectionStateCache returns 'connected' for the project, connectionState starts as 'connected' while connectionStateRef.current is also initialized to 'connected'. This is fine.

The real risk is subtler: setConnectionState is the React state setter, and connectionStateRef is the mutable mirror. They are only kept in sync through updateConnectionState. If any future code path calls setConnectionState directly (bypassing updateConnectionState), connectionStateRef would become stale and the transition guard previousState !== 'connected' could fire spuriously (false positive restore event) or miss a real transition (false negative). This is a fragile invariant.

Consider enforcing it explicitly with a private helper that wraps both updates, and removing the direct export of setConnectionState from the hook's closure.

Comment on lines +36 to +87
it('restarts an existing remote session when its PTY is no longer active', () => {
const theme = { base: 'dark' as const };
const container = {} as HTMLElement;

const first = terminalSessionRegistry.attach({
taskId: 'task-1',
container,
remote: { connectionId: 'ssh-1' },
initialSize: { cols: 120, rows: 32 },
theme,
});

expect(managerInstances).toHaveLength(1);
expect(first).toBe(managerInstances[0]);

managerInstances[0].isPtyActive.mockReturnValue(false);

const second = terminalSessionRegistry.attach({
taskId: 'task-1',
container,
remote: { connectionId: 'ssh-1' },
initialSize: { cols: 120, rows: 32 },
theme,
});

expect(second).toBe(first);
expect(managerInstances[0].restart).toHaveBeenCalledTimes(1);
});

it('does not restart an active session on reattach', () => {
const theme = { base: 'dark' as const };
const container = {} as HTMLElement;

terminalSessionRegistry.attach({
taskId: 'task-2',
container,
remote: { connectionId: 'ssh-2' },
initialSize: { cols: 120, rows: 32 },
theme,
});

terminalSessionRegistry.attach({
taskId: 'task-2',
container,
remote: { connectionId: 'ssh-2' },
initialSize: { cols: 120, rows: 32 },
theme,
});

expect(managerInstances[0].restart).not.toHaveBeenCalled();
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test suite missing coverage for multiple successive reconnection cycles

The tests verify a single restart trigger but don't cover the scenario where the SSH connection drops and restores more than once within the lifetime of a session. Key cases worth adding:

  1. Multi-reconnect cycle: PTY exits → restart → PTY exits again → restart again. Verify that restart is called the expected number of times with no stale side-effects (e.g., sendSizeIfStarted equivalent not called multiple times).
  2. Concurrent restart deduplication: Calling terminalSessionRegistry.attach() twice in rapid succession when isPtyActive() returns false should result in exactly one restart() invocation (dedup via restartPromise).
  3. TerminalPane event listener path: The SSH_CONNECTION_RESTORED_EVENT handler in TerminalPane independently calls restart(). A test covering both triggers firing for the same session would confirm the promise deduplication is effective end-to-end.

# Conflicts:
#	src/main/services/ptyManager.ts
@arnestrickmann arnestrickmann merged commit c9e8f4f into main Mar 14, 2026
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant