Skip to content

Conversation

@EhabY
Copy link
Collaborator

@EhabY EhabY commented Oct 17, 2025

Closes #626

@EhabY EhabY force-pushed the show-remote-ssh-output-on-start branch from bbb0391 to 92b1788 Compare October 17, 2025 13:13
@johnstcn johnstcn requested a review from code-asher October 20, 2025 14:11
@EhabY EhabY force-pushed the show-remote-ssh-output-on-start branch from 92b1788 to 719de6f Compare October 22, 2025 08:41
@EhabY EhabY force-pushed the show-remote-ssh-output-on-start branch from 719de6f to fe6d1dc Compare October 23, 2025 10:31
title: "Waiting for the agent to connect...",
location: vscode.ProgressLocation.Notification,
progressTitle: "Waiting for the agent to connect...",
timeoutMs: 3 * 60 * 1000,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to potentially wait forever, should we set a timeout for this? What's a good value 🤔 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is no good timeout value really, agents can take an arbitrary amount of time to connect. As long as there is no error or network disconnect, we should keep waiting.

If we did need a timeout, it should probably be something like where we send a ping and if we do not get a pong back in N seconds after M attempts then we error. I think maybe websockets already do something like that though?

Ultimately though as long as the connection looks good, and we are communicating the status to the user, we should let the user decide whether they want to wait, IMO.

Copy link
Collaborator Author

@EhabY EhabY Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair enough, timing out for something that can potentially take arbitrary amount of time and would lead to the product being unusable is bad. Removed the timeout!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually brings up a question, should we make the progress indicator cancellable? Currently it's not which might make sense since cancelling means we close the remote connection.

Copy link
Member

@code-asher code-asher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not had a chance to try it yet, but it is looking really good!

}
});
.filter((line) => line !== "");
for (const line of lines) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a few places have been changed from forEach to a for of loop, was there a specific reason for this or just preference?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I installed sonarqube which has some default warnings, see https://next.sonarqube.com/sonarqube/coding_rules?open=javascript%3AS7728&rule_key=javascript%3AS7728

It's not really that significant but it's better I suppose

Comment on lines 152 to 156
// This fetches the initial bunch of logs.
const logs = await client.getWorkspaceAgentLogs(agent.id);
for (const log of logs) {
writeEmitter.fire(log.output + "\r\n");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The watch endpoint already streams all the logs, I think (including previous ones, since the default for the after query param is zero which means all logs), so I believe we would not need the extra getWorkspaceAgentLogs call.

Copy link
Collaborator Author

@EhabY EhabY Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this also true for the waitForBuild? because there we also do:

// This fetches the initial bunch of logs.
const logs = await client.getWorkspaceBuildLogs(workspace.latest_build.id);
for (const log of logs) {
	writeEmitter.fire(log.output + "\r\n");
}

const socket = await client.watchBuildLogsByBuildId(
	workspace.latest_build.id,
	logs,
);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be the case, I've removed the initial fetching and the logs seem to be complete for both requests 🤔

title: "Waiting for the agent to connect...",
location: vscode.ProgressLocation.Notification,
progressTitle: "Waiting for the agent to connect...",
timeoutMs: 3 * 60 * 1000,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is no good timeout value really, agents can take an arbitrary amount of time to connect. As long as there is no error or network disconnect, we should keep waiting.

If we did need a timeout, it should probably be something like where we send a ping and if we do not get a pong back in N seconds after M attempts then we error. I think maybe websockets already do something like that though?

Ultimately though as long as the connection looks good, and we are communicating the status to the user, we should let the user decide whether they want to wait, IMO.

try {
terminalSession = new TerminalSession("Agent Log");
const { socket: agentSocket, completion: logsCompletion } =
await streamAgentLogs(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The await here I think means we could possibly miss a state change in between the previous watch and this? Maybe we could store the current state in monitor and check it before registering the onChange.

But, I do feel like we have this state machine that is broken up into separate sections and it could make a lot of sense to put it all together. A single machine that can take a workspace object, explicitly lists out all the possible states, and does the right thing.

For example what if the workspace is stopped while we wait for the scripts? We are only checking the lifecycle state so we would miss that change I think. The web socket would close eventually, although I think we might not be watching for that either (I think this may have already been an issue though, or I am misreading the code haha).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that we have one big state machine that checks the workspace state AND the life cycle state of the agents?

So for example, if we are starting the workspace then handle that as before, if we switch to started then we handle the agent life cycle all in one big switch that is executed on every onChange

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added an agent status/lifecycle state machine and made sure to throw an error if at any point the workspace is not running

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Show agent startup script when connecting to a Workspace

2 participants