-
Notifications
You must be signed in to change notification settings - Fork 160
fix: update workload status when health check fails for remote MCP servers #3077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: update workload status when health check fails for remote MCP servers #3077
Conversation
f60f1fc to
59d6d41
Compare
When a remote MCP server becomes unreachable, the health check fails and the proxy is stopped. However, the workload status was not being updated, causing thv list to report the server as running even though nothing was listening on the port. This fix adds a callback mechanism that notifies the runner when a health check fails, allowing it to update the workload status to unhealthy. Closes stacklok#3079 Signed-off-by: Frederic Le Feurmou <fredericlefeurmou@gmail.com>
59d6d41 to
9f7dc5e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a "ghost status" issue where remote MCP servers that become unreachable continue to show as "running" in thv list, even though the health check has failed and the proxy has stopped. The solution adds a callback mechanism that allows the proxy to notify the runner when health checks fail, enabling proper workload status updates.
Key Changes:
- Added
HealthCheckFailedCallbacktype to allow notification of health check failures - Integrated callback into the transparent proxy health monitoring system
- Connected the callback to the status manager in the runner for remote servers
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/transport/types/transport.go | Defines HealthCheckFailedCallback type and adds OnHealthCheckFailed field to Config struct |
| pkg/transport/proxy/transparent/transparent_proxy.go | Adds onHealthCheckFailed callback parameter to constructor and invokes it when health check fails |
| pkg/transport/proxy/transparent/transparent_test.go | Updates all test calls to NewTransparentProxy with nil callback parameter |
| pkg/transport/http.go | Adds SetOnHealthCheckFailed setter method and passes callback to TransparentProxy |
| pkg/runner/runner.go | Sets health check failure callback for remote servers to update workload status to unhealthy |
| cmd/thv/app/proxy.go | Passes nil callback for local proxy command (not needed for local proxies) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Removed the unused `OnHealthCheckFailed HealthCheckFailedCallback` code Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3077 +/- ##
==========================================
- Coverage 56.82% 56.80% -0.03%
==========================================
Files 335 335
Lines 33474 33492 +18
==========================================
+ Hits 19022 19025 +3
- Misses 12868 12883 +15
Partials 1584 1584 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Closes #3079
Problem
When a remote MCP server becomes unreachable (e.g., Datadog MCP), the health check fails and the proxy is stopped. However, the workload status is not updated, causing
thv listto report the server as "running" even though nothing is listening on the port.This creates a "ghost status" problem where:
thv listshows the server as "running"lsof -i :PORTshows nothing listeningSolution
Add a callback mechanism that notifies the runner when a health check fails, allowing it to update the workload status to
unhealthy.Changes
HealthCheckFailedCallbacktype to transport typesonHealthCheckFailedcallback toTransparentProxyHTTPTransporttoTransparentProxymonitorHealth()before stopping proxyResult
When a remote MCP server dies:
unhealthythv listshows accurate status