feat(alertd): add database health check with database-down event #274
Merged
Add a periodic health check (every 30 seconds) that monitors the PostgreSQL database connection. When the database becomes unreachable, a database-down internal event is triggered and sent to the configured default target with a built-in template.

The health check:
- Runs SELECT 1 with a 5-second timeout every 30 seconds
- Tracks state to only alert on transitions (healthy -> unhealthy)
- Logs when the database connection is restored
- Redacts passwords from the database URL in alert context

Also adds DatabaseDown variant to EventType and EventContext, a default template for the event, and re-exports AlwaysSend, WhenChanged, TargetConnection, and TargetEmail from the crate.

Generated-With: claude-opus-4
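To illustrate the password redaction mentioned in the commit message, here is a minimal std-only sketch. The function name `redact_password` and the `***` placeholder are assumptions made for this example, not taken from the PR. The actual change may use a URL-parsing crate instead. The sketch assumes a well-formed `scheme://user:password@host/...` URL in which any special characters in the password are percent-encoded.

```rust
/// Hypothetical helper: replace the password in a
/// `scheme://user:password@host/...` URL with `***`.
/// URLs without a password are returned unchanged.
fn redact_password(url: &str) -> String {
    // The authority section starts right after "://".
    let Some(scheme_end) = url.find("://") else {
        return url.to_string();
    };
    let authority_start = scheme_end + 3;
    let rest = &url[authority_start..];

    // The authority ends at the first '/', '?' or '#', or at the end of the URL.
    let authority_end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..authority_end];

    // Userinfo is everything before the last '@' inside the authority.
    let Some(at) = authority.rfind('@') else {
        return url.to_string();
    };
    let userinfo = &authority[..at];

    // The password follows the first ':' in the userinfo.
    let Some(colon) = userinfo.find(':') else {
        return url.to_string();
    };

    // Keep scheme and user, mask the password, keep '@host...' as-is.
    format!(
        "{}{}:***{}",
        &url[..authority_start],
        &userinfo[..colon],
        &rest[at..]
    )
}
```

With this sketch, `postgres://alert:s3cret@db.example.com:5432/alerts` becomes `postgres://alert:***@db.example.com:5432/alerts`, so the host and database name stay visible in the alert context while the credential does not.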
Summary
Adds a periodic health check that monitors the PostgreSQL database connection and triggers an internal `database-down` event when the database becomes unreachable.

Changes

New `database-down` event type
- `events.rs`: `DatabaseDown` variant on `EventType` (serialized as `"database-down"`) and `EventContext` (carries `database_url` and `error_message`). Includes a default template mentioning the redacted DB URL, error details, and that all SQL-based alerts are non-functional.

Periodic health check in the daemon
- `daemon.rs`: Spawns a task that runs every 30 seconds:
  - `pool.get_timeout(5s)` then `SELECT 1`
  - Tracks `was_down` state so the event only fires on the healthy → unhealthy transition (no repeated alerts)

Re-exports
- `lib.rs`: Re-exports `AlwaysSend`, `WhenChanged`, `TargetConnection`, and `TargetEmail` for downstream/test use.

Tests
- `tests/database_health.rs`: 9 new tests

Users can also define custom alert definitions for `event: database-down` in their alert YAML files to override the default template or route to specific targets.
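The transition-only alerting that the health check performs with its `was_down` flag can be sketched in isolation as a small state tracker. This is a simplified illustration under stated assumptions: the names `HealthTracker`, `Transition`, and `observe` are hypothetical, and the probe result is modelled as a plain `Result<(), String>` rather than the real pool checkout plus `SELECT 1`. The actual task in `daemon.rs` may be structured differently.

```rust
/// Outcome of feeding one probe result into the tracker (hypothetical type).
#[derive(Debug, PartialEq)]
enum Transition {
    /// Healthy -> unhealthy: the only case that should emit `database-down`.
    WentDown(String),
    /// Unhealthy -> healthy: worth a log line, but no event.
    Recovered,
    /// State unchanged since the previous check.
    NoChange,
}

/// Remembers whether the previous probe failed (the `was_down` flag).
#[derive(Debug, Default)]
struct HealthTracker {
    was_down: bool,
}

impl HealthTracker {
    /// Record one probe result and report whether the state changed.
    fn observe(&mut self, probe: Result<(), String>) -> Transition {
        match (self.was_down, probe) {
            // First failure after a healthy period: flip the flag and alert.
            (false, Err(error_message)) => {
                self.was_down = true;
                Transition::WentDown(error_message)
            }
            // First success after an outage: flip the flag back.
            (true, Ok(())) => {
                self.was_down = false;
                Transition::Recovered
            }
            // Still healthy or still down: nothing to report.
            _ => Transition::NoChange,
        }
    }
}
```

A caller would invoke `observe` once per 30-second tick. It would send the event only on `WentDown`, and log the restored connection on `Recovered`. Repeated failures during one outage yield `NoChange`, which is what prevents repeated alerts.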