feat(alertd): add database health check with database-down event#274

Merged: passcod merged 3 commits into main from feat/alertd-database-health-check on Mar 17, 2026

passcod commented Mar 17, 2026

Summary

Adds a periodic health check that monitors the PostgreSQL database connection and triggers an internal database-down event when the database becomes unreachable.

Changes

New database-down event type

  • events.rs: DatabaseDown variant on EventType (serialized as "database-down") and on EventContext (carries database_url and error_message). Includes a default template that mentions the redacted database URL, the error details, and the fact that all SQL-based alerts are non-functional.
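
To illustrate the string form of the new event type, here is a minimal, dependency-free sketch. It is not the crate's actual code: the real `EventType` has other variants and its own serialization mechanism, which this PR description does not show. The `Display`/`FromStr` implementations and the `DatabaseDownContext` struct are assumptions made for illustration. Only the `"database-down"` string and the two field names come from the description above.

```rust
use std::fmt;
use std::str::FromStr;

/// Illustrative stand-in for the crate's event type; only the variant
/// described in this PR is shown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventType {
    DatabaseDown,
}

impl fmt::Display for EventType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The wire/YAML form named in the PR description.
            EventType::DatabaseDown => f.write_str("database-down"),
        }
    }
}

impl FromStr for EventType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "database-down" => Ok(EventType::DatabaseDown),
            other => Err(format!("unknown event type: {}", other)),
        }
    }
}

/// Hypothetical context payload; the field names follow the PR description,
/// the struct name and shape are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DatabaseDownContext {
    /// Database URL with the password already redacted.
    pub database_url: String,
    /// Error reported by the failed health check.
    pub error_message: String,
}
```

Round-tripping through `to_string()` and `parse::<EventType>()` is the property the "event type parsing and serialization" tests would exercise in some form.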

Periodic health check in the daemon

  • daemon.rs — Spawns a task that runs every 30 seconds:
    • Acquires a connection with pool.get_timeout(5s), then runs SELECT 1
    • Tracks was_down state so the event only fires on the healthy → unhealthy transition (no repeated alerts)
    • Logs when the database connection is restored
    • Redacts the password from the database URL before including it in the alert context
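
The transition tracking described above can be sketched as a small state machine, kept separate from the pool and the timer so it is easy to test. This is an illustration under assumptions, not the code in daemon.rs: the `HealthState` and `Transition` names are hypothetical, and only the `was_down` flag and the "fire once on healthy → unhealthy, log on recovery" behaviour come from the description.

```rust
/// What the caller should do after one health probe (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Transition {
    /// healthy -> unhealthy: fire the database-down event once.
    WentDown,
    /// unhealthy -> healthy: log that the connection is restored.
    Recovered,
    /// No state change: stay quiet, so repeated failures do not re-alert.
    Unchanged,
}

/// Remembers the outcome of the previous probe.
#[derive(Debug, Default)]
pub struct HealthState {
    was_down: bool,
}

impl HealthState {
    /// Records the latest probe result and reports whether state changed.
    pub fn observe<E>(&mut self, probe: &Result<(), E>) -> Transition {
        let is_down = probe.is_err();
        let transition = match (self.was_down, is_down) {
            (false, true) => Transition::WentDown,
            (true, false) => Transition::Recovered,
            _ => Transition::Unchanged,
        };
        self.was_down = is_down;
        transition
    }
}
```

In a daemon loop, each 30-second tick would feed the probe result into `observe` and act only on `WentDown` or `Recovered`.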

Re-exports

  • lib.rs — Re-exports AlwaysSend, WhenChanged, TargetConnection, and TargetEmail for downstream/test use.

Tests

  • tests/database_health.rs — 9 new tests:
    • Event type parsing and serialization
    • Alert definition with event: database-down
    • Default template rendering with expected content
    • Target resolution for database-down event alerts
    • Health check detection of unreachable databases
    • Health check success on valid databases
    • URL password redaction (with password, without password, unparseable)
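
The three redaction cases listed above (with password, without password, unparseable) can be illustrated with a standard-library-only sketch. The function name `redact_password`, the `***` mask, and the placeholder returned for unparseable input are all assumptions; the PR description does not say how the real implementation parses URLs or what it returns when parsing fails.

```rust
/// Placeholder for input that cannot be parsed (an assumption of this sketch).
const UNPARSEABLE: &str = "<unparseable database URL>";

/// Replaces the password in `scheme://user:password@host/...` with `***`.
/// Assumes special characters in the credentials are percent-encoded.
pub fn redact_password(url: &str) -> String {
    // Without a scheme separator, refuse to echo the input back.
    let (scheme, rest) = match url.split_once("://") {
        Some(parts) => parts,
        None => return UNPARSEABLE.to_string(),
    };

    // The authority section ends at the first '/', '?' or '#'.
    let end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(end);

    // An '@' after the authority suggests unencoded credentials; be
    // conservative and do not risk leaking part of a password.
    if tail.contains('@') {
        return UNPARSEABLE.to_string();
    }

    // Credentials, if present, sit before the last '@' of the authority.
    let (userinfo, host) = match authority.rsplit_once('@') {
        Some(parts) => parts,
        None => return url.to_string(), // no credentials at all
    };

    match userinfo.split_once(':') {
        Some((user, _password)) => format!("{}://{}:***@{}{}", scheme, user, host, tail),
        None => url.to_string(), // user name only, nothing to redact
    }
}
```

Returning a fixed placeholder for unparseable input is a deliberately cautious choice in this sketch: an alert should never carry a string that might still contain a password.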

Users can also define custom alert definitions for event: database-down in their alert YAML files to override the default template or route to specific targets.

passcod added 3 commits March 17, 2026 13:56
Add a periodic health check (every 30 seconds) that monitors the
PostgreSQL database connection. When the database becomes unreachable,
a database-down internal event is triggered and sent to the configured
default target with a built-in template.

The health check:
- Runs SELECT 1 with a 5-second timeout every 30 seconds
- Tracks state to only alert on transitions (healthy -> unhealthy)
- Logs when the database connection is restored
- Redacts passwords from the database URL in alert context

Also adds DatabaseDown variant to EventType and EventContext, a default
template for the event, and re-exports AlwaysSend, WhenChanged,
TargetConnection, and TargetEmail from the crate.

Generated-With: claude-opus-4
passcod merged commit 26e0b32 into main on Mar 17, 2026
15 checks passed
passcod deleted the feat/alertd-database-health-check branch on March 17, 2026 03:33