Collaborator

@neilcook neilcook commented Nov 4, 2025

Previously, when the send or receive replication queue was full, an error message was logged for every message that could not be queued. Replication traffic can reach thousands or tens of thousands of messages per second, so a full queue produced error logs at the same rate, and that logging caused a doom loop of performance problems.

Instead of logging, this PR adds metrics: a gauge that tracks the per-sibling send queue size, and counters that track failed attempts to add replication messages to the send and receive queues.
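
The pattern can be sketched roughly as follows. This is an illustration in Python, not the code in this PR: the class names `ReplicationMetrics` and `SiblingSendQueue`, the `maxsize` default, and the method names are all hypothetical. The point is only that a full queue increments a counter rather than writing a log line.

```python
import queue
from collections import defaultdict


class ReplicationMetrics:
    """Hypothetical in-memory stand-in for a metrics registry."""

    def __init__(self):
        self.send_queue_size = defaultdict(int)    # gauge, keyed by sibling
        self.send_queue_errors = defaultdict(int)  # counter, keyed by sibling


class SiblingSendQueue:
    """Bounded per-sibling send queue that counts overflows instead of logging."""

    def __init__(self, sibling, metrics, maxsize=5000):  # maxsize is an assumed value
        self.sibling = sibling
        self.metrics = metrics
        self._q = queue.Queue(maxsize=maxsize)
        # Create both series up front so they are exported with value 0.
        metrics.send_queue_size[sibling] = 0
        metrics.send_queue_errors[sibling] += 0

    def enqueue(self, msg):
        try:
            self._q.put_nowait(msg)
        except queue.Full:
            # Hot path: a cheap counter increment, no log I/O per failed message.
            self.metrics.send_queue_errors[self.sibling] += 1
            return False
        self.metrics.send_queue_size[self.sibling] = self._q.qsize()
        return True

    def dequeue(self):
        msg = self._q.get_nowait()
        self.metrics.send_queue_size[self.sibling] = self._q.qsize()
        return msg
```

With this shape, the cost of an overflow stays constant per message no matter how many messages fail, which is the property the logging approach lacked.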

New metrics examples:

# HELP wforce_repl_send_queue_size How full is the replication per-sibling send queue?
# TYPE wforce_repl_send_queue_size gauge
wforce_repl_send_queue_size{sibling="1.2.3.4:1234"} 10
wforce_repl_send_queue_size{sibling="127.0.0.1"} 0
wforce_repl_send_queue_size{sibling="1.2.3.4"} 0
wforce_repl_send_queue_size{sibling="127.0.0.1:1233"} 10

# HELP wforce_replication_send_queue_error_total How many errors trying to add replication messages to the send queue?
# TYPE wforce_replication_send_queue_error_total counter
wforce_replication_send_queue_error_total{sibling="1.2.3.4:1234"} 0
wforce_replication_send_queue_error_total{sibling="127.0.0.1"} 0
wforce_replication_send_queue_error_total{sibling="1.2.3.4"} 0
wforce_replication_send_queue_error_total{sibling="127.0.0.1:1233"} 0

# HELP wforce_replication_rcvd_queue_error_total How many errors trying to add replication msgs to the receive queue?
# TYPE wforce_replication_rcvd_queue_error_total counter
wforce_replication_rcvd_queue_error_total{sibling="1.2.3.4:1234"} 0
wforce_replication_rcvd_queue_error_total{sibling="127.0.0.1"} 0
wforce_replication_rcvd_queue_error_total{sibling="1.2.3.4"} 0
wforce_replication_rcvd_queue_error_total{sibling="127.0.0.1:1233"} 0
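
As one possible way to consume this output, the sketch below parses scraped metrics text and lists the siblings whose queue error counters are non-zero. The helper names `parse_samples` and `siblings_with_errors` are hypothetical, and the parser assumes each sample carries only a `sibling` label, as in the examples above. It is not a general Prometheus parser.

```python
import re

# Matches lines such as: metric_name{sibling="1.2.3.4:1234"} 10
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'\{sibling="(?P<sibling>[^"]*)"\}\s+(?P<value>\S+)$'
)


def parse_samples(text):
    """Return {(metric_name, sibling): value} for every sample line in text."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m:
            samples[(m["name"], m["sibling"])] = float(m["value"])
    return samples


def siblings_with_errors(samples):
    """Return a sorted list of siblings with any non-zero queue error counter."""
    return sorted({
        sibling
        for (name, sibling), value in samples.items()
        if name.endswith("_queue_error_total") and value > 0
    })
```

Because the counters only ever increase, a monitoring rule would normally look at their rate of change rather than the raw value. The check above is the simplest form of that idea.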


github-actions bot commented Nov 5, 2025

Test Results

  2 files  ±0    2 suites  ±0   33m 18s ⏱️ -3s
 73 tests ±0   73 ✅ ±0  0 💤 ±0  0 ❌ ±0 
146 runs  ±0  146 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 7eff70f. ± Comparison against base commit 5d13229.

♻️ This comment has been updated with latest results.
