introduce timeout factor to control invoker queue behavior #3767
min = 100 ms
max = 5 m
std = 1 m
timeoutfactor = 1
Can you add a comment, as this knob isn’t obvious?
Done
Codecov Report
@@ Coverage Diff @@
## master #3767 +/- ##
=========================================
- Coverage 75.84% 75.14% -0.7%
=========================================
Files 145 132 -13
Lines 6921 6143 -778
Branches 410 374 -36
=========================================
- Hits 5249 4616 -633
+ Misses 1672 1527 -145
"CONFIG_whisk_timeLimit_min": "{{ limit_action_time_min | default() }}"
"CONFIG_whisk_timeLimit_max": "{{ limit_action_time_max | default() }}"
"CONFIG_whisk_timeLimit_std": "{{ limit_action_time_std | default() }}"
"CONFIG_whisk_timeLimit_timeoutfactor": "{{ limit_action_timeoutfactor | default() }}"
Should this rather be part of the loadbalancer config since it is specific to the loadbalancer after all?
If an active-ack does not appear within a certain timeout (action timeout + 1 minute), the active-ack is considered lost and we "force" it to keep the loadbalancer's state sane. In some cases where the system gets increasingly slow, this timeout is too narrow. This change makes the value configurable so that it can be adjusted rapidly and a good default can be found in production environments. Co-authored-by: Markus Thömmes <markusthoemmes@me.com>
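As a rough illustration of the timeout described in this commit message, the sketch below computes the forced active-ack deadline from an action timeout and the new factor. It assumes the factor scales only the action timeout while the one-minute grace period stays fixed; the conversation does not spell out the exact formula. The names `forced_ack_timeout` and `GRACE` are hypothetical and not taken from the OpenWhisk code base.

```python
from datetime import timedelta

# Assumption: the fixed grace period is the "+ 1 minute" from the commit message.
GRACE = timedelta(minutes=1)


def forced_ack_timeout(action_timeout: timedelta, timeout_factor: float = 1.0) -> timedelta:
    """Return how long to wait for an active-ack before forcing it.

    Assumed formula (not confirmed by the PR text):
        action_timeout * timeout_factor + 1 minute
    """
    if timeout_factor <= 0:
        raise ValueError("timeout_factor must be positive")
    return action_timeout * timeout_factor + GRACE


if __name__ == "__main__":
    std = timedelta(minutes=1)  # matches "std = 1 m" in the reviewed config
    # With the default factor of 1 the behavior equals the previous fixed rule.
    print(forced_ack_timeout(std))
    # A larger factor widens the window on a slow system.
    print(forced_ack_timeout(std, timeout_factor=3))
```

With `timeoutfactor = 1` this reduces to the previous "action timeout + 1 minute" rule, which is presumably why 1 was chosen as the default.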
PG3 2561 🔵
markusthoemmes left a comment
LGTM
This PR introduces a configurable multiplier that influences the forced active-ack timeout period.
Description
In rare conditions, the load balancer puts activations into the queues of invokers that are falsely identified as having free capacity, because forced active-acks are sent out for activations that are still waiting in the queue to be executed. By increasing the timeout, the loadbalancer can be influenced to distribute the load more widely in these cases.
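The failure mode described above can be sketched as a simple check: a forced active-ack fires while the activation is still queued whenever the queue wait exceeds the forced-ack deadline. This is an illustrative model only. The function name, the use of seconds, and the assumed formula (action timeout times factor, plus a 60-second grace period) are assumptions rather than the actual loadbalancer logic.

```python
def forced_while_queued(
    queue_wait_s: float,
    action_timeout_s: float,
    timeout_factor: float = 1.0,
    grace_s: float = 60.0,
) -> bool:
    """Return True if the forced active-ack would fire before the activation
    has even left the invoker queue.

    Assumed deadline (hypothetical): action_timeout_s * timeout_factor + grace_s.
    When this returns True, the loadbalancer would free the slot and see
    capacity on an invoker that is in fact still backed up.
    """
    deadline_s = action_timeout_s * timeout_factor + grace_s
    return queue_wait_s > deadline_s


if __name__ == "__main__":
    # An activation with a 60 s action timeout that waits 150 s in the queue.
    print(forced_while_queued(150, 60))                    # factor 1: deadline 120 s
    print(forced_while_queued(150, 60, timeout_factor=2))  # factor 2: deadline 180 s
```

In this model, raising the factor from 1 to 2 keeps the slot marked as occupied for the queued activation, so new activations would be routed to other invokers instead.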
Related issue and scope
My changes affect the following components
Types of changes
Checklist:
[PG3:2420 succeeded]