fix: proxy-webhook selector matches operator pods#3228
fix: proxy-webhook selector matches operator pods#3228bowling233 wants to merge 1 commit intotektoncd:mainfrom
Conversation
Both the `tekton-operator` and `tekton-operator-proxy-webhook` Deployments label their Pods with `name: tekton-operator`. The `tekton-operator-proxy-webhook` Service uses this same label as its only selector, so it inadvertently load-balances traffic across both Deployments. Because `tekton-operator` pods do not serve on port 8443, ~50% of admission webhook requests fail: failed calling webhook "proxy.operator.tekton.dev": Post ".../tekton-operator-proxy-webhook.../defaulting": dial tcp <ClusterIP>:443: connect: connection refused Because MutatingWebhookConfiguration has `failurePolicy: Fail`, each such failure immediately rejects TaskRun Pod creation. Rename the proxy-webhook Deployment's selector matchLabels and pod template label from `name: tekton-operator` to `name: tekton-operator-proxy-webhook`, and update the Service selector to match. The `app: tekton-operator` label is left unchanged. Applies to both Kubernetes and OpenShift manifests. Adding a set-based (NotIn) expression to the Service selector instead was not viable as Kubernetes Services only support equality-based (matchLabels) selectors.
|
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Fixes a production routing bug where the tekton-operator-proxy-webhook Service selector unintentionally matched both proxy-webhook and main operator pods, causing intermittent admission webhook failures and rejected TaskRun Pod creation.
Changes:
- Update the proxy-webhook Deployment selector + pod template label to use
name: tekton-operator-proxy-webhook. - Update the proxy-webhook Service selector to match the new pod label (Kubernetes + OpenShift manifests).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cmd/kubernetes/operator/kodata/webhook/webhook.yaml | Aligns proxy-webhook Deployment/Service selectors to target only proxy-webhook pods on Kubernetes. |
| cmd/openshift/operator/kodata/webhook/webhook.yaml | Same selector/label fix for the OpenShift manifest. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@bowling233 , thank for your PR.
|
Changes
Fixes #3227
Both the
tekton-operatorandtekton-operator-proxy-webhookDeploymentslabel their Pods with
name: tekton-operator. Thetekton-operator-proxy-webhookService uses this same label as its onlyselector, so it inadvertently load-balances traffic across both Deployments.
Because
tekton-operatorpods do not serve on port 8443, ~50% of admissionwebhook requests fail with
connection refused. Since theMutatingWebhookConfiguration has
failurePolicy: Fail, each failureimmediately rejects TaskRun Pod creation.
Changes:
cmd/kubernetes/operator/kodata/webhook/webhook.yaml: rename theproxy-webhook Deployment's
matchLabelsselector and pod template labelfrom
name: tekton-operatortoname: tekton-operator-proxy-webhook;update the Service
selectorto match.cmd/openshift/operator/kodata/webhook/webhook.yaml: same change for theOpenShift manifest.
The existing
app: tekton-operatorlabel is preserved on both Deployments.No other resources are affected.
Alternative considered: adding a set-based (
NotIn) expression to theService selector to exclude
tekton-operatorpods. This was not viablebecause Kubernetes Services only support equality-based (
matchLabels)selectors.
Submitter Checklist
These are the criteria that every PR should meet, please check them off as you
review them:
make test lintbefore submitting a PRSee the contribution guide for more details.
Release Notes