tekton-operator-proxy-webhook Service selector matches operator pods, causing ~50% webhook admission failures #3227

@bowling233

Environment

  • Tekton Operator version: v0.78.1
  • Kubernetes version: v1.33
  • Platform: Kubernetes (bare-metal)

Description

The tekton-operator-proxy-webhook Service uses name: tekton-operator as
its pod selector. This label is also present on pods of the main
tekton-operator Deployment. As a result, the Service load-balances admission
webhook traffic across both Deployments, even though tekton-operator
pods do not listen on port 8443.

This causes approximately 50% of admission webhook requests to fail with
connection refused. Because the MutatingWebhookConfiguration has
failurePolicy: Fail, each failed call immediately rejects the creation of
the corresponding TaskRun pod.
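The misrouting is easy to see by listing every pod the Service selector matches (the `tekton` namespace is an assumption; substitute your install namespace):

```shell
# Pods matched by the Service selector name=tekton-operator.
# Both the operator pod and the proxy-webhook pod appear, although
# only the latter listens on port 8443.
kubectl get pods -n tekton -l name=tekton-operator \
  -o custom-columns=NAME:.metadata.name,IP:.status.podIP
```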

Steps to Reproduce

  1. Deploy Tekton Operator v0.78.1 on a Kubernetes cluster.
  2. Inspect the endpoints of the tekton-operator-proxy-webhook Service:
     kubectl get endpoints tekton-operator-proxy-webhook -n <namespace>
  3. Observe that the Endpoints list includes pods from both
    tekton-operator and tekton-operator-proxy-webhook Deployments.
  4. Trigger any Pipeline/TaskRun. Observe that roughly half of new TaskRun pod
    creation attempts fail immediately.

Expected Behavior

The tekton-operator-proxy-webhook Service should only route traffic to
tekton-operator-proxy-webhook pods (port 8443). The tekton-operator pods
should never appear in this Service's Endpoints.

Actual Behavior

The Service Endpoints include pods from both Deployments:

# kubectl get endpoints tekton-operator-proxy-webhook -n tekton -o yaml
subsets:
- addresses:
  - ip: 172.26.0.66   # tekton-operator-proxy-webhook pod  ✅ serves on 8443
  - ip: 172.26.1.157  # tekton-operator pod                ❌ does not serve on 8443
  ports:
  - port: 8443

Stress-testing the Service directly via its ClusterIP (20 requests) showed
roughly 50% failing with connection refused (first 10 requests shown):

req-1: PASS (HTTP 415)  req-2: FAIL (000)  req-3: FAIL (000)
req-4: FAIL (000)       req-5: FAIL (000)  req-6: PASS (HTTP 415)
req-7: PASS (HTTP 415)  req-8: FAIL (000)  req-9: PASS (HTTP 415)
req-10: FAIL (000)
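A loop of roughly this shape reproduces the pattern above (a sketch, not the exact script used: it must run from a pod inside the cluster, and the `tekton` namespace is an assumption):

```shell
# Resolve the webhook Service ClusterIP, then fire 20 POSTs at it.
# A healthy webhook backend answers the empty POST (HTTP 415);
# the operator pod refuses the connection, which curl reports as 000.
SVC_IP=$(kubectl get svc tekton-operator-proxy-webhook -n tekton \
  -o jsonpath='{.spec.clusterIP}')
for i in $(seq 1 20); do
  code=$(curl -sk -o /dev/null -w '%{http_code}' --max-time 2 \
    -X POST "https://${SVC_IP}:443/defaulting")
  echo "req-${i}: ${code}"
done
```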

The failure manifests as the following error when creating TaskRun pods:

failed to create task run pod "<pod-name>":
Internal error occurred: failed calling webhook "proxy.operator.tekton.dev":
failed to call webhook:
Post "https://tekton-operator-proxy-webhook.<ns>.svc:443/defaulting?timeout=10s":
dial tcp <ClusterIP>:443: connect: connection refused

Note: the error appends a misleading hint ("Maybe missing or invalid Task
…") that does not reflect the real cause.

Root Cause

Both Deployments use the same pod template label name: tekton-operator:

tekton-operator Deployment (config/kubernetes/base/operator.yaml):

selector:
  matchLabels:
    name: tekton-operator   # ← same label
template:
  metadata:
    labels:
      name: tekton-operator # ← same label

tekton-operator-proxy-webhook Deployment
(cmd/kubernetes/operator/kodata/webhook/webhook.yaml):

selector:
  matchLabels:
    name: tekton-operator   # ← collision!
template:
  metadata:
    labels:
      name: tekton-operator # ← collision!

tekton-operator-proxy-webhook Service:

selector:
  name: tekton-operator     # ← matches both Deployments!

The same issue exists in the OpenShift manifest
(cmd/openshift/operator/kodata/webhook/webhook.yaml).
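The collision can also be confirmed on the live objects (namespace `tekton` assumed) by printing each Deployment's pod-template labels:

```shell
# Both Deployments carry name: tekton-operator in their pod templates,
# so the Service selector matches pods from both.
kubectl get deployments -n tekton \
  -o custom-columns=NAME:.metadata.name,POD-LABELS:.spec.template.metadata.labels
```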

Proposed Fix

Change the proxy-webhook Deployment's matchLabels selector and pod template
label from name: tekton-operator to name: tekton-operator-proxy-webhook,
and update the Service selector to match. The existing app: tekton-operator
label remains unchanged. Note that a Deployment's spec.selector is
immutable, so applying this change requires deleting and recreating the
proxy-webhook Deployment.
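Sketched as a manifest fragment (field layout abbreviated; only the changed labels are shown, the exact surrounding spec is per the files referenced above):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tekton-operator-proxy-webhook
spec:
  selector:
    matchLabels:
      name: tekton-operator-proxy-webhook   # was: tekton-operator
  template:
    metadata:
      labels:
        name: tekton-operator-proxy-webhook # was: tekton-operator
---
apiVersion: v1
kind: Service
metadata:
  name: tekton-operator-proxy-webhook
spec:
  selector:
    name: tekton-operator-proxy-webhook     # was: tekton-operator
```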

I have a patch ready and will submit a PR.
