
Conversation

@valerymo
Contributor

@valerymo valerymo commented Mar 27, 2023

WHAT

Jira: https://issues.redhat.com/browse/THREESCALE-8997
Allow defining the pod priority of components via APIManager CR

Added an optional priorityClassName definition for the following 3scale pods (testing WIP):

Component            Fresh install adds priorityClassName    Reconcile priorityClassName
apicast-production   [X]                                      [X]
apicast-staging      [X]                                      [X]
backend-cron         [X]                                      [X]
backend-listener     [X]                                      [X]
backend-worker       [X]                                      [X]
backend-redis        [X]                                      [X]
system-app           [X]                                      [X]
system-sidekiq       [X]                                      [X]
system-sphinx        [X]                                      [X]
system-searchd       [X]                                      [X]
system-memcache      [X]                                      [X]
system-mysql         [X]                                      [X]
system-postgresql    [X]                                      [X]
system-redis         [X]                                      [X]
zync                 [X]                                      [X]
zync-database        [X]                                      [X]
zync-que             [X]                                      [X]

APIManager CR example for apicast-staging (see the Validation section for a CR example covering all other DCs/pods):

apiVersion: apps.3scale.net/v1alpha1
kind: APIManager
metadata:
  name: example-apimanager
spec:
  wildcardDomain: <wildcardDomain>
  apicast:
    stagingSpec:
      priorityClassName: system-node-critical
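
For reference, this is surfaced as an optional field in the APIManager spec types. A minimal sketch of the shape, with illustrative struct and package names (the real spec structs in the operator may differ):

// Sketch only: names are illustrative, not the operator's exact ones.
package v1alpha1

// ApicastStagingSpec stands in for the apicast staging block of the APIManager spec.
type ApicastStagingSpec struct {
	// PriorityClassName, when set, is copied into the pod template of the
	// apicast-staging DeploymentConfig. When omitted, no priorityClassName is
	// written and the pod keeps the default priority (0).
	// +optional
	PriorityClassName *string `json:"priorityClassName,omitempty"`
}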

S3 secret example

kind: Secret
apiVersion: v1
metadata: 
  name: s3-credentials
data: 
  AWS_ACCESS_KEY_ID: QUtXXXXXXXXXXXX=
  AWS_SECRET_ACCESS_KEY: bmZl=
  AWS_BUCKET: dGV=
  AWS_REGION: dXMt==
type: Opaque

Validation

Install 3scale operator

$ cd 3scale-operator
$ make install
$ export NAMESPACE=3scale-test
$ oc new-project $NAMESPACE
$ make download
$ make run

The Secret and CR are installed from a separate terminal window.

Secrets & CRs for test

  • Secret's AWS fields are base64-encoded: echo ABC | base64
oc apply -f - <<EOF
---
kind: Secret
apiVersion: v1
metadata: 
  name: s3-credentials
data: 
  AWS_ACCESS_KEY_ID: QUtJQVYXXXXXX=
  AWS_SECRET_ACCESS_KEY: bmZX=
  AWS_BUCKET: dGX=
  AWS_REGION: dXX==
type: Opaque
EOF
  • Sample CR for testing 3scale pods
apiVersion: apps.3scale.net/v1alpha1
kind: APIManager
metadata:
    name: example-apimanager
spec:
    wildcardDomain: api.vmogilev01.0nno.s1.devshift.org
    resourceRequirementsEnabled: false
    apicast:
        stagingSpec:
            priorityClassName: openshift-user-critical
        productionSpec:
            priorityClassName: openshift-user-critical
    backend:
        listenerSpec:
            priorityClassName: openshift-user-critical
        cronSpec:
            priorityClassName: openshift-user-critical
        workerSpec:
            priorityClassName: openshift-user-critical
        redisPriorityClassName: openshift-user-critical
    system:
        appSpec:
            priorityClassName: openshift-user-critical
        sidekiqSpec:
            priorityClassName: openshift-user-critical
        sphinxSpec:
            priorityClassName: openshift-user-critical
        searchdSpec:
            priorityClassName: openshift-user-critical
        memcachedPriorityClassName: openshift-user-critical
        redisPriorityClassName: openshift-user-critical
        database:
            postgresql:
                priorityClassName: openshift-user-critical                
    zync:
        appSpec:
            priorityClassName: openshift-user-critical
        queSpec:
            priorityClassName: openshift-user-critical
        databasePriorityClassName: openshift-user-critical

Describe the pods and verify that priorityClassName is set:

$ oc describe pod apicast-staging-1-xxxx |grep "Priority Class Name"
Priority Class Name:  system-node-critical

$ oc describe pod system-memcache-1-xxxx |grep "Priority Class Name"
Priority Class Name:  system-node-critical

$ oc describe pod backend-redis-2-xxxx |grep "Priority Class Name"
Priority Class Name:  system-node-critical

Check the other pods similarly, or list all DCs at once:

kubectl get deploymentconfigs -o name | xargs -I {} bash -c "echo -n '{} -> priorityClassName: '; kubectl get {} -o jsonpath='{.spec.template.spec.priorityClassName}' | yq e -P"

@MStokluska
Contributor

In my opinion, I would pull the priorityClassName from within each deployment config function rather than from the reconciler. Is systemOptions something we could leverage here to stay consistent?
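
A rough sketch of that idea, with illustrative names (ApicastOptions, StagingDeploymentConfig) rather than the operator's exact ones: the value travels with the component options and is applied inside the DC builder, not in the reconciler.

package component

import (
	appsv1 "github.com/openshift/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)

// ApicastOptions is a stand-in for the component options struct.
type ApicastOptions struct {
	StagingPriorityClassName string
}

type Apicast struct {
	Options *ApicastOptions
}

// StagingDeploymentConfig builds the DC and sets priorityClassName from the options.
func (a *Apicast) StagingDeploymentConfig() *appsv1.DeploymentConfig {
	dc := &appsv1.DeploymentConfig{
		Spec: appsv1.DeploymentConfigSpec{
			Template: &corev1.PodTemplateSpec{},
		},
	}
	// Only set the field when it was provided via the APIManager CR,
	// so pods keep the default priority (0) otherwise.
	if a.Options.StagingPriorityClassName != "" {
		dc.Spec.Template.Spec.PriorityClassName = a.Options.StagingPriorityClassName
	}
	return dc
}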

@valerymo valerymo force-pushed the vmo_THREESCALE-8997 branch 2 times, most recently from 305c335 to cc8bc83 on April 3, 2023 09:05
@valerymo valerymo force-pushed the vmo_THREESCALE-8997 branch 5 times, most recently from 8cead92 to e35bbfc on April 4, 2023 13:58
@valerymo
Contributor Author

valerymo commented Apr 4, 2023

Hello @eguzki, @MStokluska, I'd like to submit this PR for review. I hope it is now implemented as Eguzki suggested. I will be on PTO from tomorrow, so I can't address your comments right away, but any notes are appreciated and I can address them at the end of next week.
I'd like to note that I'm still trying to test it; there seems to be some configuration or environment issue on my side (I hope), as I can't see the priority changing for the apicast-staging pod.
Thank you for your review and help.

@valerymo valerymo changed the title from "[WIP] THREESCALE-8997 Allow defining the pod priority of components via APIManager CR" to "THREESCALE-8997 Allow defining the pod priority of components via APIManager CR" on Apr 4, 2023
@valerymo
Contributor Author

Hello @eguzki, @MStokluska, just wanted to say that it's working for me now (I found an annoying issue where "PriorityClass" had a typo, which was the cause of the problem). It's now tested for apicast-staging (as shown in the validation notes in the description). I will try to do more tests, but wanted to ask you to review if you have time. Thank you

@eguzki
Member

eguzki commented Apr 14, 2023

Upgrade procedures are missing. As an example, I implemented the upgrade for the apicast-staging DC; the others should be done in a similar way: valerymo#3

@valerymo
Contributor Author

Hi @eguzki. The Priority Class Name upgrade is implemented. Thank you very much for the example.

@valerymo valerymo force-pushed the vmo_THREESCALE-8997 branch 2 times, most recently from bc8ca0e to 928d987 on April 17, 2023 06:50
@eguzki
Member

eguzki commented Apr 17, 2023

/test test-e2e

Member

@eguzki eguzki left a comment


Missing upgrade for:

  • backend-cron
  • backend-listener
  • backend-worker

@valerymo
Contributor Author

valerymo commented May 1, 2023

Missing upgrade for:

  • backend-cron
  • backend-listener
  • backend-worker

@eguzki, sorry for the delay in addressing the comments (PTO + other things ...). DeploymentConfigPriorityClassMutator was added to GenericBackendMutators; is that not right?
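
For context, that mutator follows the usual desired/existing pattern used for reconciling DC fields. A minimal sketch; the exact signature in the operator may differ:

package reconcilers

import (
	appsv1 "github.com/openshift/api/apps/v1"
)

// DeploymentConfigPriorityClassMutator copies the desired priorityClassName onto the
// existing DeploymentConfig and reports whether anything changed, so the reconciler
// only issues an update when needed. It assumes both DCs have a pod template set.
func DeploymentConfigPriorityClassMutator(desired, existing *appsv1.DeploymentConfig) (bool, error) {
	updated := false
	if existing.Spec.Template.Spec.PriorityClassName != desired.Spec.Template.Spec.PriorityClassName {
		existing.Spec.Template.Spec.PriorityClassName = desired.Spec.Template.Spec.PriorityClassName
		updated = true
	}
	return updated, nil
}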

@valerymo valerymo force-pushed the vmo_THREESCALE-8997 branch 3 times, most recently from ac572ec to f519ecd on May 1, 2023 11:53
@valerymo valerymo force-pushed the vmo_THREESCALE-8997 branch from f519ecd to 267eb4b on May 1, 2023 12:05
@valerymo
Contributor Author

valerymo commented May 1, 2023

Hello @eguzki, the test was done after the fix as follows:

  1. Initially, the Priority Class was set to system-node-critical for all components in the APIManager.
  2. Pods checked: backend-cron, backend-listener, backend-worker, system-mysql, apicast-staging. Example check:
$ oc describe pod backend-cron-1-g9gkr |grep -i prior
Priority:             2000001000
Priority Class Name:  system-node-critical

All pods had the Priority Class system-node-critical.
  3. The APIManager was edited and the priority changed to openshift-user-critical.
As a result, the pods were recreated and the priority was changed, as in this example:

$ oc describe pod system-mysql-2-l6wb7 |grep -i prior
Priority:             1000000000
Priority Class Name:  openshift-user-critical
$ oc describe pod backend-listener-2-v5g68 |grep -i prior
Priority:             1000000000
Priority Class Name:  openshift-user-critical

I hope this is the right test of the upgrade; if it's the wrong test, could you please advise how to check it?
Could you please take a look and review it? Thank you for the useful comments and help.

@valerymo
Contributor Author

valerymo commented May 3, 2023

Hello @eguzki, I hope I've addressed all your comments. Thank you very much for your comments and help! I hope the testing table can now be completed. Could you please take a look when you have time. Thank you!

@eguzki
Member

eguzki commented May 3, 2023

The system-sphinx component has been deprecated (not dropped for now); the replacement is system-searchd. The priority class name should be implemented for fresh install and also upgrade. The upgrade is not strictly needed, but it provides the reconcile feature, so that if the APIManager spec is changed, the deployment will also change.

@eguzki
Member

eguzki commented May 4, 2023

backend-redis does not get the priority class name spec from the APIManager in fresh install mode

@valerymo
Contributor Author

valerymo commented May 4, 2023

The system-sphinx component has been deprecated (not dropped for now); the replacement is system-searchd. The priority class name should be implemented for fresh install and also upgrade. The upgrade is not strictly needed, but it provides the reconcile feature, so that if the APIManager spec is changed, the deployment will also change.

It's implemented now, but a problem was found in the upgrade: the pod rolling update is not working. Please see the log below.

$ oc describe pod system-searchd-1-d4t2t |grep Prio
Priority:             2000001000
Priority Class Name:  system-node-critical

$ oc edit APIManager example-apimanager
apimanager.apps.3scale.net/example-apimanager edited

$ oc get pod |grep searchd
system-searchd-1-d4t2t        1/1     Running             0               4m59s
system-searchd-1-deploy       0/1     Completed           0               5m
system-searchd-2-8bgpm        0/1     ContainerCreating   0               85s
system-searchd-2-deploy       1/1     Running             0               87s


**** Problem with searchd pod rolling update:
$ oc describe pod system-searchd-2-8bgpm
  Warning  FailedAttachVolume  11m                    attachdetach-controller  Multi-Attach error for volume "pvc-62ec0cb3-91ea-4d43-91c5-c09dd80143fa" Volume is already used by pod(s) system-searchd-1-d4t2t
  Warning  FailedMount         3m10s (x2 over 5m13s)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[system-searchd-database], unattached volumes=[kube-api-access-cccgf system-searchd-database]: timed out waiting for the condition


@valerymo
Contributor Author

valerymo commented May 4, 2023

backend-redis does not get the priority class name spec from the APIManager in fresh install mode

Fixed. Thank you!

$ kubectl get deploymentconfigs -o name | xargs -I {} bash -c "echo -n '{} -> priorityClassName: '; kubectl get {} -o jsonpath='{.spec.template.spec.priorityClassName}' | yq e -P"
deploymentconfig.apps.openshift.io/apicast-production -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/apicast-staging -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/backend-cron -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/backend-listener -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/backend-redis -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/backend-worker -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-app -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-memcache -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-mysql -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/system-redis -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-searchd -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/system-sidekiq -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/zync -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/zync-database -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/zync-que -> priorityClassName: system-node-critical

$ date
Thu May  4 13:55:11 IDT 2023
$ oc whoami --show-server
https://api.vmogilev01.0nno.s1.devshift.org:6443

@valerymo valerymo force-pushed the vmo_THREESCALE-8997 branch from 04be67c to ec92435 on May 4, 2023 13:55
@qlty-cloud-legacy

Code Climate has analyzed commit ec92435 and detected 18 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 2
Duplication 16

View more on Code Climate.

@valerymo
Contributor Author

valerymo commented May 4, 2023

@eguzki
system-postgresql issue fixed

$ kubectl get deploymentconfigs -o name | xargs -I {} bash -c "echo -n '{} -> priorityClassName: '; kubectl get {} -o jsonpath='{.spec.template.spec.priorityClassName}' | yq e -P"
deploymentconfig.apps.openshift.io/apicast-production -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/apicast-staging -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/backend-cron -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/backend-listener -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/backend-redis -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/backend-worker -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-app -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-memcache -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-postgresql -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-redis -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-searchd -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/system-sidekiq -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/zync -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/zync-database -> priorityClassName: system-node-critical
deploymentconfig.apps.openshift.io/zync-que -> priorityClassName: system-node-critical

After changing the Priority Class:
$ oc describe dc system-postgresql |grep -i prio
Priority Class Name: openshift-user-critical

@laurafitzgerald

Hi guys. Sorry I'm late to the party here, but I wonder if there is any value in a single entry that could be applied to all pods, given that we likely won't differentiate between the 3scale components themselves in terms of priority and will consider its priority as a whole. wdyt?

Also, and I know it's just for example/verification, but I don't think we would recommend the use of system-node-critical as the PodPriority for 3scale. If the user is deploying 3scale themselves, they would likely use the openshift-user-critical Priority Class.

@valerymo
Contributor Author

valerymo commented May 4, 2023

system-node-critical

system-node-critical is just an example for testing; it's taken from the APIManager CR.
If the Priority Class Name is not defined in the CR, it's not set in the DC/Pod.
The default priority is 0. I've now removed the Priority Class definition from backend-cron, so it looks as follows:

$ oc describe pod backend-cron-2-hrfdw |grep -i priority
Priority:         0

@laurafitzgerald

Thanks Valery. I understand it's used as an example, but just noting that this PriorityClass is reserved for system-critical pods, so it shouldn't be used for user workloads/applications.

@valerymo
Contributor Author

valerymo commented May 4, 2023

Thanks Valery. I understand it's used as an example, but just noting that this PriorityClass is reserved for system-critical pods, so it shouldn't be used for user workloads/applications.

Thank you, Laura. Yes, I agree; it also seems to me a big risk to have an option that sets a default priority at the application level for all pods together, so the current approach seems safe to me.
But maybe I didn't understand you: do you mean that some 3scale components should NOT be allowed to change their PriorityClass? Sorry if I misunderstood.

Yes, I agree that system-node-critical is probably not a good testing example and is confusing. I can change it to openshift-user-critical in the Validation section, if required.

@valerymo
Contributor Author

valerymo commented May 4, 2023

The APIManager CR example in the Validation section has been updated; the Priority Class was changed to openshift-user-critical.
cc @laurafitzgerald, thank you for the comment.

$ kubectl get deploymentconfigs -o name | xargs -I {} bash -c "echo -n '{} -> priorityClassName: '; kubectl get {} -o jsonpath='{.spec.template.spec.priorityClassName}' | yq e -P"
deploymentconfig.apps.openshift.io/apicast-production -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/apicast-staging -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/backend-cron -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/backend-listener -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/backend-redis -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/backend-worker -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/system-app -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/system-memcache -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/system-postgresql -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/system-redis -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/system-searchd -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/system-sidekiq -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/zync -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/zync-database -> priorityClassName: openshift-user-critical
deploymentconfig.apps.openshift.io/zync-que -> priorityClassName: openshift-user-critical

@valerymo
Contributor Author

valerymo commented May 7, 2023

@laurafitzgerald, @eguzki, could you please tell me if any changes are required in the current solution?
I checked the eng-long-lived cluster; all DCs have priorityClassName: rhoam-pod-priority.
Please don't feel you need to reply to this message if it's outside your working hours. Thank you

Member

@eguzki eguzki left a comment


Good Job @valerymo 🎖️

@eguzki
Member

eguzki commented May 8, 2023

Do the honors and merge the PR @valerymo :)

@valerymo
Contributor Author

valerymo commented May 8, 2023

@eguzki , @laurafitzgerald , Thank you very much for your kind help and review!

@valerymo valerymo merged commit ede8838 into 3scale:master May 8, 2023