#high container_cpu_cfs_periods_total after upgrade to trident 24.06
I am not aware of 24.06.0 or any other Trident version causing a CPU issue. Here are some general suggestions:
- Is it occurring on all nodes and for all Trident pods? kubectl get pods -n trident -o wide will show which node each pod runs on.
- Check the kubectl describe output of the Trident pods for CPU limits, and check the node system logs. (Any zombie processes?) This could require a node reboot.
- Check the Trident logs for any transactions or repeating requests that are not able to complete, or for 'context deadline' errors. Check the kube-system pods for the same errors and for their health.
If you need further help, please open a NetApp support case, and check with your Kubernetes orchestrator vendor and your Linux OS vendor. Hope this helps.
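The log check suggested above could be scripted roughly as follows. This is only a minimal sketch: the function name count_matching is made up for illustration, and it assumes the errors appear as plain text containing "context deadline" somewhere on the line; it does not assume any particular Trident log format.

```python
def count_matching(lines, needle="context deadline"):
    """Return (number of matching lines, list of matching lines).

    `lines` is any iterable of strings, for example sys.stdin.
    The match is a case-insensitive substring test; `needle` is an
    assumption about how the error text appears in the log.
    """
    needle = needle.lower()
    hits = [line.strip() for line in lines if needle in line.lower()]
    return len(hits), hits
```

One way to use it would be to pipe the output of kubectl logs -n trident <pod-name> into a small script that calls count_matching(sys.stdin) and prints the count; a count that keeps growing between runs would point at requests that never complete.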
It affects only the operator pod. 23.04 didn't have CPU limits set for trident-operator:
https://github.com/NetApp/trident/blob/08aa639d05fbeb823cf282dc43f66d89137f2eb5/helm/trident-operator/templates/deployment.yaml#L61C5-L61C6
24.06 has it:
https://github.com/NetApp/trident/blob/cb68cb389d9dea65eff2d2509b4784827bfbfe5c/helm/trident-operator/templates/deployment.yaml#L88
After some tweaking and, I think, an operator pod restart, the container_cpu_cfs_periods_total metrics went down for .slice, but for some reason they are still high for .slice:cri-containerd
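A note on reading this metric, as far as I understand it: container_cpu_cfs_periods_total is a cAdvisor counter of elapsed CFS enforcement periods, which only applies once a CPU limit is set, so a rising value on its own may simply reflect the limit introduced in 24.06. The companion counter container_cpu_cfs_throttled_periods_total is usually a better indicator of actual throttling. A minimal sketch of the ratio between the two, assuming you have taken two samples of each counter for the same container (the function name and parameters are made up for illustration):

```python
def throttled_fraction(periods_start, periods_end, throttled_start, throttled_end):
    """Fraction of CFS periods in the sample window that were throttled.

    periods_*   : two samples of container_cpu_cfs_periods_total
    throttled_* : two samples of container_cpu_cfs_throttled_periods_total
    """
    elapsed = periods_end - periods_start
    if elapsed <= 0:
        # No periods elapsed in the window, or the counter was reset
        # (for example after a container restart).
        return 0.0
    return (throttled_end - throttled_start) / elapsed
```

A value near 0 would suggest the limit is rarely hit even though the periods counter keeps climbing; a value near 1 would suggest the container is throttled in most periods and the limit may be too tight.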