#high CPU due to background procs - fail over edition.

languid patio
#

Hi!

So if I've got an HA pair sitting at 80% CPU per node, with all of the contention listed as background processes, how does this affect failovers?

teal knoll
#

Please open a support case for performance troubleshooting. This might be correctable after a performance evaluation/investigation.

languid patio
#

Already did, mostly asking speculative questions at this rate.

#

It's sitting with the perf team already, it's (not)fun

valid cloud
#

Oh hi.

#

@languid patio I forgot you were on Discord. 🙂

tender carbon
#

80% CPU usage is not in itself a primary performance metric, meaning it does not matter how high the CPU usage is unless you have actual impacts on functionality or performance somewhere. How many CPUs are at 80%? Did you only look at sysstat -x, or have you also checked sysstat -m? What domain is impacted the most (sysstat -M)?

queen sluice
#

Jesse, what is the priority on the case, and what business impact is described in it?

#

progress on cases depends on the priority and business impact description

#

of course, it is much easier to get the cases escalated to a duty manager

#

At a high level, something to look at

#

when did the CPU usage go up to 80%

#

has CPU usage stayed consistent at 80% or does it fluctuate?

#

if it does, is there a pattern

#

takeover process is speculative with 80% CPU usage per node

languid patio
#

Sat with a P2 case with NA support for a day or two (my management was very uneasy with the consumption: children's hospital, and 1130 of 1200 shares are on these A400s).
It was identified as background processing, and I was given an in-depth demonstration of what was eating it.

#

It's been slowly lowering for the last three weeks (80% per node all day every day with spikes to prolonged 100%) to today 55% sustained.

languid patio
# valid cloud Oh hi.

Paul my dude.
I wasn't trying to go back around people lol, mostly crowd sourcing information as well for consensus 🙂

As I said with others here, my people got very uneasy with post-cut sitting at high consumption day 1.

queen sluice
#

I guess at this point the thing to do is observe, given that it is down to 55%. I assume it is down to 55% on both nodes?

languid patio
#

50-60% per node.
It's been slowly lowering.

Reigning hypothesis is that we threw the fire hose at it for data on ingress (new vols wholesale migrated in with datadobi).
The rate at which it ate and tiered data means that a whole lot of data wasn't able to deduplicate/compress on ingress.

#

It will either come down or settle, and with most of it being background // WAFL_EXEMPT it will become a non-issue if more heinous workloads need the CPU.
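
To put rough numbers on the failover question, here is a back-of-envelope sketch of the reasoning above. It is not how ONTAP schedules work: it assumes CPU demand simply adds when one node takes over its partner, and that background work can be fully deferred under contention. The function name `takeover_load` and the `background_fraction` value of 0.7 are hypothetical; the thread only says "most" of the load is background, so substitute whatever the perf team measured.

```python
def takeover_load(node_a_pct, node_b_pct, background_fraction):
    """Estimate CPU demand on the surviving node after a takeover.

    Simplifying assumptions (not taken from the thread):
      * demand from both nodes adds linearly on the survivor
      * background work can be deferred entirely when foreground needs CPU
    Returns (combined demand, foreground-only demand) as percentages of
    one node's capacity.
    """
    combined = node_a_pct + node_b_pct
    foreground = combined * (1.0 - background_fraction)
    return combined, foreground


# Per-node figures quoted in the thread: 80% three weeks ago, about 55% today.
# background_fraction=0.7 is a placeholder, not a measured value.
for label, per_node in [("three weeks ago", 80.0), ("today", 55.0)]:
    combined, foreground = takeover_load(per_node, per_node, background_fraction=0.7)
    print(f"{label}: combined {combined:.0f}%, foreground-only {foreground:.0f}%")
```

Under these assumptions the combined figure exceeds 100% in both cases, so what matters in a takeover is the foreground share, which is why the split between background and client-facing work is the number to ask the perf team for.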

valid cloud
#

I'm actually helping on the case so it's all good.

valid cloud
#

Jesse did you have time today to go over anything?

languid patio
#

Yes, replied back to Greg n co.
4 PM EST works for me; tomorrow AM is a bit easier, but I'm flexible if you guys are.

valid cloud
#

Ok cool. I've got several morning meetings so it's hard to schedule usually.