Symptom: All CPUs are locked at 1500 MHz (the minimum P-state) and never boost, even under sustained load. This makes compute-intensive workloads 2-4x slower than expected.
Evidence from within the container:
# Governor is schedutil, should ramp up under load — but doesn't
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
schedutil
# Driver is legacy acpi-cpufreq (not amd-pstate)
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
acpi-cpufreq
# Only 3 P-states available
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
3300000 2400000 1500000
# Stuck at minimum even under sustained CPU load
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
1500000 # ALL cores report 1500000, even while running compute workloads
# Limits look correct - max is 3300 MHz, boost is enabled
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
3300000
$ cat /sys/devices/system/cpu/cpufreq/boost
1
$ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
3300000
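To back up the "ALL cores" observation with a single command, the per-CPU readings can be summarized as a histogram. This is a minimal sketch: the function name `freq_histogram` and its optional root-directory argument are illustrative assumptions, not part of any existing tool. It reads only the standard cpufreq sysfs files shown above.

```shell
# Summarize scaling_cur_freq across all CPUs as "<kHz> kHz: <count> CPUs".
# The optional first argument is a hypothetical override of the sysfs root,
# useful for pointing the function at a copied or test directory tree.
freq_histogram() {
    root="${1:-/sys/devices/system/cpu}"
    # cpu[0-9]* matches cpu0, cpu1, ... but not the cpufreq/ or cpuidle/ dirs
    cat "$root"/cpu[0-9]*/cpufreq/scaling_cur_freq 2>/dev/null |
        sort -n | uniq -c |
        awk '{ printf "%s kHz: %s CPUs\n", $2, $1 }'
}
```

Calling `freq_histogram` with no argument while a compute workload is running should show a single line for 1500000 kHz if every core is stuck at the minimum P-state.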
Hardware: AMD EPYC 9575F (Zen 5 Turin), 128 threads visible. It should reach at least its 3.3 GHz base clock, or higher with boost.
Root cause hypothesis: The schedutil governor on the host isn't receiving proper CPU utilization feedback for this VM/container, so it never ramps above the minimum P-state. Alternatively, the host is using acpi-cpufreq instead of amd-pstate or amd-pstate-epp, which are the recommended drivers for Zen 4+ CPUs and handle frequency scaling much better.
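To help tell the two hypotheses apart, it may be useful to record which driver is active alongside any `amd_pstate=` kernel parameter. The sketch below assumes it is run where the relevant kernel's `/proc/cmdline` is visible (inside a VM this reflects the guest kernel, not the host). The function name `driver_report` and both override arguments are hypothetical.

```shell
# Print the active cpufreq driver and the amd_pstate= boot parameter, if any.
# Both arguments are hypothetical overrides; the defaults are the live paths.
driver_report() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    cmdline_file="${2:-/proc/cmdline}"
    driver=$(cat "$cpu_root/cpu0/cpufreq/scaling_driver" 2>/dev/null)
    # Split the command line into one token per line and keep the last
    # amd_pstate=<mode> value, if one is present.
    mode=$(tr ' ' '\n' < "$cmdline_file" | sed -n 's/^amd_pstate=//p' | tail -n 1)
    echo "driver=${driver:-unknown} amd_pstate=${mode:-unset}"
}
```

On the affected system this would be expected to report `acpi-cpufreq` with no `amd_pstate` parameter set, which is consistent with the second hypothesis but does not rule out the first.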
Comparison with a working machine (using the amd-pstate-epp driver, correct for Zen 4): boosts to 5342 MHz under load, with everything working as expected.
Requested fix (any of these):
- Switch host CPU frequency driver to amd-pstate-epp (kernel param amd_pstate=active)
- Switch the governor to performance on the host:
echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
- If neither is possible, at least ensure schedutil is responding to load (may require a kernel update or BIOS CPPC enablement)
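Whichever fix is applied, a quick check from inside the container can confirm whether any core now leaves the minimum P-state under load. This is a rough sketch rather than a definitive test: the function name `any_cpu_above_min` and its optional root argument are assumptions made for illustration, and it compares `scaling_cur_freq` against the standard `cpuinfo_min_freq` file.

```shell
# Succeed (exit 0) if at least one CPU currently runs above its minimum
# frequency; otherwise print a message and return 1.
# The optional first argument is a hypothetical override of the sysfs root.
any_cpu_above_min() {
    root="${1:-/sys/devices/system/cpu}"
    for d in "$root"/cpu[0-9]*/cpufreq; do
        [ -r "$d/scaling_cur_freq" ] || continue
        cur=$(cat "$d/scaling_cur_freq")
        min=$(cat "$d/cpuinfo_min_freq")
        if [ "$cur" -gt "$min" ]; then
            echo "${d%/cpufreq}: $cur kHz (min $min kHz)"
            return 0
        fi
    done
    echo "no CPU above its minimum frequency"
    return 1
}
```

Run it while a compute workload is active. A non-zero exit status after the change suggests the cores are still pinned at 1500000 kHz.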