stack stressor fails without any indication why on AMD Milan system
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Stress-ng | Triaged | Low | Colin Ian King | | |
Bug Description
The system is a 224-thread (112-core) AMD Milan server. I'm running the stress-ng memory test from the cert suite, and the stack stressor fails without any message explaining why.
2x AMD EPYC 7663 56-Core Processor
$ sudo stress-ng --timeout 60 -v --aggressive stack
stress-ng: debug: [814037] 224 processors online, 224 processors configured
stress-ng: error: [814037] No stress workers invoked
May 7 05:55:00 jamano stress-ng: invoked with 'stress-n' by user 0
May 7 05:55:00 jamano stress-ng: system: 'jamano' Linux 5.4.0-72-generic #80-Ubuntu SMP Mon Apr 12 17:35:00 UTC 2021 x86_64
May 7 05:55:00 jamano stress-ng: memory (MB): total 32038.84, free 26006.32, shared 51.52, buffer 111.03, swap 8192.00, free swap 6834.61
The syslog lines above were captured by tailing syslog while running the stack stressor.
This stressor runs fine on AMD Rome and Intel systems.
Changed in stress-ng:
assignee: nobody → Colin Ian King (colin-king)
status: New → In Progress
status: In Progress → Triaged
This seems to be tied to the amount of RAM per core.
In the example above, the system has 112 cores / 224 threads and only 32 GB of RAM.
OEM partners are also seeing failures on Milan systems, which are likely also due to the high core/thread count per GB of RAM.
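To put that ratio in numbers, here is a rough sketch of the RAM available per worker, assuming one stack worker is started per online CPU. The helper name mb_per_cpu is hypothetical and not part of stress-ng; the inputs are the total and free memory from the syslog line and the 224 online processors from the debug output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of stress-ng): MB of RAM per worker,
# assuming one worker is started per online CPU.
mb_per_cpu() {
    # $1 = memory in MB, $2 = number of online CPUs
    awk -v mem="$1" -v cpus="$2" 'BEGIN { printf "%.1f\n", mem / cpus }'
}

# Total memory from the syslog line, 224 online processors.
mb_per_cpu 32038.84 224    # prints 143.0

# Free memory from the same syslog line.
mb_per_cpu 26006.32 224    # prints 116.1
```

On a live Linux system the same figure can be obtained with mb_per_cpu "$(awk '/MemTotal/ {print $2/1024}' /proc/meminfo)" "$(nproc)". Whether roughly 116 to 143 MB per worker is actually below what the stack stressor needs is not established in this report; the numbers only illustrate how thin the per-thread memory is on this machine.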
Is there a suggested minimum amount of RAM for stress-ng, now that we're seeing CPUs with very high core counts?