stack stressor fails without any indication why on AMD Milan system

Bug #1927681 reported by Jeff Lane 
Affects: Stress-ng
Status: Triaged
Importance: Low
Assigned to: Colin Ian King

Bug Description

System is an AMD Milan server with 224 hardware threads. I'm running the stress-ng memory test from the cert suite and finding that the stack stressor fails without reporting a reason.

2x AMD EPYC 7663 56-Core Processor

$ sudo stress-ng --timeout 60 -v --aggressive stack
stress-ng: debug: [814037] 224 processors online, 224 processors configured
stress-ng: error: [814037] No stress workers invoked
May 7 05:55:00 jamano stress-ng: invoked with 'stress-n' by user 0
May 7 05:55:00 jamano stress-ng: system: 'jamano' Linux 5.4.0-72-generic #80-Ubuntu SMP Mon Apr 12 17:35:00 UTC 2021 x86_64
May 7 05:55:00 jamano stress-ng: memory (MB): total 32038.84, free 26006.32, shared 51.52, buffer 111.03, swap 8192.00, free swap 6834.61

The output above was captured by running the stack stressor while tailing syslog.

This stressor runs fine on AMD Rome and Intel systems.

Jeff Lane  (bladernr) wrote :

This seems to be tied, at least in part, to the amount of RAM per core.

In the example above, the system has 112C/224T and only 32GB of RAM.

OEM partners are also seeing failures on Milan systems, likely for the same reason: a high core/thread count relative to the amount of RAM.

Is there a suggested minimum amount of RAM for stress-ng, now that we're seeing CPUs with very high core counts?
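To put a rough number on the RAM-per-thread ratio, here is a minimal sketch using the total from the syslog line above (about 32038 MB) and 224 threads. The function name mb_per_thread is hypothetical and not part of stress-ng; it only does integer division.

```shell
# Hypothetical helper: whole MB of RAM per worker, assuming one worker
# per hardware thread. Arguments: total RAM in MB, thread count.
mb_per_thread() {
    total_mb=$1
    threads=$2
    echo $(( total_mb / threads ))
}

# Values taken from the report: 32038 MB total, 224 threads.
mb_per_thread 32038 224   # prints 143
```

That works out to roughly 143 MB per thread before the OS and other processes take their share.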

Jeff Lane  (bladernr) wrote :

Got some suggestions from Colin that I'm going to play around with and see if I can improve the test script.

Colin Ian King (colin-king) wrote :

For new releases of stress-ng, try using --oom-avoid to avoid OOM'ing on low-memory systems.
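A hedged sketch of how the run might look with this option added. It assumes a stress-ng build recent enough to accept --oom-avoid. The `--stack 0` form (which, as I understand the manual, starts one stack worker per processor) is an assumption and is not the command from the report. The wrapper name run_stack_stressor is hypothetical.

```shell
# Hypothetical wrapper: run the stack stressor with OOM avoidance enabled.
run_stack_stressor() {
    # Bail out early if stress-ng is not on PATH.
    if ! command -v stress-ng >/dev/null 2>&1; then
        echo "stress-ng not installed" >&2
        return 127
    fi
    # --stack 0 is an assumed way to select the stack stressor;
    # --oom-avoid needs a newer stress-ng release.
    sudo stress-ng --stack 0 --oom-avoid --aggressive --timeout 60 -v
}
```

Calling run_stack_stressor while tailing syslog, as in the original report, should show whether any workers are started on the low-memory system.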

Changed in stress-ng:
importance: Undecided → Low
Changed in stress-ng:
assignee: nobody → Colin Ian King (colin-king)
status: New → In Progress
status: In Progress → Triaged