Fix ADL: System shutdown automatically when run Prime95&stress-ng with i9-12900K

Bug #2018236 reported by koba
This bug affects 1 person
Affects                       Status        Importance  Assigned to  Milestone
linux (Ubuntu)                Fix Released  Undecided   Unassigned
  Focal                       New           Undecided   Unassigned
  Jammy                       New           Undecided   Unassigned
linux-meta-hwe-5.15 (Ubuntu)  New           Undecided   Unassigned
  Focal                       New           Undecided   Unassigned
  Jammy                       Invalid       Undecided   Unassigned
thermald (Ubuntu)             Fix Released  Undecided   Unassigned
  Focal                       In Progress   High        koba
  Jammy                       Fix Released  Undecided   koba

Bug Description

[Description]
System shutdown automatically when stressing the machine.

[Fix]
The following two upstream thermald commits are needed to fix the issue.

cbdd92b) Parse idsp and trips
d385f20) Use PL1 max/min from PPCC when policies match
https://github.com/intel/thermal_daemon/commit/d385f20764e1e5477450405be71ec719adc973be

[Test Case]
1. Find a unit with an i9-12900K CPU and air cooling
2. Install tools
#sudo apt install stress-ng s-tui
#sudo systemctl stop thermald
#sudo thermald --no-daemon --loglevel=debug --adaptive --ignore-cpuid-check > thermald_log.txt &
#download prime95 linux version: p95v308b15.linux64.tar.gz and decompress
4. Stress test: (you may need to open multiple terminals for the test)
#./mprime
#sudo stress-ng -a 0 --class cpu,cpu-cache --ignite-cpu -v
5. monitor cpu temperature for 6 hours if you didn’t hit overheat and shutdown issue.
#sudo s-tui -c
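
For the monitoring step, a plain-text log can be easier to review after a multi-hour run than watching s-tui live. The following is a minimal sketch, not part of the original test plan. It assumes the kernel exposes thermal zones under /sys/class/thermal with temperatures in millidegrees Celsius; the function names and the default log file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the original test plan.

# Convert a sysfs temperature in millidegrees Celsius to whole degrees.
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# Print "<zone type> <temp in C>" for every readable thermal zone under
# the given base directory (defaults to the standard sysfs location).
read_zone_temps() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        # Some zones refuse reads; skip them instead of failing.
        temp=$(cat "$zone/temp" 2>/dev/null) || continue
        type=$(cat "$zone/type" 2>/dev/null || echo unknown)
        printf '%s %s\n' "$type" "$(millideg_to_c "$temp")"
    done
}

# Append a timestamped reading every INTERVAL seconds until interrupted.
log_temps() {
    interval="${1:-10}"
    logfile="${2:-cpu_temp_log.txt}"
    while :; do
        read_zone_temps | while read -r type temp; do
            printf '%s %s %s\n' "$(date '+%F %T')" "$type" "$temp"
        done >> "$logfile"
        sleep "$interval"
    done
}
```

After sourcing the file, something like `log_temps 10 cpu_temp_log.txt &` could be started before the stress tools. If the machine shuts down, the last lines of the log show the temperatures reached just before the shutdown.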

[Where problems could occur]
low

koba (kobako)
Changed in thermald (Ubuntu Jammy):
status: New → In Progress
assignee: nobody → koba (kobako)
summary: - Fix ADL: System shutdwon automically when run Prime95 with i9-12900K
+ Fix ADL: System shutdwon automically when run Prime95&stress-ng with
+ i9-12900K
description: updated
koba (kobako)
description: updated
description: updated
koba (kobako)
tags: added: originate-from-1982073
Chris Halse Rogers (raof) wrote : Re: Fix ADL: System shutdwon automically when run Prime95&stress-ng with i9-12900K

These patches are in 2.5, so fixed in Kinetic and above.

Changed in thermald (Ubuntu):
status: New → Fix Released
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello koba, or anyone else affected,

Accepted thermald into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/thermald/2.4.9-1ubuntu0.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in thermald (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
koba (kobako) wrote : Re: Fix ADL: System shutdwon automically when run Prime95&stress-ng with i9-12900K

Update 2023-07-28 15:16: re-ran the test for 7 hours and the system didn't reboot.

~~~
Verified with a 15-minute run; the system didn't reboot.
~~~
~$ cat /proc/cpuinfo | head
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 183
model name : 13th Gen Intel(R) Core(TM) i9-13900
stepping : 1
microcode : 0x10e
cpu MHz : 2000.000
cache size : 36864 KB
physical id : 0
$ uname -a
Linux x31-Precision-3260 6.1.0-1014-oem #14-Ubuntu SMP PREEMPT_DYNAMIC Fri May 19 06:02:46 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

$ sudo apt policy thermald
[sudo] password for x31:
thermald:
  Installed: 2.4.9-1ubuntu0.3
  Candidate: 2.4.9-1ubuntu0.3
  Version table:
 *** 2.4.9-1ubuntu0.3 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2.4.9-1ubuntu0.3 500
        500 https://ppa.launchpadcontent.net/kobako/exp-thermald/ubuntu jammy/main amd64 Packages
     2.4.9-1ubuntu0.2 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
     2.4.9-1 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy/main amd64 Packages

~~~
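
When recording a verification like the one above, the installed version is the key detail to report. The following is a small sketch for pulling it out of `apt policy` output. The helper name is hypothetical, and it assumes the English "Installed:" label shown in the output above.

```shell
# Hypothetical helper: print the version found on the "Installed:" line
# of `apt policy <package>` output read from stdin.
installed_version() {
    awk '$1 == "Installed:" { print $2; exit }'
}

# Example use (commented out so the file can be sourced safely):
# apt policy thermald 2>/dev/null | installed_version
```

On Debian-based systems, `dpkg-query -W -f='${Version}\n' thermald` reports the same information without any parsing. The awk variant is shown only because it works directly on output pasted into a bug comment.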

tags: added: verification-done verification-done-jammy
removed: verification-needed verification-needed-jammy
Andreas Hasenack (ahasenack) wrote :

The test plan calls for a 6h run:
"""
5. monitor cpu temperature for 6 hours if you didn’t hit overheat and shutdown issue.
"""

Yet the verification says it ran for only 15min?

Is the original test plan wrong, or why is a 15min run ok?

tags: added: verification-needed-jammy
removed: verification-done-jammy
koba (kobako) wrote :

@Andreas, in my experience, if the issue occurs, the system shuts down in less than 10 minutes.
So is it necessary to run for 6 hours?

Andreas Hasenack (ahasenack) wrote :

I'm just noticing the discrepancy between the test plan, and the test execution. If you are confident that a 10min run reproduces the problem, then please update the test plan accordingly, because it currently says 6h.

koba (kobako) wrote :

@Andreas, I re-ran the test case for 7 hours and the system didn't reboot.
stress-ng and mprime were run simultaneously.
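
A fixed-length simultaneous run such as this one can be scripted so that the duration matches the test plan exactly. The following is a rough sketch, assuming coreutils `timeout` is available; the helper name `run_both_for` is hypothetical and was not used in the verification reported here.

```shell
# Hypothetical helper: run two shell commands concurrently, each capped
# at the same duration by coreutils `timeout`, and wait for both.
run_both_for() {
    duration="$1"
    cmd_a="$2"
    cmd_b="$3"

    timeout "$duration" sh -c "$cmd_a" &
    pid_a=$!
    timeout "$duration" sh -c "$cmd_b" &
    pid_b=$!

    # `timeout` exits non-zero when it has to stop a command, which is
    # the expected outcome for a stress run, so ignore the statuses.
    wait "$pid_a" || true
    wait "$pid_b" || true
}
```

For example, `run_both_for 7h './mprime -t' 'stress-ng -a 0 --class cpu,cpu-cache --ignite-cpu -v'` run as root would mirror the test case. The `-t` option is assumed here to start mprime's torture test without the interactive menu; check the mprime documentation before relying on it.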

tags: added: verification-done-jammy
removed: verification-needed-jammy
summary: - Fix ADL: System shutdwon automically when run Prime95&stress-ng with
+ Fix ADL: System shutdown automically when run Prime95&stress-ng with
i9-12900K
summary: - Fix ADL: System shutdown automically when run Prime95&stress-ng with
+ Fix ADL: System shutdown automatically when run Prime95&stress-ng with
i9-12900K
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for thermald has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package thermald - 2.4.9-1ubuntu0.3

---------------
thermald (2.4.9-1ubuntu0.3) jammy; urgency=medium

  * Cherry-pick following fixes from thermald 2.5.1 and 2.5.2 (LP: #1995606)
  * debian/patches/0013-Add-AlderLake-N.patch
    - Add support for Alder Lake N (LP: #2012260)
  * debian/patches/0007-Add-INT3400-base-path-for-Raptor-Lake.patch
    - Fix RPL: Add INT3400 base path(LP: #1989044)
  * debian/patches/0014-Process-ITMT-v2.patch
    - Support ITMTv2 for Raptor Lake (LP: #2007579)
  * debian/patches/0008-Install-passive-default.patch
    - Fix throttled GPU (LP: #1981087)
  * debian/patches/0012-Always-match-motion-0.patch
    - Fix in-motion function doesn't work (LP: #2018275)
  * debian/patches/0003-Parse-ITMT-Table.patch
  * debian/patches/0004-Add-capability-for-min-max-per-trip.patch
  * debian/patches/0005-Install-ITMT_target.patch
  * debian/patches/0006-Use-per-trip-min-max.patch
  * debian/patches/0009-Parse-idsp-and-trips.patch
  * debian/patches/0010-use-PL1-max-min-from-PPCC-when-policies-match.patch
  * debian/patches/0011-Parse-GDDV-before-thd_engine-init.patch
    - Fix i9-12900k shutdown when run Prime95 and stress-ng (LP: #2018236)

 -- Koba Ko <email address hidden> Wed, 05 Jul 2023 13:37:32 +0200

Changed in thermald (Ubuntu Jammy):
status: Fix Committed → Fix Released
Changed in thermald (Ubuntu Focal):
assignee: nobody → koba (kobako)
importance: Undecided → High
Changed in thermald (Ubuntu Focal):
status: New → In Progress
jeremyszu (os369510)
Changed in linux-meta-hwe-5.15 (Ubuntu Jammy):
status: New → Fix Released
status: Fix Released → Invalid
Changed in linux (Ubuntu):
status: New → Fix Released