S3 stress issue for amdgpu Navi 31/Navi33

Bug #2024427 reported by You-Sheng Yang
This bug affects 2 people
Affects                    Status         Importance  Assigned to     Milestone
HWE Next                   New            Undecided   Unassigned
linux (Ubuntu)             Status tracked in Mantic
  Jammy                    Invalid        Undecided   Unassigned
  Lunar                    Fix Committed  High        You-Sheng Yang
  Mantic                   Triaged        Undecided   Unassigned
linux-firmware (Ubuntu)    Status tracked in Mantic
  Jammy                    Fix Released   High        You-Sheng Yang
  Lunar                    Fix Released   High        You-Sheng Yang
  Mantic                   Fix Released   Undecided   Unassigned
linux-oem-6.1 (Ubuntu)     Status tracked in Mantic
  Jammy                    Fix Released   Undecided   Unassigned
  Lunar                    Invalid        Undecided   Unassigned
  Mantic                   Invalid        Undecided   Unassigned

Bug Description

[SRU Justification]

BugLink: https://bugs.launchpad.net/bugs/2024427

[Impact]

Under stress testing it was reported that the amdgpu Navi31/Navi33 platforms
will sometimes fail to wake from S3.

[Fix]

kernel patches:
ac2f5739fdca drm/amdgpu/mes11: enable reg active poll
a2fe4534bb38 drm/amd/amdgpu: update mes11 api def
da9a8dc33da2 drm/amdgpu: reserve the old gc_11_0_*_mes.bin
616843d5a11b drm/amd/amdgpu: introduce gc_*_mes_2.bin v2
09bf14907d86 drm/amdgpu: declare firmware for new MES 11.0.4

firmware patches:
* Navi31: ffe1a41e2ddb amdgpu: update GC 11.0.0 firmware for amd.5.5 release
* Navi33: a5d7b4df1a76 amdgpu: update GC 11.0.2 firmware for amd.5.5 release

[Test Case]

$ checkbox-cli run com.canonical.certification::stress-suspend-30-cycles-with-reboots-automated
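
Where checkbox is not available, a rough manual approximation of repeated suspend/resume could be sketched with rtcwake. This is not the certification job itself: the helper name suspend_cycles, the 30-second wake timer and the settle delay are assumptions for illustration only, and the real job also includes reboots.

```shell
# Hypothetical helper: suspend to RAM and auto-wake COUNT times.
# Args: COUNT (default 30), WAKE_AFTER seconds (default 30),
#       SETTLE seconds between cycles (default 10).
# RTCWAKE can be overridden (e.g. RTCWAKE=echo) for a dry run.
suspend_cycles() {
    count="${1:-30}"
    wake_after="${2:-30}"
    settle="${3:-10}"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "cycle $i/$count"
        # -m mem requests S3 (suspend-to-RAM); -s arms the RTC wake alarm.
        ${RTCWAKE:-rtcwake} -m mem -s "$wake_after" || return 1
        sleep "$settle"
        i=$((i + 1))
    done
}

# Example (needs root, will really suspend the machine):
#   suspend_cycles 30 30
```

If the machine fails to resume, the loop simply never reaches the next "cycle" line, which is the symptom described in [Impact].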

[Where problems could occur]

Little is known about the firmware fixes themselves. However, once these commits
are pulled in via stable kernel updates, the driver begins to request new
firmware blobs under a different filename.

[Other Info]

The kernel driver commits are in v6.4-rc1, backported to v6.3.4, v6.1.31, and
partially (missing da9a8dc33da2, 616843d5a11b) v6.2.16. Only linux/lunar has to
be fixed.

For the firmware parts, they have been included in linux-firmware/mantic,
leaving linux-firmware/lunar and linux-firmware/jammy to be fixed.

========== original bug report ==========

An amdgpu update is needed to fix a potential Navi31/Navi33 S3 issue.

amdgpu:
ac2f5739fdca drm/amdgpu/mes11: enable reg active poll
a2fe4534bb38 drm/amd/amdgpu: update mes11 api def
da9a8dc33da2 drm/amdgpu: reserve the old gc_11_0_*_mes.bin
616843d5a11b drm/amd/amdgpu: introduce gc_*_mes_2.bin v2
09bf14907d86 drm/amdgpu: declare firmware for new MES 11.0.4

Navi31:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ffe1a41e2ddbc39109b12d95dcac282d90eba8fc
Navi33:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=a5d7b4df1a76f82e2ecb725cc1b56ce111830bac

You-Sheng Yang (vicamo)
tags: added: oem-priority originate-from-2024123 somerville
You-Sheng Yang (vicamo)
Changed in linux-firmware (Ubuntu Mantic):
status: New → Invalid
You-Sheng Yang (vicamo) wrote :

All the nominated commits are in v6.4-rc1, backported to v6.3.4, v6.1.31, and partially (missing da9a8dc33da2, 616843d5a11b) v6.2.16.

Changed in linux-oem-6.1 (Ubuntu Lunar):
status: New → Invalid
Changed in linux-oem-6.1 (Ubuntu Mantic):
status: New → Invalid
You-Sheng Yang (vicamo) wrote :

Need details (symptoms, reproduce steps)

Changed in linux-oem-6.1 (Ubuntu Jammy):
status: New → Incomplete
Changed in linux (Ubuntu Mantic):
status: New → Invalid
Changed in linux-firmware (Ubuntu Mantic):
status: Invalid → New
Timo Aaltonen (tjaalton) wrote :

-1016 has 6.1.31

Changed in linux-oem-6.1 (Ubuntu Jammy):
status: Incomplete → Fix Committed
Juerg Haefliger (juergh)
tags: added: kern-7207
Mario Limonciello (superm1) wrote :

mantic has linux-firmware 20230629.gitee91452d-0ubuntu1, closing that task.

Changed in linux-firmware (Ubuntu Mantic):
status: New → Fix Released
Changed in linux (Ubuntu Jammy):
status: New → Invalid
Changed in linux (Ubuntu Lunar):
status: New → Triaged
Changed in linux (Ubuntu Mantic):
status: Invalid → Triaged
You-Sheng Yang (vicamo) wrote :

In oem-6.1 6.1.0-1016.16 through stable updates bug 2021945.

Changed in linux-oem-6.1 (Ubuntu Jammy):
status: Fix Committed → Fix Released
You-Sheng Yang (vicamo) wrote :

@superm1, do you have the symptoms, reproducing steps? We'll need them to justify the need to SRU firmware changes.

Mario Limonciello (superm1) wrote :

Under stress testing it was reported that the system will sometimes fail to wake from S3.

This firmware version has been run heavily under Q/A on various 6.1.y kernel and those issues don't occur after taking this runtime firmware update.

You-Sheng Yang (vicamo)
Changed in linux-firmware (Ubuntu Jammy):
assignee: nobody → You-Sheng Yang (vicamo)
importance: Undecided → High
status: New → In Progress
Changed in linux-firmware (Ubuntu Lunar):
assignee: nobody → You-Sheng Yang (vicamo)
importance: Undecided → High
status: New → In Progress
Changed in linux (Ubuntu Lunar):
assignee: nobody → You-Sheng Yang (vicamo)
importance: Undecided → High
status: Triaged → In Progress
You-Sheng Yang (vicamo) wrote :
description: updated
Juerg Haefliger (juergh) wrote :

I'm accepting the PRs but please note that the emails again contain a PR *and* a git (binary) diff. The diff is useless and fills up inboxes and the mailing list. It might even trigger the message-too-big mailing list filter. I've asked in the past to please not do that. I'm saying it again: Please don't do that or I have to start rejecting PRs.

Changed in linux-firmware (Ubuntu Jammy):
status: In Progress → Fix Committed
Changed in linux-firmware (Ubuntu Lunar):
status: In Progress → Fix Committed
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello You-Sheng, or anyone else affected,

Accepted linux-firmware into lunar-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20230323.gitbcdcfbcf-0ubuntu1.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-lunar to verification-done-lunar. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-lunar. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Timo Aaltonen (tjaalton) wrote : Re: potential S3 issue for amdgpu Navi 31/Navi33

the test case should include testing on 6.2 without the patches

Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello You-Sheng, or anyone else affected,

Accepted linux-firmware into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20220329.git681281e4-0ubuntu3.15 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

You-Sheng Yang (vicamo) wrote : Re: potential S3 issue for amdgpu Navi 31/Navi33
You-Sheng Yang (vicamo) wrote :

@juergh, ok. I deliberately added that `-p` argument to `git request-pull` in the past in order to give more details before one decides to fetch it down. I've removed it from my local scripts.

Renjith Pananchikkal (renjith-pananchikkal) wrote :

Firmware file "gc_11_0_1_mes_2.bin" is missing

$ uname -r
6.1.0-1016-oem

$ sudo dpkg -l | grep linux-firmware
ii linux-firmware 20220329.git681281e4-0ubuntu3.15 all Firmware for Linux kernel drivers

$ sudo dmesg | grep amdgpu | grep -i error
[ 4.151591] amdgpu 0000:c3:00.0: Direct firmware load for amdgpu/gc_11_0_1_mes_2.bin failed with error -2

$ ls -1 /lib/firmware/amdgpu/gc_11_0_1_mes*
/lib/firmware/amdgpu/gc_11_0_1_mes1.bin
/lib/firmware/amdgpu/gc_11_0_1_mes.bin
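
The manual check above can be generalized into a small sketch that compares firmware names against what is installed. The helper name missing_firmware is hypothetical, and accepting a ".zst" suffix is an assumption to cover releases that ship compressed firmware; feeding it from `modinfo -F firmware amdgpu` is one possible source of names.

```shell
# Hypothetical helper: read firmware names (one per line) from stdin and
# print those not present under the firmware root ($1, default /lib/firmware).
missing_firmware() {
    root="${1:-/lib/firmware}"
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        # Accept either a plain blob or a zstd-compressed one.
        if [ ! -e "$root/$name" ] && [ ! -e "$root/$name.zst" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example: list GC 11.x blobs the installed amdgpu module declares
# but that are absent from /lib/firmware:
#   modinfo -F firmware amdgpu | grep 'gc_11_0_' | missing_firmware
```

Note that the module declares firmware for many ASICs, so a name reported here only matters if the GPU in the machine actually requests it.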

Mario Limonciello (superm1) wrote :

Just to clarify Renjith tested on a Phoenix machine.
Based on Renjith's test result though, this is a fail and an extra change is needed.

It turns out that this linux-firmware commit should come too:

1c513ec7 ("amdgpu: Update GC 11.0.1 and 11.0.4")

This is because of kernel commit

616843d5a11b drm/amd/amdgpu: introduce gc_*_mes_2.bin v2

This actually changes for all RDNA3 GPUs, so Phoenix needs the updated F/W too.

tags: added: verification-failed-jammy
You-Sheng Yang (vicamo) wrote :

See follow-up in bug 2027959.

Robie Basak (racb) wrote : Please test proposed package

Hello You-Sheng, or anyone else affected,

Accepted linux-firmware into lunar-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20230323.gitbcdcfbcf-0ubuntu1.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-lunar to verification-done-lunar. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-lunar. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Robie Basak (racb) wrote :

Hello You-Sheng, or anyone else affected,

Accepted linux-firmware into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20220329.git681281e4-0ubuntu3.16 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

tags: added: verification-done-jammy
removed: verification-failed-jammy
Timo Aaltonen (tjaalton) wrote : Re: potential S3 issue for amdgpu Navi 31/Navi33

lunar needs to be verified as well

You-Sheng Yang (vicamo) wrote :

verified linux-firmware/lunar version 20230323.gitbcdcfbcf-0ubuntu1.4.

tags: added: verification-done-lunar
Timo Aaltonen (tjaalton) wrote : Update Released

The verification of the Stable Release Update for linux-firmware has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote : Re: potential S3 issue for amdgpu Navi 31/Navi33

This bug was fixed in the package linux-firmware - 20220329.git681281e4-0ubuntu3.16

---------------
linux-firmware (20220329.git681281e4-0ubuntu3.16) jammy; urgency=medium

  * Follow-up: potential S3 issue for amdgpu Navi 31/Navi33 (LP: #2027959)
    - amdgpu: update GC 11.0.1 firmware for amd.5.5 release
    - amdgpu: update GC 11.0.4 firmware for amd.5.5 release
    - amdgpu: Update GC 11.0.1 and 11.0.4
  * Add firmware files for HP G10 series laptops (LP: #2023193)
    - cirrus: Add firmware and tuning files for HP G10 series laptops

linux-firmware (20220329.git681281e4-0ubuntu3.15) jammy; urgency=medium

  * upgrade iwlwifi firmware of FW API 72 for WiFi 6E support in Malaysia and Morocco (LP: #2020627)
    - iwlwifi: add new FWs from core72-129 release
    - iwlwifi: add new PNVM binaries from core74-44 release
    - iwlwifi: add new FWs from core74_pv-60 release
    - iwlwifi: add new FWs from core75-47 release
    - iwlwifi: add new FWs from core76-35 release
    - iwlwifi: update core69 and core72 firmwares for Ty device
    - iwlwifi: update core69 and core72 firmwares for So device
  * i915: Add DMC/GuC/HuC firmware for Meteor Lake (LP: #2026253)
    - i915: Add DMC v2.11 for MTL
    - i915: Update MTL DMC to v2.12
    - i915: Add GuC v70.6.6 for MTL
    - i915: Add HuC v8.5.0 for MTL
  * AMD Rembrandt / Phoenix PSR-SU related freezes (LP: #2024774)
    - SAUCE: DMCUB updates for DCN314 and Yellow Carp
  * potential S3 issue for amdgpu Navi 31/Navi33 (LP: #2024427)
    - amdgpu: update GC 11.0.0 firmware for amd.5.5 release
    - amdgpu: update GC 11.0.2 firmware for amd.5.5 release

 -- Juerg Haefliger <email address hidden> Wed, 19 Jul 2023 10:37:52 +0200

Changed in linux-firmware (Ubuntu Jammy):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-firmware - 20230323.gitbcdcfbcf-0ubuntu1.4

---------------
linux-firmware (20230323.gitbcdcfbcf-0ubuntu1.4) lunar; urgency=medium

  * Follow-up: potential S3 issue for amdgpu Navi 31/Navi33 (LP: #2027959)
    - amdgpu: update GC 11.0.1 firmware for amd.5.5 release
    - amdgpu: update GC 11.0.4 firmware for amd.5.5 release
    - amdgpu: Update GC 11.0.1 and 11.0.4
  * Add firmware files for HP G10 series laptops (LP: #2023193)
    - cirrus: Add firmware and tuning files for HP G10 series laptops

linux-firmware (20230323.gitbcdcfbcf-0ubuntu1.3) lunar; urgency=medium

  * AMD Rembrandt / Phoenix PSR-SU related freezes (LP: #2024774)
    - SAUCE: DMCUB updates for DCN314 and Yellow Carp
  * potential S3 issue for amdgpu Navi 31/Navi33 (LP: #2024427)
    - amdgpu: update GC 11.0.0 firmware for amd.5.5 release
    - amdgpu: update GC 11.0.2 firmware for amd.5.5 release

 -- Juerg Haefliger <email address hidden> Wed, 19 Jul 2023 10:46:52 +0200

Changed in linux-firmware (Ubuntu Lunar):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Lunar):
status: In Progress → Fix Committed
Timo Aaltonen (tjaalton) wrote :

NOTE:

The updated firmware regressed (at least) rx7900xtx (navi31) so that it fails to boot, and the updated kernel does not fix that. This means that the firmware update will be reverted.

Timo Aaltonen (tjaalton) wrote :
Timo Aaltonen (tjaalton) wrote :

actually, the kernel to have the fixes is for the next cycle so the fixed kernel won't get in -updates before 04-Sep...

Steve Langasek (vorlon) wrote :

Because the linux/lunar changes had not landed, linux-firmware has been rolled back in both jammy and lunar per LP: #2029396. It appears from the comment history that the jammy revert may not have been necessary. Nevertheless it has now happened, so setting the bug state accordingly.

Changed in linux-firmware (Ubuntu Lunar):
status: Fix Released → Triaged
Changed in linux-firmware (Ubuntu Jammy):
status: Fix Released → Triaged
Mario Limonciello (superm1) wrote :

> It appears from the comment history that the jammy revert may not have been necessary.

As 6.2 is about to be the default for point release media in Jammy and pulling in the kernel commits is off the table, I do think for THIS bug it was pragmatic. bug 2027959 is another story.

summary: - potential S3 issue for amdgpu Navi 31/Navi33
+ S3 stress issue for amdgpu Navi 31/Navi33
Mario Limonciello (superm1) wrote :

I looked at the latest kernel in lunar-proposed (6.2.0-30.30) and I still don't see all the needed patches.

It has these two:
    - drm/amd/amdgpu: introduce gc_*_mes_2.bin v2
    - drm/amdgpu: reserve the old gc_11_0_*_mes.bin

But it's missing these:

ac2f5739fdca drm/amdgpu/mes11: enable reg active poll
a2fe4534bb38 drm/amd/amdgpu: update mes11 api def
09bf14907d86 drm/amdgpu: declare firmware for new MES 11.0.4

Mario Limonciello (superm1) wrote :

I'm sorry; but I'll need to retract comment #30. Those other 3 commits were merged in at a previous time, so I didn't find them in the changelog. I looked at git history and they ARE present in the lunar kernel. I believe it's safe to upgrade the firmware now with 6.2.0-30.30 or later.
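
Since backported commits carry different hashes than upstream and may not show up in the latest changelog entry, one way to check a kernel tree is to search its git history by commit subject. A minimal sketch, assuming a local clone of the Ubuntu kernel tree; the helper name has_commit_subject and the clone path in the example are hypothetical.

```shell
# Hypothetical helper: succeed if any commit reachable from REV (default HEAD)
# in REPO contains SUBJECT in its message (fixed-string match, not a regex).
has_commit_subject() {
    repo="$1"
    subject="$2"
    rev="${3:-HEAD}"
    [ -n "$(git -C "$repo" log -1 --fixed-strings --grep="$subject" --format=%h "$rev")" ]
}

# Example (clone path is an assumption):
#   for s in "drm/amdgpu/mes11: enable reg active poll" \
#            "drm/amd/amdgpu: update mes11 api def" \
#            "drm/amdgpu: declare firmware for new MES 11.0.4"; do
#       has_commit_subject ~/src/lunar "$s" && echo "present: $s" || echo "missing: $s"
#   done
```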

Juerg Haefliger (juergh)
Changed in linux-firmware (Ubuntu Lunar):
status: Triaged → Fix Committed
Changed in linux-firmware (Ubuntu Jammy):
status: Triaged → Fix Committed
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello You-Sheng, or anyone else affected,

Accepted linux-firmware into lunar-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20230323.gitbcdcfbcf-0ubuntu1.7 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-lunar to verification-done-lunar. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-lunar. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Timo Aaltonen (tjaalton) wrote :

Hello You-Sheng, or anyone else affected,

Accepted linux-firmware into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-firmware/20220329.git681281e4-0ubuntu3.19 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Mario Limonciello (superm1) wrote :

AMD internal team has tested this updated firmware package against Navi31 and Navi33 dGPUs on both OEM-6.1 and OEM-6.5 kernels. No new problems introduced.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-firmware - 20220329.git681281e4-0ubuntu3.19

---------------
linux-firmware (20220329.git681281e4-0ubuntu3.19) jammy; urgency=medium

  * Missing firmware for Intel VPU on Intel Meteor Lake platforms (LP: #2031882)
    - SAUCE: Add firmware for Intel VPU on Meteor Lake platforms
  * Support Realtek RTL8852CE WiFi 6E/BT Combo (LP: #2025672)
    - rtl_bt: Add firmware v2 file for RTL8852C
    - rtw89: 8852c: update fw to v0.27.56.8
    - rtw89: 8852c: update fw to v0.27.56.9
    - rtw89: 8852c: update fw to v0.27.56.10
    - rtw89: 8852c: update fw to v0.27.56.13
  * S3 stress issue for amdgpu Navi 31/Navi33 (LP: #2024427)
    - amdgpu: update GC 11.0.0 firmware for amd.5.5 release
    - amdgpu: update GC 11.0.2 firmware for amd.5.5 release
  * Support mipi camera on Intel Meteor Lake platform (LP: #2031412)
    - SAUCE: Update Intel IPU6 firmware

 -- Juerg Haefliger <email address hidden> Fri, 22 Sep 2023 15:10:51 +0200

Changed in linux-firmware (Ubuntu Jammy):
status: Fix Committed → Fix Released
Jeroen Webb (jeroenwebb) wrote :

I upgraded to linux-firmware_20220329.git681281e4-0ubuntu3.19 last night and I couldn't start my computer without going into recovery mode this morning. There were a few warnings about missing firmware in the apt logs, but when I googled them, the usual response is "that's normal".

I played around with a pretty good chunk of settings in BIOS, such as Resizable BAR, but it consistently hangs right after "JPEG decode is enabled in VM mode".

In recovery mode, I only have access to one monitor; I believe that's by design.

GPU: RX 7900 XT
Kernel: 6.2.0-34-generic

I installed my other GPU during my lunch break, an RX 6700 XT, and it booted up just fine in normal mode.

I also tried linux-firmware_20220329.git681281e4-0ubuntu3.20 and it has the same behavior.

W: Possible missing firmware /lib/firmware/amdgpu/ip_discovery.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/vega10_cap.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/sienna_cichlid_cap.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/navi12_cap.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/aldebaran_cap.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_0_toc.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/sienna_cichlid_mes1.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/sienna_cichlid_mes.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/navi10_mes.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_3_mes.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_2_mes_2.bin for module amdgpu
W: Possible missing firmware /lib/firmware/amdgpu/gc_11_0_0_mes_2.bin for module amdgpu

Mario Limonciello (superm1) wrote :

Can you please upload your journal from the failure?

Also can you please experiment with rolling back to this one version of this one firmware binary:

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu/gc_11_0_0_imu.bin?id=e32209f07556427c9b0b841bfb76ca71e8beab05

Place that in /lib/firmware/updates/amdgpu and rebuild your initramfs and then reboot.
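
The steps above could look roughly like the following on Ubuntu. This is a sketch under assumptions: the blob is taken to be already downloaded to the current directory, the helper name install_fw_override is hypothetical, and /lib/firmware/updates is used because the kernel firmware loader searches it before /lib/firmware.

```shell
# Hypothetical helper: copy a firmware blob into the override directory
# ($2, default /lib/firmware/updates) under amdgpu/, creating it if needed.
install_fw_override() {
    src="$1"
    dest_root="${2:-/lib/firmware/updates}"
    install -D -m 0644 "$src" "$dest_root/amdgpu/$(basename "$src")"
}

# Example (run as root), followed by the initramfs rebuild and reboot:
#   install_fw_override ./gc_11_0_0_imu.bin
#   update-initramfs -u -k all
#   reboot
```

Removing the file from the updates directory and rebuilding the initramfs again restores the packaged firmware.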

Jeroen Webb (jeroenwebb) wrote :

I believe this is the right journal. It has a couple of amdgpu call stacks.

I had a lot of restarts trying to get stuff working.

I'll try out that specific file later today.

Jeroen Webb (jeroenwebb) wrote :

My computer booted just fine with that file in /lib/firmware/amdgpu. I downloaded it again after booting, just to make sure that it was the right file version.

% ls -l /lib/firmware/amdgpu/gc_11_0_0_imu.bin
-rw-r--r-- 1 root root 132352 Oct 7 09:36 /lib/firmware/amdgpu/gc_11_0_0_imu.bin

% sha1sum /lib/firmware/amdgpu/gc_11_0_0_imu.bin
2587fa941d4645e5e38ce4067f630a5c6d51bc23 /lib/firmware/amdgpu/gc_11_0_0_imu.bin

% sha1sum Downloads/gc_11_0_0_imu\(1\).bin Downloads/gc_11_0_0_imu.bin
2587fa941d4645e5e38ce4067f630a5c6d51bc23 Downloads/gc_11_0_0_imu(1).bin
2587fa941d4645e5e38ce4067f630a5c6d51bc23 Downloads/gc_11_0_0_imu.bin

% dpkg -l | grep firmware
ii amd64-microcode 3.20191218.1ubuntu2.2 amd64 Processor microcode firmware for AMD CPUs
ii firmware-sof-signed 2.0-1ubuntu4.1 all Intel SOF firmware - signed
ii intel-microcode 3.20230808.0ubuntu0.22.04.1 amd64 Processor microcode firmware for Intel CPUs
ii linux-firmware 20220329.git681281e4-0ubuntu3.19 all Firmware for Linux kernel drivers

Mario Limonciello (superm1) wrote :

Thanks, let me check with others on Monday if that's the right action for your issue.

Changed in linux-firmware (Ubuntu Jammy):
status: Fix Released → Triaged
Mario Limonciello (superm1) wrote :
Alvin Huan (alvin-huan) wrote :

Please help fix this issue asap. Thanks!

https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/2038745

[Impact]

NV31 XTX/XTW cannot boot into Ubuntu 22.04.3 after upgrading distro to latest Ubuntu 22.04.3 kernel and firmware

Not reproducible with NV21 XL or W6800

[Fix]
Cherry-pick the latest PMFW https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu/smu_13_0_0.bin

[Other Info]

The IMU FW included in Linux firmware package needs to be paired with the latest PMFW smu_13_0_0.bin for Navi31

Jammy firmware updates only include latest IMU firmware.

https://www.ubuntuupdates.org/package/core/jammy/main/updates/linux-firmware

20220329.git681281e4-0ubuntu3.19

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu/gc_11_0_0_imu.bin
(Missing) https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu/smu_13_0_0.bin

Mario Limonciello (superm1) wrote :
Changed in linux-firmware (Ubuntu Jammy):
status: Triaged → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-firmware - 20230323.gitbcdcfbcf-0ubuntu1.7

---------------
linux-firmware (20230323.gitbcdcfbcf-0ubuntu1.7) lunar; urgency=medium

  * linux-firmware is outdated (LP: #2033441)
    - nvidia: update Tu10x and Tu11x signed firmware to support newer Turing HW
  * S3 stress issue for amdgpu Navi 31/Navi33 (LP: #2024427)
    - amdgpu: update GC 11.0.0 firmware for amd.5.5 release
    - amdgpu: update GC 11.0.2 firmware for amd.5.5 release

 -- Juerg Haefliger <email address hidden> Fri, 22 Sep 2023 15:04:35 +0200

Changed in linux-firmware (Ubuntu Lunar):
status: Fix Committed → Fix Released
You-Sheng Yang (vicamo)
tags: added: originate-from-2033369