HWE-22.04 Kernel 5.19 breaks suspend/wake on newer AMD Ryzen CPUs

Bug #2007718 reported by Roemer Claasen
This bug affects 11 people
Affects: linux-hwe-5.19 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

After 22.04 LTS upgraded the linux-generic-hwe kernel to 5.19.0.32 (from 5.15.0.60), suspend/wake-up is broken on my laptop.

The system refuses to wake up in about 50% of cases, seemingly at random. The only "solution" is a forced system power down.

Reverting back to 5.15.0.60 resolves this issue immediately.

I suspect some patch was not re-applied to 5.19, since there are many postings on Reddit (for instance) about sleep problems with newer AMD CPUs, but 5.15 was running beautifully.

My machine is a Lenovo Thinkpad T14s AMD Gen 3, AMD 6850u CPU.

UPDATE:

See attached log. This may not be a case of the laptop failing to resume, but a case of the laptop not suspending properly!

What happened was the following. When packing up my things, I closed the laptop and put it in my bag. At home I picked it up again, and immediately noticed how unusually hot it was. The laptop didn't resume properly on opening the lid, but was frozen as before.

So this might actually be a case of the laptop FAILING TO SUSPEND properly, instead of failing to wake up.

UPDATE:

I just noticed the battery was down to the low 40s (percent) from at least 80%. Given the temperature on opening the laptop, I suspect the CPU was working hard during "sleep time".

EDIT:

Changed title after confirmation for earlier Ryzen 5000 CPU.

description: updated
Roemer Claasen (rclaasen) wrote :

Installing mainline 6.1 resolves the issues again. Now running:

Linux 6.1.12-060112-generic #202302141939 SMP PREEMPT_DYNAMIC Tue Feb 14 19:45:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

UPDATE:
Mainline 6.1 breaks other things. The screen flickers from time to time, which is more annoying than the sleep/resume problems. Back on 5.15 for now.

Michal Zubkowicz (michalzubkowicz) wrote :

I can confirm this regression on LG Gram 17 with Intel CPU

Roemer Claasen (rclaasen) wrote :

kern.log attached, suspend on line 59124

See above: it might actually be the case the laptop isn't suspending properly, instead of failing to wake up.
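
One rough way to check this from the log is to compare how many suspend entries and exits the kernel recorded. This is only a sketch: `check_suspend_log` is a made-up helper name, and it assumes the kernel logs the usual "PM: suspend entry" / "PM: suspend exit" messages to the file you pass in.

```shell
#!/bin/sh
# check_suspend_log [LOGFILE]  (hypothetical helper, not part of any tool)
# Counts suspend entry and exit messages in a kernel log.
# Assumption: the kernel prints "PM: suspend entry" and "PM: suspend exit".
check_suspend_log() {
    log="${1:-/var/log/kern.log}"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    entries=$(grep -c 'PM: suspend entry' "$log" || true)
    exits=$(grep -c 'PM: suspend exit' "$log" || true)
    echo "entries=$entries exits=$exits"
}
```

More entries than exits would suggest at least one suspend that never reached a clean resume. `grep -n 'PM: suspend' /var/log/kern.log` shows the matching line numbers.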

description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-meta-hwe-5.19 (Ubuntu):
status: New → Confirmed
Steven Maude (stevenmaude) wrote :

This also affects at least some older Ryzens too. I have a 5800u that is affected by this bug.

As mentioned, the previous 5.15 kernel still suspends and resumes reliably.

Roemer Claasen (rclaasen) wrote :

For completeness: Apport data

summary: - HWE-22.04 Kernel 5.19 breaks suspend/wake on newer AMD Ryzen 6000u
+ HWE-22.04 Kernel 5.19 breaks suspend/wake on newer AMD Ryzen CPUs
description: updated
Tim Janes (timwj) wrote :
tags: added: amdgpu jammy regression-release suspend-resume
Adrian Grzeca (lycanananas) wrote :

I have the same problem on Alder Lake Intel. After resume there are no logs in the journal. I have an Nvidia P2000 GPU.

Vendor ID: GenuineIntel
  Model name: 12th Gen Intel(R) Core(TM) i7-12700KF

Daniel van Vugt (vanvugt) wrote :

Adrian, please open a separate bug from your machine by running:

  ubuntu-bug linux

affects: linux-meta-hwe-5.19 (Ubuntu) → linux-hwe-5.19 (Ubuntu)
Christopher Townsend (townsend) wrote :

I'm also experiencing this on my Lenovo Gen 3 T14s w/ Ryzen CPU :(

Andreas Knoben (andreasknoben1) wrote :

Same issue on Ubuntu 22.10 running kernel 5.19.0-35-generic, on a Lenovo IdeaPad 5 14ARE05 with an AMD Ryzen 7 4700U. The power light is not pulsing when the problem occurs, which indicates the system is not suspended. I also have not been able to discern a pattern as to when the problem does or does not occur.

Nobuto Murata (nobuto) wrote :

Just to give another data point, I've been running 5.19 from hwe-edge (instead of hwe) for some time but I realized this suspend issue after updating hwe-edge from 5.19.0.28.29~22.04.6 to 5.19.0.32.33~22.04.9.

So I went back and tested it as follows. It looks like it's a regression within 5.19. My 2 cents.

machine: ThinkPad T14 Gen 3 (21CFCTO1WW)
CPU: AMD Ryzen 7 PRO 6850U
Ubuntu release: jammy

Please note that I tested it with Secure boot *disabled* since 5.19.0-28 wouldn't boot somehow (I believe it was working fine with Secure boot enabled before).

$ uname -rv
5.19.0-35-generic #36~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 17 15:17:25 UTC 2

-> stuck at the second attempt of suspend

$ uname -rv
5.19.0-32-generic #33~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Jan 30 17:03:34 UTC 2

-> stuck at the second attempt of suspend

$ uname -rv
5.19.0-28-generic #29~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Dec 15 12:05:40 UTC 2

-> 10/10 suspend-resume cycles succeeded
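
A repeatable way to run this kind of test is sketched below. It is not necessarily how the results above were obtained. It assumes `rtcwake` from util-linux is installed; `run_suspend_cycles`, `SUSPEND_CMD` and `CYCLE_PAUSE` are names made up for this example.

```shell
#!/bin/sh
# run_suspend_cycles [COUNT]  (hypothetical helper)
# Suspends the machine COUNT times, letting the RTC alarm wake it each time.
# Assumption: "rtcwake -m mem -s 30" suspends and wakes again after 30 seconds.
# SUSPEND_CMD and CYCLE_PAUSE are illustrative overrides, not standard variables.
run_suspend_cycles() {
    count="${1:-10}"
    cmd="${SUSPEND_CMD:-sudo rtcwake -m mem -s 30}"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "cycle $i/$count"
        # $cmd is left unquoted on purpose so it splits into command + arguments
        $cmd || { echo "cycle $i failed"; return 1; }
        # give the system a moment to settle before suspending again
        sleep "${CYCLE_PAUSE:-10}"
        i=$((i + 1))
    done
    echo "$count/$count suspend-resume cycles succeeded"
}
```

If the machine gets stuck, the loop simply never prints the next cycle line, so the last line on screen shows which attempt hung.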

tags: added: fixed-in-linux-6.1 fixed-upstream
Marek Stasiak (marecki) wrote :

Same problem on my PC with NVIDIA GTX 770

I reverted to 5.15 and issue is gone. Good to know it is fixed in 6.1.

[System]
OS: Ubuntu 22.04 Jammy Jellyfish
Arch: x86_64
Kernel: 5.15.0-67-generic (after reverting)
Desktop: ubuntu:GNOME
Display Server: x11

[CPU]
Vendor: GenuineIntel
Model: 12th Gen Intel(R) Core(TM) i5-12400F

[Graphics]
OpenGL Renderer: NVIDIA GeForce GTX 770/PCIe/SSE2
OpenGL Version: 4.6.0 NVIDIA 470.161.03

Daniel van Vugt (vanvugt) wrote :

Ubuntu 22.04 users can try kernel 6.1 by:

  sudo apt install linux-oem-22.04c

Roemer Claasen (rclaasen) wrote :

6.1 from linux-oem-22.04c indeed fixes this suspend/ resume issue for me.

6.1 from linux-oem-22.04c doesn't feel as mature as 5.15 though, there is some repaint/ flickering screen bug that I haven't seen in either 5.15 or 5.19 (but was also present in mainline 6.1). It does seem somewhat more efficient on battery (purely subjective "measurement").

Daniel van Vugt (vanvugt) wrote :

Does /sys/kernel/debug/dri/0/eDP-1/psr_capability (or something similar) say that PSR (panel self-refresh) is enabled? If so then usually disabling PSR will fix such bugs.
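
Since the exact path can differ per machine, a small loop over all eDP connectors may help. This is a sketch only: `show_psr` is a made-up helper name, and it assumes the `psr_capability` and `psr_state` files mentioned in this thread exist under debugfs, which normally requires root to read.

```shell
#!/bin/sh
# show_psr [DRI_DEBUGFS_DIR]  (hypothetical helper)
# Prints psr_capability and psr_state for every eDP connector it can read.
# Assumption: debugfs is mounted at /sys/kernel/debug and you are root.
show_psr() {
    base="${1:-/sys/kernel/debug/dri}"
    for f in "$base"/*/eDP-*/psr_capability "$base"/*/eDP-*/psr_state; do
        # skip unmatched globs and unreadable files
        [ -r "$f" ] || continue
        echo "== $f"
        cat "$f"
    done
}
```

Run it from a root shell (for example after `sudo -s`), because a shell function cannot be passed to `sudo` directly.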

Roemer Claasen (rclaasen) wrote :

Hi Daniel, thanks for the suggestion!

My current setup:

$ uname -rv
6.1.0-1007-oem #7-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 8 15:41:05 UTC 2023

$ cat /sys/kernel/debug/dri/0/eDP-1/psr_capability
Sink support: yes [0x03]
Driver support: yes [0x01]

$ cat /sys/kernel/debug/dri/0/eDP-1/psr_state
6

I was sort of expecting psr_state to be either 0 or 1?! I can't immediately find documentation and have no time right now, but I will do some experiments later and let you know the results.

BTW this seems unrelated to the suspend/resume, and since 6.1 is still not officially released by Ubuntu, I'm not filing another bug report. Should I?!

UPDATE:

Disabling PSR (Panel Self Refresh) indeed fixes the screen flickering. 6.1 from linux-oem-22.04c is now running without issues.
Disabling can be done by adding a kernel boot parameter:

 - open `/etc/default/grub`
 - add "amdgpu.dcdebugmask=0x10" as kernel boot parameter to `GRUB_CMDLINE_LINUX_DEFAULT` (on my system this now reads "quiet splash amdgpu.dcdebugmask=0x10")
 - call `sudo update-grub`
 - reboot
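
The manual edit above could also be scripted roughly as follows. This is a sketch under assumptions: `add_grub_param` is a made-up helper, it expects `GRUB_CMDLINE_LINUX_DEFAULT` to be a double-quoted value on a single line, and it relies on GNU `sed -i` as shipped with Ubuntu.

```shell
#!/bin/sh
# add_grub_param FILE PARAM  (hypothetical helper)
# Appends PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE unless it is already there.
# A backup of the original file is kept as FILE.bak.
add_grub_param() {
    file="$1"
    param="$2"
    # do nothing if the parameter is already on the default command line
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        echo "already set"
        return 0
    fi
    cp "$file" "$file.bak"
    # assumes PARAM contains no '|', '&' or backslash characters
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" "$file"
}
```

From a root shell this would be `add_grub_param /etc/default/grub amdgpu.dcdebugmask=0x10`, followed by `update-grub` and a reboot as described above.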

Daniel van Vugt (vanvugt) wrote :

Yes the "repaint/ flickering screen bug" can be reported by running:

  ubuntu-bug linux-oem-22.04c

Roemer Claasen (rclaasen) wrote :
Mario Limonciello (superm1) wrote :

> Just to give another data point, I've been running 5.19 from hwe-edge (instead of hwe) for some time but I realized this suspend issue after updating hwe-edge from 5.19.0.28.29~22.04.6 to 5.19.0.32.33~22.04.9.

I had a try with the latest 5.19 HWE kernel on a Lenovo Z13 I have on hand (with a similar APU) but can't seem to reproduce the issue reported.

Since I can't reproduce it myself, and assuming this data point is correct, I think ideally this should be bisected between these two git tags to figure out why this is happening:
Ubuntu-hwe-5.19-5.19.0-28.29_22.04.1
Ubuntu-hwe-5.19-5.19.0-32.33_22.04.1

You can clone from https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy to get those two tags to bisect with.
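
For anyone who hasn't bisected before, the start of that process might look roughly like this. It is a sketch, not a full recipe: `start_bisect` is a made-up wrapper, and building, installing and suspend-testing each kernel that git checks out is left out.

```shell
#!/bin/sh
# start_bisect REPO_DIR GOOD_REF BAD_REF  (hypothetical wrapper)
# Starts a git bisect session in REPO_DIR between a known-good and a known-bad ref.
start_bisect() {
    repo="$1"
    good="$2"
    bad="$3"
    # "git bisect start <bad> <good>" marks both ends and checks out a midpoint
    git -C "$repo" bisect start "$bad" "$good"
}

# Example, using the URL and tags quoted in this comment:
#   git clone https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy
#   start_bisect jammy Ubuntu-hwe-5.19-5.19.0-28.29_22.04.1 Ubuntu-hwe-5.19-5.19.0-32.33_22.04.1
```

After testing each checked-out kernel, mark it with `git bisect good` or `git bisect bad` inside the repository until git names the first bad commit, then run `git bisect reset`.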

> 6.1 from linux-oem-22.04c indeed fixes this suspend/ resume issue for me.

AFAICT 5.19 was EOL already upstream at that time, so Canonical was tracking patches from 6.0.y to bring into their 5.19 kernel. I /suspect/ what happened is by doing this some dependency commits from 6.0 are missing. If you can identify which commit caused the problem it will help a lot to figure out what else is missing.

Assuming it's an amdgpu regression (which we don't know right now) the one that looks most relevant is:

89bedb408c48 drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resume

So another option is for one of the Canonical kernel guys to try to build a 5.19 test kernel with that reverted. If that helps I'll look for the other related commits to this.

Marcelo (lopezregaelbrujo) wrote :

Hi
Installing 6.1 linux-oem-22.04c fixes the second suspend/resume for me (the first suspend always works).
  sudo apt install linux-oem-22.04c

6.1.0-1008-oem #8-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 3 10:51:21 UTC 2023

HUAWEI

/0 bus BOM-WXX9-PCB-B2
/0/0 memory 128KiB BIOS
/0/4 processor AMD Ryzen 5 5500U with Radeon Graphics
/0/4/5 memory 384KiB L1 cache
/0/4/6 memory 3MiB L2 cache
/0/4/7 memory 8MiB L3 cache
/0/d memory 8GiB System Memory
/0/d/0 memory 4GiB Row of chips DDR4 Synchronous Unbuffered (Unregistered) 3200 MHz (0.3 ns)
/0/d/1 memory 4GiB Row of chips DDR4 Synchronous Unbuffered (Unregistered) 3200 MHz (0.3 ns)
/0/100 bridge Renoir/Cezanne Root Complex
/0/100/0.2 generic Renoir/Cezanne IOMMU
/0/100/2.2 bridge Renoir/Cezanne PCIe GPP Bridge
/0/100/2.2/0 wlp1s0 network RTL8822CE 802.11ac PCIe Wireless Network Adapter
/0/100/2.4 bridge Renoir/Cezanne PCIe GPP Bridge ...
