HDMI output freezes under current/proposed impish kernels
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-raspi (Ubuntu) | Fix Released | Critical | Unassigned |
Impish | Fix Released | Critical | Unassigned |
Bug Description
[Impact]
The UI of the Impish desktop for raspi occasionally freezes. The freeze can be triggered by playing videos and moving windows around, and seems to happen more frequently at higher resolutions and/or higher refresh rates (above 1920x1080@60Hz).
Kernel errors are along the lines of:
[ 513.762138] [drm:drm_
[ 513.762288] [drm:drm_
mmit wait timed out
[ 513.762381] [drm:drm_
done timed out
[Test Case]
Install current Impish desktop for raspi on a Pi 4. Use a high-refresh/
[Fix]
The Pi Foundation identified a couple of upstream vc4 commits that seem to interfere with their downstream patches. Revert those to match the Pi Foundation's currently released 5.10 kernel.
[Where Problems Could Occur]
The changes are confined to the vc4 driver, so problems would only show up if that driver is in use. That is only the case for the Ubuntu raspi desktop image, so server images should not be impacted at all.
[Original Description]
Under the current (5.13.0-1007.8) or proposed (5.13.0-1008.9) kernels for the Ubuntu Pi pre-installed desktop impish release, the HDMI output occasionally freezes. A known workaround at this time is to change the following line in /boot/firmware/
dtoverlay=
To the following:
dtoverlay=
In other words, to use the "fake" KMS overlay (fkms) instead of the "full" KMS overlay (kms).
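The workaround amounts to a one-line substitution in the firmware configuration file. A minimal sketch in Python follows; it operates on the file's text rather than on a path, because the path is truncated in this report. The function name `swap_overlay` is hypothetical, and the overlay names in the usage lines are an assumption (the stock Raspberry Pi `vc4-kms-v3d` and `vc4-fkms-v3d` overlays), since the report's own `dtoverlay=` lines are cut off.

```python
def swap_overlay(config_text: str, old: str, new: str) -> str:
    """Return config_text with `dtoverlay=<old>` lines changed to `dtoverlay=<new>`.

    Parameters after a comma (e.g. `dtoverlay=name,param=1`) are preserved.
    Commented-out lines and other overlays are left untouched.
    """
    prefix = "dtoverlay=" + old
    out = []
    for line in config_text.splitlines(keepends=True):
        stripped = line.strip()
        # Match the overlay name exactly: it must end the line or be followed by ','.
        if stripped == prefix or stripped.startswith(prefix + ","):
            line = line.replace(prefix, "dtoverlay=" + new, 1)
        out.append(line)
    return "".join(out)


# Assumed overlay names; they are not taken from the truncated report text.
sample = "# display\ndtoverlay=vc4-kms-v3d\ndtoverlay=dwc2,dr_mode=host\n"
patched = swap_overlay(sample, "vc4-kms-v3d", "vc4-fkms-v3d")
```

Matching the full overlay name, rather than doing a plain substring replace, avoids rewriting a different overlay whose name merely begins with the same characters.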
I've been unable to determine a reliable method of guaranteeing a freeze, but it appears to occur much more readily when video playback is occurring and when other interactions (especially moving windows around, minimizing, restoring) occur simultaneously. Display suspend also periodically causes the same hang, which made me suspect this might be related to #1944397, but it appears that had a separate cause (now resolved).
The following dmesg outputs have been observed immediately after the display hang; this one from 1007.8:
[ 513.762138] [drm:drm_
[ 513.762288] [drm:drm_
mmit wait timed out
[ 513.762381] [drm:drm_
done timed out
[ 524.002211] [drm:drm_
[ 524.002404] [drm:drm_
] commit wait timed out
[ 534.242499] [drm:drm_
[ 534.242657] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
[ 534.250685] ------------[ cut here ]------------
[ 534.250701] refcount_t: underflow; use-after-free.
[ 534.250735] WARNING: CPU: 1 PID: 120 at lib/refcount.c:87 refcount_
[ 534.250758] Modules linked in: rfcomm cmac algif_hash algif_skcipher af_alg hci_uart btqca btrtl btbcm btintel bnep snd_soc_hdmi_codec vc4 btsdio snd_soc_core input_leds bluetooth snd_compress snd_bcm2835(C) snd_pcm_dmaengine ecdh_generic ecc snd_pcm brcmfmac snd_seq_midi snd_seq_midi_event bcm2835_codec(C) bcm2835_isp(C) bcm2835_v4l2(C) brcmutil snd_rawmidi v4l2_mem2mem bcm2835_ videobuf2_vmalloc cfg80211 videobuf2_memops videobuf2_v4l2 snd_seq videobuf2_common videodev snd_seq_device mc snd_timer vc_sm_cma(C) raspberrypi_hwmon snd bcm2835_gpiomem rpivid_mem uio_pdrv_genirq uio sch_fq_codel ip_tables x_tables autofs4 btrfs blake2b_generic xor xor_neon zstd_compress hid_generic usbhid raid6_pq libcrc32c dm_mirror dm_region_hash dm_log spidev dwc2 v3d roles udc_core gpu_sched crct10dif_ce i2c_brcmstb i2c_bcm2835 spi_bcm2835 drm_kms_helper syscopyarea xhci_pci xhci_pci_renesas sysfillrect sysimgblt fb_sys_fops cec drm phy_generic ac97_bus aes_arm64
[ 534.251066] CPU: 1 PID: 120 Comm: kworker/1:2 Tainted: G WC 5.13.0-1007-raspi #8-Ubuntu
[ 534.251076] Hardware name: Raspberry Pi 400 Rev 1.1 (DT)
[ 534.251083] Workqueue: events drm_mode_
[ 534.251239] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 534.251248] pc : refcount_
[ 534.251257] lr : refcount_
[ 534.251265] sp : ffff8000118cbb50
[ 534.251269] x29: ffff8000118cbb50 x28: ffff6cf7ee894400 x27: ffff6cf80502e000
[ 534.251285] x26: ffff6cf80502e000 x25: 0000000000000006 x24: ffff6cf7ee94c500
[ 534.251300] x23: ffffa94dff246018 x22: ffff6cf833068880 x21: ffff6cf805027c80
[ 534.251314] x20: ffff6cf88ead75ac x19: ffff6cf88ead7400 x18: 0000000000000000
[ 534.251328] x17: 0000000000000000 x16: ffffa94e0a243314 x15: 0000000000000000
[ 534.251342] x14: 0000000000000000 x13: 0000000000000030 x12: ffff800010035000
[ 534.251356] x11: ffffa94e0b30dfd0 x10: 00000000fffff000 x9 : ffffa94e09d09f54
[ 534.251370] x8 : 00000000ffffefff x7 : ffffa94e0b30dfd0 x6 : 0000000000000000
[ 534.251384] x5 : ffff6cf8b799f948 x4 : 0000000000000000 x3 : 0000000000000027
[ 534.251397] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6cf800abec80
[ 534.251411] Call trace:
[ 534.251416] refcount_
[ 534.251424] vc4_bo_
[ 534.251473] vc4_cleanup_
[ 534.251518] drm_atomic_
[ 534.251604] vc4_atomic_
[ 534.251648] commit_
[ 534.251723] drm_atomic_
[ 534.251796] drm_atomic_
[ 534.251936] atomic_
[ 534.252073] drm_framebuffer
[ 534.252209] drm_mode_
[ 534.252346] process_
[ 534.252359] worker_
[ 534.252367] kthread+0x12c/0x140
[ 534.252374] ret_from_
[ 534.252386] ---[ end trace a97341262fc57e44 ]---
And a similar one from 1008.9 (note that most of the time the stack trace *doesn't* appear, so I'm not sure whether it's related to the display freeze itself or to something auxiliary):
[ 221.914617] [drm:drm_
[ 221.914617] [drm:drm_
[ 221.914795] [drm:drm_
[ 232.154711] [drm:drm_
[ 232.154898] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
I can produce this same stack trace, but *without* a corresponding freeze, by manually locking the desktop (Super+L) and waiting for the display fade. After the stack trace appears, one can press a key to bring the display back and log in again without issue. Here are several repeated traces from such activity under the 1008.9 proposed kernel:
[ 1043.431061] [drm:drm_
[ 1043.431136] [drm:drm_
[ 1043.431384] [drm:drm_
[ 1053.671415] [drm:drm_
[ 1053.671705] [drm:drm_
[ 1063.911800] [drm:drm_
[ 1063.912147] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
[ 1181.162739] [drm:drm_
[ 1181.418774] [drm:drm_
[ 1181.419072] [drm:drm_
[ 1191.658996] [drm:drm_
[ 1191.659289] [drm:drm_
[ 1201.899168] [drm:drm_
[ 1201.899438] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
[ 1332.461450] [drm:drm_
[ 1332.461460] [drm:drm_
[ 1332.461717] [drm:drm_
[ 1342.701602] [drm:drm_
[ 1342.701890] [drm:drm_
[ 1352.941798] [drm:drm_
[ 1352.942067] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
Note that even when the display is frozen (and cannot be resurrected), only the display driver appears to crash; the system remains operational (at least for some time) happily responding to SSH login requests or, if video is playing, continuing audio output.
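Because the system stays reachable over SSH after a freeze, the kernel log can be inspected for the messages quoted above. The following is a rough sketch, not a tool used in this report: `scan_dmesg` is a hypothetical helper that takes dmesg text and returns the timestamps of the commit timeout errors and of the refcount warning. The two message strings are copied from the logs in this report; the line format (`[ seconds] message`) is assumed to match those logs.

```python
import re

# "[ 534.242657] message" -> timestamp in seconds since boot, then the message.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")
COMMIT_TIMEOUT = "*ERROR* Timed out waiting for commit"
REFCOUNT_WARNING = "refcount_t: underflow; use-after-free."


def scan_dmesg(text):
    """Return (commit_timeout_times, refcount_warning_times) found in dmesg text."""
    timeouts, warnings = [], []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip lines without a leading timestamp
        stamp, message = float(match.group(1)), match.group(2)
        if COMMIT_TIMEOUT in message:
            timeouts.append(stamp)
        elif REFCOUNT_WARNING in message:
            warnings.append(stamp)
    return timeouts, warnings


# Sample lines taken from the logs in this report.
sample = """\
[ 534.242657] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
[ 534.250685] ------------[ cut here ]------------
[ 534.250701] refcount_t: underflow; use-after-free.
[ 1063.912147] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
"""
timeouts, warnings = scan_dmesg(sample)
```

Reporting the two message types separately reflects the observation above that the commit timeout can appear with or without the refcount stack trace.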
The same occurs with the current linux-firmware-
Changed in linux-raspi (Ubuntu Impish):
importance: Undecided → Critical
description: updated
description: updated
Changed in linux-raspi (Ubuntu Impish):
status: Confirmed → In Progress
Changed in linux-raspi (Ubuntu Impish):
status: Fix Committed → Fix Released
Status changed to 'Confirmed' because the bug affects multiple users.