GPU lockup ring 0 stalled for more than X msec
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
xserver-xorg-video-ati (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
Since the update:
xserver-
which resulted from:
https:/
I've experienced GPU freezes where all video output becomes unresponsive (both Xorg and Ctrl+Alt terminal switching) and the GPU fan goes to full speed. I am still able to access the system via SSH.
Sometimes dmesg ends up full of this message repeating over and over:
radeon 0000:01:00.0: ring 0 stalled for more than 24040msec
radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000009e44 last fence id 0x0000000000009e49 on ring 0)
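For anyone triaging a long dmesg capture full of these repeats, a minimal sketch of how the two messages above could be summarized. The regular expressions are modeled only on the two lines quoted here, and `summarize_lockups` is a hypothetical helper name, not part of any tool. The "outstanding" figure assumes the fence ids are increasing sequence numbers, so that last minus current approximates how many fences had not completed.

```python
import re

# Patterns modeled on the two radeon messages quoted above (assumed format).
STALL_RE = re.compile(r"ring (\d+) stalled for more than (\d+)msec")
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)

def summarize_lockups(lines):
    """Return (stalls, lockups) parsed from an iterable of dmesg lines.

    stalls  : list of (ring, stalled_msec)
    lockups : list of (ring, current_fence, last_fence, outstanding)
    """
    stalls, lockups = [], []
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            stalls.append((int(m.group(1)), int(m.group(2))))
            continue
        m = LOCKUP_RE.search(line)
        if m:
            current = int(m.group(1), 16)
            last = int(m.group(2), 16)
            # Assumption: fence ids increase monotonically.
            lockups.append((int(m.group(3)), current, last, last - current))
    return stalls, lockups

sample = [
    "radeon 0000:01:00.0: ring 0 stalled for more than 24040msec",
    "radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000009e44 "
    "last fence id 0x0000000000009e49 on ring 0)",
]
stalls, lockups = summarize_lockups(sample)
print(stalls)
print(lockups)
```

On the two lines quoted above this reports a 24040 msec stall on ring 0 and a gap of 5 between the current and last fence ids. To run it over a saved log, pass an open file object instead of `sample`.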
I sometimes get a few GPU soft resets, which seem to fail in drm(?):
radeon 0000:01:00.0: Saved 110839 dwords of commands on ring 0.
radeon 0000:01:00.0: GPU softreset: 0x00000008
...
radeon 0000:01:00.0: Wait for MC idle timedout !
radeon 0000:01:00.0: Wait for MC idle timedout !
[drm] PCIE GART of 1024M enabled (table at 0x0000000000162
radeon 0000:01:00.0: WB enabled
radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0x00000000725651ad
radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0x00000000c3678ed8
radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0x00000000dbd9e01b
[drm:r600_
[drm:evergreen
Even if the above reset doesn't happen, this freeze always results in an "unable to handle page fault" BUG in radeon_ring_backup, entered from various call paths, e.g.:
BUG: unable to handle page fault for address: ffffbc2d80574ffc
...
Oops: 0000 [#1] SMP PTI
CPU: 2 PID: 11243 Comm: kworker/2:1H Not tainted 5.5.0-050500-
Workqueue: radeon-crtc radeon_
RIP: 0010:radeon_
Call Trace:
radeon_
radeon_
? __schedule+
process_
worker_
kthread+
? process_
? kthread_
ret_from_
or:
BUG: unable to handle page fault for address: ffffc03901000ffc
...
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 2227 Comm: compton Not tainted 5.3.0-28-generic #30~18.04.1-Ubuntu
RIP: 0010:radeon_
Call Trace:
radeon_
? dma_fence_
? reservation_
radeon_
radeon_
? radeon_
drm_ioctl_
drm_ioctl+
? radeon_
? __switch_
? __switch_
? __switch_
? __switch_
? __switch_
? __switch_
? __switch_
? __switch_
radeon_
do_vfs_
? __schedule+
ksys_
__x64_
do_syscall_
entry_
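One detail that may or may not matter for triage: a quick check of the two faulting addresses quoted in the Oops messages. This sketch assumes 4 KiB pages (the usual x86-64 base page size), and `page_offset` is a hypothetical helper used only for illustration. It shows where within a page each address falls and does not by itself establish the cause of the fault.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages

def page_offset(addr):
    """Return the byte offset of addr within its (assumed 4 KiB) page."""
    return addr & (PAGE_SIZE - 1)

# The two addresses from the "unable to handle page fault" lines above.
fault_addresses = [0xffffbc2d80574ffc, 0xffffc03901000ffc]
offsets = [hex(page_offset(a)) for a in fault_addresses]
print(offsets)
```

Under that assumption both addresses have offset 0xffc, i.e. the last 4 bytes of a page.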
I've tried both 5.3.0-28-generic and 5.5.0-050500-
Nothing specific makes this happen, just regular usage with a compositing window manager. I'm not playing games or particularly exercising the GPU. The last two times I was just reading in a web browser. It has also happened in the middle of the night while I was asleep. Sometimes I have a few days' uptime, sometimes it happens less than 24 hours after boot.
This never happened before the radeon update mentioned on the first line.
I'll attach two files of dmesg output. As per https:/
After happening every day for a week, this hasn't happened again since I logged this bug.
I also disabled Firefox WebRender, so maybe that was a contributor.
I'll re-open if I can provide any useful data.