[amdgpu] ring vcn_dec_0 timeout while using Firefox (with VA-API available)

Bug #2017591 reported by Benedikt
This bug affects 1 person
Affects: linux-signed-hwe-5.19 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

While browsing with Firefox, my screen goes black and freezes. Sometimes the system recovers.

I see the following log entries:

Apr 24 20:58:57 benedikt-ms kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=79735, emitted seq=79736
Apr 24 20:58:57 benedikt-ms kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 7655 thread firefox:cs0 pid 7742
Apr 24 20:58:57 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: GPU reset begin!
Apr 24 20:58:58 benedikt-ms kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Apr 24 20:58:58 benedikt-ms kernel: [drm] Register(0) [mmUVD_RBC_RB_RPTR] failed to reach value 0x000001e0 != 0x00000140
Apr 24 20:58:58 benedikt-ms kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Apr 24 20:58:58 benedikt-ms kernel: [drm] free PSP TMR buffer
Apr 24 20:58:58 benedikt-ms kernel: CPU: 4 PID: 1057 Comm: kworker/u64:16 Kdump: loaded Tainted: G O 5.19.0-40-generic #41~22.04.1-Ubuntu
Apr 24 20:58:58 benedikt-ms kernel: Hardware name: Micro-Star International Co., Ltd MS-7B86/B450-A PRO (MS-7B86), BIOS A.H0 08/08/2022
Apr 24 20:58:58 benedikt-ms kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
Apr 24 20:58:58 benedikt-ms kernel: Call Trace:
Apr 24 20:58:58 benedikt-ms kernel: <TASK>
Apr 24 20:58:58 benedikt-ms kernel: amdgpu 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0x465700 flags=0x0000]
Apr 24 20:58:58 benedikt-ms kernel: show_stack+0x52/0x69
Apr 24 20:58:58 benedikt-ms kernel: dump_stack_lvl+0x49/0x6d
Apr 24 20:58:58 benedikt-ms kernel: dump_stack+0x10/0x18
Apr 24 20:58:58 benedikt-ms kernel: amdgpu 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0x465740 flags=0x0000]
Apr 24 20:58:58 benedikt-ms kernel: amdgpu_do_asic_reset+0x2b/0x441 [amdgpu]
Apr 24 20:58:58 benedikt-ms kernel: amdgpu_device_gpu_recover_imp.cold+0x4f6/0x805 [amdgpu]
Apr 24 20:58:58 benedikt-ms kernel: amdgpu_job_timedout+0x15e/0x190 [amdgpu]
Apr 24 20:58:58 benedikt-ms kernel: ? finish_task_switch.isra.0+0x84/0x290
Apr 24 20:58:58 benedikt-ms kernel: drm_sched_job_timedout+0x6d/0x120 [gpu_sched]
Apr 24 20:58:58 benedikt-ms kernel: process_one_work+0x21f/0x400
Apr 24 20:58:58 benedikt-ms kernel: worker_thread+0x50/0x3f0
Apr 24 20:58:58 benedikt-ms kernel: ? rescuer_thread+0x3a0/0x3a0
Apr 24 20:58:58 benedikt-ms kernel: kthread+0xee/0x120
Apr 24 20:58:58 benedikt-ms kernel: ? kthread_complete_and_exit+0x20/0x20
Apr 24 20:58:58 benedikt-ms kernel: ret_from_fork+0x22/0x30
Apr 24 20:58:58 benedikt-ms kernel: </TASK>
Apr 24 20:58:58 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: MODE1 reset
Apr 24 20:58:58 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: GPU mode1 reset
Apr 24 20:58:58 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: GPU smu mode1 reset
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: GPU reset succeeded, trying to resume
Apr 24 20:58:59 benedikt-ms kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000E10000).
Apr 24 20:58:59 benedikt-ms kernel: [drm] VRAM is lost due to GPU reset!
Apr 24 20:58:59 benedikt-ms kernel: [drm] PSP is resuming...
Apr 24 20:58:59 benedikt-ms kernel: [drm] reserve 0xa00000 from 0x80fe200000 for PSP TMR
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: RAS: optional ras ta ucode is not available
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: SMU is resuming...
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: smu driver if version = 0x0000000d, smu fw if version = 0x0000000f, smu fw program = 0, version = 0x00491a00 (73.26.0)
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: SMU driver if version not matched
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: use vbios provided pptable
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: SMU is resumed successfully!
Apr 24 20:58:59 benedikt-ms kernel: [drm] DMUB hardware initialized: version=0x0202000C
Apr 24 20:58:59 benedikt-ms kernel: [drm] kiq ring mec 2 pipe 1 q 0
Apr 24 20:58:59 benedikt-ms kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: recover vram bo from shadow start
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: recover vram bo from shadow done
Apr 24 20:58:59 benedikt-ms kernel: [drm] Skip scheduling IBs!
Apr 24 20:58:59 benedikt-ms kernel: amdgpu 0000:28:00.0: amdgpu: GPU reset(1) succeeded!
Apr 24 20:58:59 benedikt-ms kernel: [drm] Skip scheduling IBs!
[...]
Apr 24 20:58:59 benedikt-ms kernel: amdgpu_cs_ioctl: 15 callbacks suppressed
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: The CS has been cancelled because the context is lost.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: The CS has been cancelled because the context is lost.
[...]
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[6015]: amdgpu: The CS has been cancelled because the context is lost.
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[6015]: amdgpu: The CS has been cancelled because the context is lost.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[6015]: amdgpu: The CS has been cancelled because the context is lost.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[6015]: [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: The CS has been cancelled because the context is lost.
Apr 24 20:58:59 benedikt-ms gnome-shell[3934]: amdgpu: The CS has been cancelled because the context is lost.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: amdgpu_cs_query_fence_status failed.
Apr 24 20:58:59 benedikt-ms firefox_firefox.desktop[7655]: amdgpu: The CS has been cancelled because the context is lost.
Apr 24 20:58:59 benedikt-ms audit[6015]: SECCOMP auid=1000 uid=1000 gid=1000 ses=5 subj=snap.firefox.firefox pid=6015 comm="Renderer" exe="/snap/firefox/2579/usr/lib/firefox/firefox" sig=0 arch=c000003e syscall=312 compat=0 ip=0x7ff24134073d code=0x50000
Apr 24 20:59:09 benedikt-ms kernel: amdgpu 0000:28:00.0: [drm] *ERROR* [CRTC:57:crtc-0] flip_done timed out
Apr 24 21:00:03 benedikt-ms kernel: amdgpu 0000:28:00.0: [drm] *ERROR* flip_done timed out
Apr 24 21:00:03 benedikt-ms kernel: amdgpu 0000:28:00.0: [drm] *ERROR* [CRTC:57:crtc-0] commit wait timed out
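
For triage, the key fields of the log above can be pulled out with a short script. This is only a sketch: the `summarize` helper, the regex names, and the `SAMPLE` text are my own illustration (not an amdgpu or Ubuntu tool), and the patterns are written only against the message formats that appear in this report.

```python
import re

# Sample lines copied from the log above (timestamp and host prefix removed).
SAMPLE = """\
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=79735, emitted seq=79736
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 7655 thread firefox:cs0 pid 7742
[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
"""

# Patterns assumed from the messages in this report only.
TIMEOUT_RE = re.compile(r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)")
PROCESS_RE = re.compile(r"process (.+?) pid (\d+) thread (\S+) pid (\d+)")
PARSER_RE = re.compile(r"Failed to initialize parser (-\d+)!")


def summarize(log_text):
    """Collect ring timeouts, offending processes and parser error counts."""
    summary = {"timeouts": [], "processes": [], "parser_errors": {}}
    for line in log_text.splitlines():
        m = TIMEOUT_RE.search(line)
        if m:
            signaled, emitted = int(m.group(2)), int(m.group(3))
            summary["timeouts"].append({
                "ring": m.group(1),
                "signaled": signaled,
                "emitted": emitted,
                # Jobs submitted to the ring that never signaled completion.
                "pending": emitted - signaled,
            })
            continue
        m = PROCESS_RE.search(line)
        if m:
            summary["processes"].append({
                "name": m.group(1),
                "pid": int(m.group(2)),
                "thread": m.group(3),
                "tid": int(m.group(4)),
            })
            continue
        m = PARSER_RE.search(line)
        if m:
            code = int(m.group(1))
            counts = summary["parser_errors"]
            counts[code] = counts.get(code, 0) + 1
    return summary


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

For the lines above this reports ring vcn_dec_0 with one pending job (emitted 79736, signaled 79735), submitted by the Firefox "RDD Process" (pid 7655). The parser error -125 is -ECANCELED on Linux, which is consistent with the "context is lost" messages that follow the GPU reset.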

I had to kill it with SysRq. I have VA-API active in Firefox:

libva info: VA-API version 1.7.0
libva info: Trying to open /snap/firefox/2579/gnome-platform/usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_7
libva info: va_openDriver() returns 0
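
When comparing this with other reports, it may help to extract the VA-API version and driver from the libva output in a uniform way. The following is a minimal sketch; `parse_libva_info` and `SAMPLE` are hypothetical names of mine, and the parsing assumes only the four `libva info:` lines shown above.

```python
import re

# The libva output quoted above.
SAMPLE = """\
libva info: VA-API version 1.7.0
libva info: Trying to open /snap/firefox/2579/gnome-platform/usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_7
libva info: va_openDriver() returns 0
"""

PREFIX = "libva info:"


def parse_libva_info(text):
    """Return the VA-API version, driver path and va_openDriver() result."""
    info = {"version": None, "driver_path": None, "open_result": None}
    for line in text.splitlines():
        if not line.startswith(PREFIX):
            continue  # ignore anything that is not a libva info line
        msg = line[len(PREFIX):].strip()
        m = re.match(r"VA-API version (\S+)", msg)
        if m:
            info["version"] = m.group(1)
            continue
        m = re.match(r"Trying to open (\S+)", msg)
        if m:
            info["driver_path"] = m.group(1)
            continue
        m = re.match(r"va_openDriver\(\) returns (-?\d+)", msg)
        if m:
            info["open_result"] = int(m.group(1))
    return info


if __name__ == "__main__":
    print(parse_libva_info(SAMPLE))
```

Here the driver in use is radeonsi_drv_video.so from the Firefox snap, and va_openDriver() returning 0 means the driver loaded successfully.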

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-5.19.0-40-generic 5.19.0-40.41~22.04.1
ProcVersionSignature: Ubuntu 5.19.0-40.41~22.04.1-generic 5.19.17
Uname: Linux 5.19.0-40-generic x86_64
ApportVersion: 2.20.11-0ubuntu82.4
Architecture: amd64
CasperMD5CheckResult: unknown
CurrentDesktop: ubuntu:GNOME
Date: Mon Apr 24 21:03:10 2023
InstallationDate: Installed on 2018-12-01 (1605 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.3)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-signed-hwe-5.19
UpgradeStatus: Upgraded to jammy on 2022-08-20 (246 days ago)

Benedikt (benedikt-klotz) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-signed-hwe-5.19 (Ubuntu):
status: New → Confirmed