Qemu causes system hang

Bug #1828131 reported by Avi
This bug affects 2 people
Affects                               Status      Importance  Assigned to  Milestone
nvidia-graphics-drivers-418 (Ubuntu)  Confirmed   Undecided   Unassigned
qemu (Ubuntu)                         Incomplete  Undecided   Unassigned

Bug Description

I'm trying to use virt-manager (https://virt-manager.org/) with QEMU. I migrated an Ubuntu guest over from VirtualBox, added it to virt-manager, and launched it. The window opens, then the system freezes.

I don't know whether the bug is in QEMU, libvirt, virt-manager, or the NVIDIA driver. The NVIDIA driver (version 430.09) is working fine otherwise.

This reproduces easily every time I try starting a QEMU machine, so I can run any diagnostics while it happens if that helps.

This is from journalctl at the time of the freeze:

May 07 19:35:52 ap /usr/lib/gdm3/gdm-x-session[3400]: (EE) NVIDIA(0): The NVIDIA X driver has encountered an error; attempting to
May 07 19:35:52 ap /usr/lib/gdm3/gdm-x-session[3400]: (EE) NVIDIA(0): recover...
May 07 19:35:52 ap kernel: NVRM: GPU at PCI:0000:01:00: GPU-d9fbb72e-29cb-d4db-ad8f-af242c1a6c15
May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: Class 0x0 Subchannel 0x0 Mismatch
May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x4041b0=0x200000
May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x404000=0x80000002
May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0008, Class 0000902d, Offset 00000860, Data 00000000
May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 32, Channel ID 00000008 intr 02000000
May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 31, Ch 00000009, intr 50000000. MMU Fault: ENGINE CE0 HUBCLIENT_CE0 faulted @ 0x1_005a0000. Fault is of type FAULT_PTE ACCESS_TYPE_READ
May 07 19:35:52 ap /usr/lib/gdm3/gdm-x-session[3400]: (II) NVIDIA(0): Error recovery was successful.
May 07 19:35:52 ap /usr/lib/gdm3/gdm-x-session[3400]: (EE) NVIDIA(0): The NVIDIA X driver has encountered an error; attempting to
May 07 19:35:52 ap /usr/lib/gdm3/gdm-x-session[3400]: (EE) NVIDIA(0): recover...
May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 32, Channel ID 0000000b intr 02000000
May 07 19:36:03 ap kernel: Asynchronous wait on fence NVIDIA:nvidia.prime:10ad97 timed out (hint:intel_atomic_commit_ready+0x0/0x50 [i915])
May 07 19:36:21 ap libvirtd[1863]: internal error: connection closed due to keepalive timeout
May 07 19:36:41 ap kernel: BUG: unable to handle kernel paging request at fffff9a5c3ff8030
May 07 19:36:41 ap kernel: #PF error: [normal kernel read fault]
May 07 19:36:41 ap kernel: PGD 85c9cc067 P4D 85c9cc067 PUD 85c9cb067 PMD 0
May 07 19:36:41 ap kernel: Oops: 0000 [#1] SMP PTI
May 07 19:36:41 ap kernel: CPU: 7 PID: 31632 Comm: worker Tainted: P U W OE 5.1.0-050100-generic #201905052130
May 07 19:36:41 ap kernel: Hardware name: Dell Inc. Precision 5530/0FP2W2, BIOS 1.10.1 04/26/2019
May 07 19:36:41 ap kernel: RIP: 0010:compaction_alloc+0x589/0x890
May 07 19:36:41 ap kernel: Code: 7d b0 41 83 e6 1f 41 83 c6 01 4d 39 fc 73 7b e9 57 01 00 00 4d 89 e2 49 c1 e2 06 4c 03 15 57 b9 18 01 4d 89 d7 4d 85 ff 74 44 <41> 8b 47 30 25 80 00 00 f0 3d 00 00 00 f0 0
May 07 19:36:41 ap kernel: RSP: 0018:ffffb352435b34a0 EFLAGS: 00010286
May 07 19:36:41 ap kernel: RAX: ffffa03dfc7d5d00 RBX: ffffb352435b36a0 RCX: 000000000000003d
May 07 19:36:41 ap kernel: RDX: 80000000000ffe00 RSI: 0000000000000000 RDI: ffffa03dfc7cd1e0
May 07 19:36:41 ap kernel: RBP: ffffb352435b3538 R08: 0000000000000000 R09: ffffa03dfc7d5d00
May 07 19:36:41 ap kernel: R10: fffff9a5c3ff8000 R11: ffffb352435b3719 R12: 80000000000ffe00
May 07 19:36:41 ap kernel: R13: 8000000000100000 R14: 0000000000000020 R15: fffff9a5c3ff8000
May 07 19:36:41 ap kernel: FS: 00007f28c17fa700(0000) GS:ffffa03ddc3c0000(0000) knlGS:0000000000000000
May 07 19:36:41 ap kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 19:36:41 ap kernel: CR2: fffff9a5c3ff8030 CR3: 000000069e550003 CR4: 00000000003626e0
May 07 19:36:41 ap kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 07 19:36:41 ap kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 07 19:36:41 ap kernel: Call Trace:
May 07 19:36:41 ap kernel: migrate_pages+0x107/0xb40
May 07 19:36:41 ap kernel: ? move_freelist_tail+0xd0/0xd0
May 07 19:36:41 ap kernel: ? isolate_freepages_block+0x370/0x370
May 07 19:36:41 ap kernel: compact_zone+0x752/0xd70
May 07 19:36:41 ap kernel: compact_zone_order+0xd8/0x120
May 07 19:36:41 ap kernel: try_to_compact_pages+0xb0/0x260
May 07 19:36:41 ap kernel: __alloc_pages_direct_compact+0x8c/0x170
May 07 19:36:41 ap kernel: __alloc_pages_slowpath+0x4b3/0xeb0
May 07 19:36:41 ap kernel: __alloc_pages_nodemask+0x2df/0x330
May 07 19:36:41 ap kernel: alloc_pages_vma+0x170/0x1c0
May 07 19:36:41 ap kernel: do_huge_pmd_anonymous_page+0x138/0x7b0
May 07 19:36:41 ap kernel: ? skcipher_walk_skcipher+0xb3/0xc0
May 07 19:36:41 ap kernel: __handle_mm_fault+0xd7d/0x1280
May 07 19:36:41 ap kernel: handle_mm_fault+0xe1/0x210
May 07 19:36:41 ap kernel: __do_page_fault+0x23c/0x4b0
May 07 19:36:41 ap kernel: do_page_fault+0x2e/0xe0
May 07 19:36:41 ap kernel: page_fault+0x1e/0x30
May 07 19:36:41 ap kernel: RIP: 0010:copy_user_enhanced_fast_string+0xe/0x20
May 07 19:36:41 ap kernel: Code: 89 d1 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 31 c0 0f 01 ca c3 0f 1f 80 00 00 00 00 0f 01 cb 83 fa 40 0f 82 70 ff ff ff 89 d1 <f3> a4 31 c0 0f 01 ca c3 66 2e 0f 1f 84 00 00 00 00
May 07 19:36:41 ap kernel: RSP: 0018:ffffb352435b3c48 EFLAGS: 00050206
May 07 19:36:41 ap kernel: RAX: 00007f29c9c01000 RBX: 0000000000001000 RCX: 0000000000001000
May 07 19:36:41 ap kernel: RDX: 0000000000001000 RSI: ffffa035c9faf000 RDI: 00007f29c9c00000
May 07 19:36:41 ap kernel: RBP: ffffb352435b3c50 R08: 0000000000028100 R09: ffffffff8123677d
May 07 19:36:41 ap kernel: R10: fffff9a5df7ea780 R11: 00000000ffffa030 R12: ffffa035c9faf000
May 07 19:36:41 ap kernel: R13: ffffb352435b3e18 R14: 0000000000001000 R15: ffffb352435b3e08
May 07 19:36:41 ap kernel: ? kzfree+0x2d/0x40
May 07 19:36:41 ap kernel: ? copyout+0x2a/0x30
May 07 19:36:41 ap kernel: copy_page_to_iter+0xc1/0x300
May 07 19:36:41 ap kernel: generic_file_buffered_read+0x447/0xb70
May 07 19:36:41 ap kernel: generic_file_read_iter+0xdf/0x150
May 07 19:36:41 ap kernel: ecryptfs_read_update_atime+0x16/0x40
May 07 19:36:41 ap kernel: new_sync_read+0x111/0x170
May 07 19:36:41 ap kernel: __vfs_read+0x29/0x40
May 07 19:36:41 ap kernel: vfs_read+0x99/0x160
May 07 19:36:41 ap kernel: ksys_pread64+0x66/0xa0
May 07 19:36:41 ap kernel: __x64_sys_pread64+0x1e/0x20
May 07 19:36:41 ap kernel: do_syscall_64+0x5a/0x110
May 07 19:36:41 ap kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 07 19:36:41 ap kernel: RIP: 0033:0x7f2a281d067f
May 07 19:36:41 ap kernel: Code: 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 18 e8 76 f3 ff ff 4d 89 ea 4c 89 e2 48 89 ee 41 89 c0 89 df b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8
May 07 19:36:41 ap kernel: RSP: 002b:00007f28c17f9520 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
May 07 19:36:41 ap kernel: RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007f2a281d067f
May 07 19:36:41 ap kernel: RDX: 0000000000020000 RSI: 00007f29c9bf6000 RDI: 0000000000000010
May 07 19:36:41 ap kernel: RBP: 00007f29c9bf6000 R08: 0000000000000000 R09: 00000000ffffffff
May 07 19:36:41 ap kernel: R10: 0000000058d30000 R11: 0000000000000293 R12: 0000000000020000
May 07 19:36:41 ap kernel: R13: 0000000058d30000 R14: 000055a23bfcd670 R15: 00007f28c17f9840
May 07 19:36:41 ap kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc pci_stub vboxpci(OE) vboxnetadp(O
May 07 19:36:41 ap kernel: snd_hda_core snd_hwdep snd_pcm iwlmvm mac80211 aesni_intel snd_seq_midi snd_seq_midi_event aes_x86_64 snd_rawmidi crypto_simd cryptd dell_laptop glue_helper ledtrig_audio dell_wmi int
May 07 19:36:41 ap kernel: syscopyarea sysfillrect nvme sysimgblt fb_sys_fops ahci rtsx_pci intel_lpss_pci i2c_hid ipmi_devintf drm i2c_i801 nvme_core libahci intel_lpss ipmi_msghandler hid pinctrl_cannonlake w
May 07 19:36:41 ap kernel: CR2: fffff9a5c3ff8030
May 07 19:36:41 ap kernel: ---[ end trace c66e94d82f6f955c ]---
May 07 19:36:41 ap kernel: RIP: 0010:compaction_alloc+0x589/0x890
May 07 19:36:41 ap kernel: Code: 7d b0 41 83 e6 1f 41 83 c6 01 4d 39 fc 73 7b e9 57 01 00 00 4d 89 e2 49 c1 e2 06 4c 03 15 57 b9 18 01 4d 89 d7 4d 85 ff 74 44 <41> 8b 47 30 25 80 00 00 f0 3d 00 00 00 f0 0f 84 fe
May 07 19:36:41 ap kernel: RSP: 0018:ffffb352435b34a0 EFLAGS: 00010286
May 07 19:36:41 ap kernel: RAX: ffffa03dfc7d5d00 RBX: ffffb352435b36a0 RCX: 000000000000003d
May 07 19:36:41 ap kernel: RDX: 80000000000ffe00 RSI: 0000000000000000 RDI: ffffa03dfc7cd1e0
May 07 19:36:41 ap kernel: RBP: ffffb352435b3538 R08: 0000000000000000 R09: ffffa03dfc7d5d00
May 07 19:36:41 ap kernel: R10: fffff9a5c3ff8000 R11: ffffb352435b3719 R12: 80000000000ffe00
May 07 19:36:41 ap kernel: R13: 8000000000100000 R14: 0000000000000020 R15: fffff9a5c3ff8000
May 07 19:36:41 ap kernel: FS: 00007f28c17fa700(0000) GS:ffffa03ddc3c0000(0000) knlGS:0000000000000000
May 07 19:36:41 ap kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 19:36:41 ap kernel: CR2: fffff9a5c3ff8030 CR3: 000000069e550003 CR4: 00000000003626e0
May 07 19:36:41 ap kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 07 19:36:41 ap kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 07 19:36:41 ap kernel: BUG: unable to handle kernel paging request at fffff9a5c2230000
May 07 19:36:41 ap kernel: #PF error: [normal kernel read fault]
May 07 19:36:41 ap kernel: PGD 85c9cc067 P4D 85c9cc067 PUD 85c9cb067 PMD 0
May 07 19:36:41 ap kernel: Oops: 0000 [#2] SMP PTI
May 07 19:36:41 ap kernel: CPU: 5 PID: 31511 Comm: CPU 0/KVM Tainted: P UD W OE 5.1.0-050100-generic #201905052130
May 07 19:36:41 ap kernel: Hardware name: Dell Inc. Precision 5530/0FP2W2, BIOS 1.10.1 04/26/2019
May 07 19:36:41 ap kernel: RIP: 0010:isolate_freepages_block+0xa4/0x370
May 07 19:36:41 ap kernel: Code: 31 e4 4c 89 c0 4c 89 45 a0 4c 8d 5f 79 49 89 d0 48 c1 e0 06 41 ba 01 00 00 00 4d 89 cf 48 89 45 a8 f6 c3 1f 0f 84 fa 00 00 00 <49> 8b 07 41 83 c4 01 a9 00 00 01 00 75 0c 49 8b 47
May 07 19:36:41 ap kernel: RSP: 0018:ffffb35243a7f260 EFLAGS: 00010246
May 07 19:36:41 ap kernel: RAX: 0000000000000000 RBX: 0000000000088c00 RCX: ffffb35243a7f569
May 07 19:36:41 ap kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffa03dfc7d5bc0
May 07 19:36:41 ap kernel: RBP: ffffb35243a7f2e0 R08: 0000000000088e00 R09: fffff9a5c2230000
May 07 19:36:41 ap kernel: R10: 0000000000000001 R11: ffffb35243a7f569 R12: 0000000000000000
May 07 19:36:41 ap kernel: R13: 0000000000000000 R14: ffffb35243a7f4f0 R15: fffff9a5c2230000
May 07 19:36:41 ap kernel: FS: 00007f2a2505e700(0000) GS:ffffa03ddc340000(0000) knlGS:0000000000000000
May 07 19:36:41 ap kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 19:36:41 ap kernel: CR2: fffff9a5c2230000 CR3: 000000069e550002 CR4: 00000000003626e0
May 07 19:36:41 ap kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 07 19:36:41 ap kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 07 19:36:41 ap kernel: Call Trace:
May 07 19:36:41 ap kernel: compaction_alloc+0x4d3/0x890
May 07 19:36:41 ap kernel: migrate_pages+0x107/0xb40
May 07 19:36:41 ap kernel: ? move_freelist_tail+0xd0/0xd0
May 07 19:36:41 ap kernel: ? isolate_freepages_block+0x370/0x370
May 07 19:36:41 ap kernel: compact_zone+0x752/0xd70
May 07 19:36:41 ap kernel: ? schedule+0x2c/0x70
May 07 19:36:41 ap kernel: compact_zone_order+0xd8/0x120
May 07 19:36:41 ap kernel: try_to_compact_pages+0xb0/0x260
May 07 19:36:41 ap kernel: __alloc_pages_direct_compact+0x8c/0x170
May 07 19:36:41 ap kernel: __alloc_pages_slowpath+0x4b3/0xeb0
May 07 19:36:41 ap kernel: ? page_counter_try_charge+0x5b/0xd0
May 07 19:36:41 ap kernel: __alloc_pages_nodemask+0x2df/0x330
May 07 19:36:41 ap kernel: alloc_pages_vma+0x170/0x1c0
May 07 19:36:41 ap kernel: do_huge_pmd_anonymous_page+0x138/0x7b0
May 07 19:36:41 ap kernel: __handle_mm_fault+0xd7d/0x1280
May 07 19:36:41 ap kernel: ? kvm_vcpu_gfn_to_hva_prot+0x25/0x30 [kvm]
May 07 19:36:41 ap kernel: handle_mm_fault+0xe1/0x210
May 07 19:36:41 ap kernel: __get_user_pages+0x248/0x720
May 07 19:36:41 ap kernel: get_user_pages_unlocked+0x137/0x1b0
May 07 19:36:41 ap kernel: __gfn_to_pfn_memslot+0x12a/0x400 [kvm]
May 07 19:36:41 ap kernel: ? ttwu_do_wakeup+0x1e/0x140
May 07 19:36:41 ap kernel: try_async_pf+0x89/0x250 [kvm]
May 07 19:36:41 ap kernel: tdp_page_fault+0x13c/0x280 [kvm]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: kvm_mmu_page_fault+0x75/0x640 [kvm]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0x1b/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: ? vmx_vmexit+0xf/0x30 [kvm_intel]
May 07 19:36:41 ap kernel: handle_ept_violation+0xcb/0x220 [kvm_intel]
May 07 19:36:41 ap kernel: vmx_handle_exit+0xa7/0x7d0 [kvm_intel]
May 07 19:36:41 ap kernel: vcpu_enter_guest+0x2c7/0x14e0 [kvm]
May 07 19:36:41 ap kernel: kvm_arch_vcpu_ioctl_run+0xd1/0x570 [kvm]
May 07 19:36:41 ap kernel: kvm_vcpu_ioctl+0x24b/0x610 [kvm]
May 07 19:36:41 ap kernel: ? __seccomp_filter+0x7e/0x6d0
May 07 19:36:41 ap kernel: do_vfs_ioctl+0xa9/0x640
May 07 19:36:41 ap kernel: ? __secure_computing+0x3e/0xd0
May 07 19:36:41 ap kernel: ksys_ioctl+0x67/0x90
May 07 19:36:41 ap kernel: __x64_sys_ioctl+0x1a/0x20
May 07 19:36:41 ap kernel: do_syscall_64+0x5a/0x110
May 07 19:36:41 ap kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 07 19:36:41 ap kernel: RIP: 0033:0x7f2a280e5417
May 07 19:36:41 ap kernel: Code: 00 00 90 48 8b 05 79 0a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 49 0a 0d 00 f7
May 07 19:36:41 ap kernel: RSP: 002b:00007f2a2505d578 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 07 19:36:41 ap kernel: RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f2a280e5417
May 07 19:36:41 ap kernel: RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000011
May 07 19:36:41 ap kernel: RBP: 0000000000000000 R08: 000055a239c155b0 R09: 00000000ffffffff
May 07 19:36:41 ap kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 000055a23c158050
May 07 19:36:41 ap kernel: R13: 00007f2a25b53000 R14: 0000000000000000 R15: 000055a23c158050
May 07 19:36:41 ap kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc pci_stub vboxpci(OE) vboxnetadp(O
May 07 19:36:41 ap kernel: snd_hda_core snd_hwdep snd_pcm iwlmvm mac80211 aesni_intel snd_seq_midi snd_seq_midi_event aes_x86_64 snd_rawmidi crypto_simd cryptd dell_laptop glue_helper ledtrig_audio dell_wmi int
May 07 19:36:41 ap kernel: syscopyarea sysfillrect nvme sysimgblt fb_sys_fops ahci rtsx_pci intel_lpss_pci i2c_hid ipmi_devintf drm i2c_i801 nvme_core libahci intel_lpss ipmi_msghandler hid pinctrl_cannonlake w
May 07 19:36:41 ap kernel: CR2: fffff9a5c2230000
May 07 19:36:41 ap kernel: ---[ end trace c66e94d82f6f955d ]---
May 07 19:36:41 ap kernel: RIP: 0010:compaction_alloc+0x589/0x890
May 07 19:36:41 ap kernel: Code: 7d b0 41 83 e6 1f 41 83 c6 01 4d 39 fc 73 7b e9 57 01 00 00 4d 89 e2 49 c1 e2 06 4c 03 15 57 b9 18 01 4d 89 d7 4d 85 ff 74 44 <41> 8b 47 30 25 80 00 00 f0 3d 00 00 00 f0 0f 84 fe
May 07 19:36:41 ap kernel: RSP: 0018:ffffb352435b34a0 EFLAGS: 00010286
May 07 19:36:41 ap kernel: RAX: ffffa03dfc7d5d00 RBX: ffffb352435b36a0 RCX: 000000000000003d
May 07 19:36:41 ap kernel: RDX: 80000000000ffe00 RSI: 0000000000000000 RDI: ffffa03dfc7cd1e0
May 07 19:36:41 ap kernel: RBP: ffffb352435b3538 R08: 0000000000000000 R09: ffffa03dfc7d5d00
May 07 19:36:41 ap kernel: R10: fffff9a5c3ff8000 R11: ffffb352435b3719 R12: 80000000000ffe00
May 07 19:36:41 ap kernel: R13: 8000000000100000 R14: 0000000000000020 R15: fffff9a5c3ff8000
May 07 19:36:41 ap kernel: FS: 00007f2a2505e700(0000) GS:ffffa03ddc340000(0000) knlGS:0000000000000000
May 07 19:36:41 ap kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 19:36:41 ap kernel: CR2: fffff9a5c2230000 CR3: 000000069e550002 CR4: 00000000003626e0
May 07 19:36:41 ap kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 07 19:36:41 ap kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
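
For triage, the NVRM lines in the journal excerpt above can be tallied by Xid code. The following is a minimal Python sketch and not part of the original report: the function name `count_xids` and the regular expression are assumptions made for illustration, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ..."
# The pattern is an illustrative assumption based on the lines in this report.
XID_RE = re.compile(r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<code>\d+),")


def count_xids(lines):
    """Return a Counter mapping each Xid code (int) to its number of occurrences."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[int(match.group("code"))] += 1
    return counts


# Sample lines copied from the journal excerpt in this report.
sample = [
    "May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: Class 0x0 Subchannel 0x0 Mismatch",
    "May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 32, Channel ID 00000008 intr 02000000",
    "May 07 19:35:52 ap kernel: NVRM: Xid (PCI:0000:01:00): 31, Ch 00000009, intr 50000000. MMU Fault: ENGINE CE0 HUBCLIENT_CE0 faulted @ 0x1_005a0000. Fault is of type FAULT_PTE ACCESS_TYPE_READ",
    "May 07 19:36:21 ap libvirtd[1863]: internal error: connection closed due to keepalive timeout",
]

if __name__ == "__main__":
    for code, n in sorted(count_xids(sample).items()):
        print(f"Xid {code}: {n} line(s)")
```

On a live system the same function could be fed the output of `journalctl -k`, split into lines.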

ProblemType: Bug
DistroRelease: Ubuntu 19.04
Package: qemu (not installed)
Uname: Linux 5.1.0-050100-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.10-0ubuntu27
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Tue May 7 20:08:12 2019
EcryptfsInUse: Yes
InstallationDate: Installed on 2019-02-20 (76 days ago)
InstallationMedia: Ubuntu 18.04 "Bionic" - Build amd64 LIVE Binary 20180608-09:38
KvmCmdLine: COMMAND STAT EUID RUID PID PPID %CPU COMMAND
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 002: ID 8087:0025 Intel Corp.
 Bus 001 Device 003: ID 0c45:671d Microdia
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Dell Inc. Precision 5530
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.1.0-050100-generic root=UUID=18e8777c-1764-41e4-a19f-62476055de23 ro mem_sleep_default=deep mem_sleep_default=deep acpi_rev_override=1 scsi_mod.use_blk_mq=1 nouveau.modeset=0 nouveau.runpm=0 nouveau.blacklist=1 acpi_backlight=none acpi_osi=Linux acpi_osi=!
SourcePackage: qemu
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/26/2019
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.10.1
dmi.board.name: 0FP2W2
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 10
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.10.1:bd04/26/2019:svnDellInc.:pnPrecision5530:pvr:rvnDellInc.:rn0FP2W2:rvrA00:cvnDellInc.:ct10:cvr:
dmi.product.family: Precision
dmi.product.name: Precision 5530
dmi.product.sku: 087D
dmi.sys.vendor: Dell Inc.

Dan Streetman (ddstreet) wrote :

> Uname: Linux 5.1.0-050100-generic x86_64

We're not quite at that kernel yet; where did you install this kernel from?

> NonfreeKernelModules: nvidia_modeset nvidia

The log seems to indicate a kernel crash/bug in the nvidia module; where did you get that module from?

Changed in qemu (Ubuntu):
status: New → Incomplete
Christian Ehrhardt  (paelzer) wrote :

This version number reminds me of https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.1/

This also seems to be under serious memory pressure, given direct_compact and slowpath in the trace. Does it occur only under certain conditions, e.g. memory pressure and some particular config, or is it reliably reproducible?
If it is, does it reproduce on a stock Ubuntu kernel as well?
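
The frames mentioned here can be picked out of the journal text mechanically. The sketch below is an illustration only, under stated assumptions: the helper name `summarize_oopses`, the regular expressions, and the list of watched frames are made up for this example, and the sample lines are copied from the first oops in the bug description. Finding these frames shows that the allocator entered direct compaction while handling a huge-page fault; it does not by itself establish how much memory was free.

```python
import re

# Patterns are illustrative assumptions based on the oops text in this report.
BUG_RE = re.compile(r"BUG: unable to handle kernel paging request at ([0-9a-f]+)")
COMM_RE = re.compile(r"Comm: (.+?) Tainted:")
RIP_RE = re.compile(r"RIP: 0010:(\w+)\+0x")

# Frames of interest: the two named in the comment above, plus the
# huge-page fault handler that appears in both call traces.
WATCHED_FRAMES = (
    "__alloc_pages_direct_compact",
    "__alloc_pages_slowpath",
    "do_huge_pmd_anonymous_page",
)


def summarize_oopses(lines):
    """Return one dict per 'BUG:' line with address, task name, faulting
    function, and the set of watched frames seen before the next 'BUG:'."""
    oopses = []
    current = None
    for line in lines:
        bug = BUG_RE.search(line)
        if bug:
            current = {"address": bug.group(1), "comm": None, "rip": None, "frames": set()}
            oopses.append(current)
            continue
        if current is None:
            continue
        comm = COMM_RE.search(line)
        if comm and current["comm"] is None:
            current["comm"] = comm.group(1)
        rip = RIP_RE.search(line)
        if rip and current["rip"] is None:
            # Only the first kernel-mode RIP after 'BUG:' is the faulting one.
            current["rip"] = rip.group(1)
        for frame in WATCHED_FRAMES:
            if frame + "+0x" in line:
                current["frames"].add(frame)
    return oopses


# Sample lines copied from the first oops in the bug description.
sample = [
    "May 07 19:36:41 ap kernel: BUG: unable to handle kernel paging request at fffff9a5c3ff8030",
    "May 07 19:36:41 ap kernel: CPU: 7 PID: 31632 Comm: worker Tainted: P U W OE 5.1.0-050100-generic #201905052130",
    "May 07 19:36:41 ap kernel: RIP: 0010:compaction_alloc+0x589/0x890",
    "May 07 19:36:41 ap kernel: __alloc_pages_direct_compact+0x8c/0x170",
    "May 07 19:36:41 ap kernel: __alloc_pages_slowpath+0x4b3/0xeb0",
    "May 07 19:36:41 ap kernel: do_huge_pmd_anonymous_page+0x138/0x7b0",
    "May 07 19:36:41 ap kernel: RIP: 0010:copy_user_enhanced_fast_string+0xe/0x20",
]

if __name__ == "__main__":
    for oops in summarize_oopses(sample):
        print(oops["comm"], oops["rip"], oops["address"], sorted(oops["frames"]))
```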

Avi (ikes73) wrote :

1. Installed latest kernel using UKUU, but had the same issue on multiple earlier kernels. I can reproduce on stock.

2. Have 32 GB of RAM and it isn't close to being used up; the bug occurs even when all other programs are closed.

3. Installed drivers with `sudo apt install nvidia-driver-430 xserver-xorg-video-nvidia-430 libnvidia-cfg1-430`, from ppa https://launchpad.net/~graphics-drivers/+archive/ubuntu/
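
The claim in point 2 could be backed with numbers from `/proc/meminfo` captured shortly before starting the guest. The following is a minimal sketch, assuming the usual `Name:   value kB` layout of that file; the helper names `parse_meminfo` and `available_fraction` are hypothetical, and the sample values are invented for illustration rather than taken from the reporter's machine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_fraction(text):
    """Return MemAvailable / MemTotal as a float between 0 and 1."""
    info = parse_meminfo(text)
    return info["MemAvailable"] / info["MemTotal"]


# Illustrative values only; NOT measured on the reporter's system.
sample = """MemTotal:       32768000 kB
MemFree:        20000000 kB
MemAvailable:   24576000 kB
"""

if __name__ == "__main__":
    print(f"available: {available_fraction(sample):.0%}")
```

To use it on a real system, pass the contents of `/proc/meminfo`, for example `available_fraction(open("/proc/meminfo").read())`.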

Avi (ikes73) wrote :

After reproducing it a few times, it looks like the kernel crash doesn't happen every time, but the NVIDIA crash does.

Here's another crash:

May 08 23:22:25 ap audit[7432]: AVC apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-12290279-de75-4f25-893f-8c4303a00723" pid=7432 comm="apparmor_parser"
May 08 23:22:25 ap kernel: audit: type=1400 audit(1557372145.668:79): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-12290279-de75-4f25-893f-8c4303a00723" pid=7432 comm="apparmor_parser"
May 08 23:22:27 ap guake.desktop[3131]: Showing the terminal
May 08 23:22:27 ap avahi-daemon[1358]: Joining mDNS multicast group on interface vnet0.IPv6 with address fe80::fc54:ff:feee:4e6a.
May 08 23:22:27 ap avahi-daemon[1358]: New relevant interface vnet0.IPv6 for mDNS.
May 08 23:22:27 ap avahi-daemon[1358]: Registering new address record for fe80::fc54:ff:feee:4e6a on vnet0.*.
May 08 23:22:27 ap kernel: virbr0: port 2(vnet0) entered learning state
May 08 23:22:29 ap kernel: virbr0: port 2(vnet0) entered forwarding state
May 08 23:22:29 ap kernel: virbr0: topology change detected, propagating
May 08 23:22:29 ap NetworkManager[1391]: <info> [1557372149.6019] device (virbr0): carrier: link connected
May 08 23:22:35 ap guake.desktop[3131]: Hiding on focus lose
May 08 23:22:36 ap systemd[1]: NetworkManager-dispatcher.service: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- The unit NetworkManager-dispatcher.service has successfully entered the 'dead' state.
May 08 23:22:41 ap dnsmasq-dhcp[7254]: DHCPREQUEST(virbr0) 192.168.122.43 52:54:00:ee:4e:6a
May 08 23:22:41 ap dnsmasq-dhcp[7254]: DHCPACK(virbr0) 192.168.122.43 52:54:00:ee:4e:6a osboxes
May 08 23:22:46 ap /usr/lib/gdm3/gdm-x-session[2558]: (EE) NVIDIA(0): The NVIDIA X driver has encountered an error; attempting to
May 08 23:22:46 ap /usr/lib/gdm3/gdm-x-session[2558]: (EE) NVIDIA(0): recover...
May 08 23:22:46 ap kernel: NVRM: GPU at PCI:0000:01:00: GPU-d9fbb72e-29cb-d4db-ad8f-af242c1a6c15
May 08 23:22:46 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: Class 0x0 Subchannel 0x0 Mismatch
May 08 23:22:46 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x4041b0=0x200000
May 08 23:22:46 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x404000=0x80000002
May 08 23:22:46 ap kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 0008, Class 0000902d, Offset 00000860, Data 00000000
May 08 23:22:46 ap kernel: NVRM: Xid (PCI:0000:01:00): 31, Ch 00000008, intr 50000000. MMU Fault: ENGINE HOST0 HUBCLIENT_HOST_CPU faulted @ 0xb_d00b0000. Fault is of type FAULT_PDE ACCESS_TYPE_READ
May 08 23:23:05 ap systemd-logind[1337]: Power key pressed.
-- Reboot --

Avi (ikes73) wrote :

I've confirmed qemu works if I select intel instead of nvidia using prime-select.

Christian Ehrhardt  (paelzer) wrote :

So I guess we should file the bug against nvidia-driver-430 then?

Christian Ehrhardt  (paelzer) wrote :

Since 430 is still from a PPA (https://launchpad.net/~graphics-drivers/+archive/ubuntu/), I used 418 as the closest one.

Do you hit the same with 418?

Avi (ikes73) wrote :

Yes, had the same issue on 418.

post-factum (post-factum) wrote :

I don't think this is nvidia-related. I see the same on Arch with only an Intel card.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-418 (Ubuntu):
status: New → Confirmed
post-factum (post-factum) wrote :

Please check whether this patch fixes the issue: [1]

[1] https://lore.<email address hidden>/
