Kernel BUG (null ptr dereference) with linux-image-5.19.0-42-generic (HWE kernel on 22.04)

Bug #2020757 reported by Arvid Norlander
This bug affects 6 people
Affects: linux-signed-hwe-5.19 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

I use Ubuntu 22.04 in VMware Player 17.0.2 (Windows host, due to corporate IT...). I installed the HWE kernel because I have an Intel CPU with P/E-cores. 3D acceleration is enabled in VMware.

With linux-image-5.19.0-41-generic it is rock solid. With linux-image-5.19.0-42-generic it invariably crashes after anything between a minute and an hour.

After rebooting, journalctl showed that X.Org had crashed, followed by a kernel BUG/Oops due to a NULL pointer dereference in drm_gem_object_release_handle. A journalctl log covering the span from the Xorg crash to the kernel Oops is attached to this report.

This issue happened twice with this kernel. In addition, while trying to reproduce it with kdump-tools installed, I hit a different issue: the log showed "refcount_t: saturated; leaking memory.", after which the OOM killer killed everything on the system, including (based on what was printed on the terminal) systemd-udevd and journald. As can be expected, the OOM kills never made it into the journal. I have attached what I could get from the journal for this case as well.

Launching Firefox often triggered the bug, but not every time: once the system survived the launch and only crashed much later.

Unfortunately I could not get ubuntu-bug to work for this case. I tried "ubuntu-bug --package linux-image-5.19.0-42-generic", but it did absolutely nothing. I'm happy to provide any additional information you require.

$ lsb_release -rd
Description: Ubuntu 22.04.2 LTS
Release: 22.04
$ apt-cache policy linux-generic-hwe-22.04
linux-generic-hwe-22.04:
  Installed: 5.19.0.42.43~22.04.15
  Candidate: 5.19.0.42.43~22.04.15
  Version table:
 *** 5.19.0.42.43~22.04.15 500
        500 http://se.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages
        100 /var/lib/dpkg/status
     5.15.0.25.27 500
        500 http://se.archive.ubuntu.com/ubuntu jammy/main amd64 Packages
$ apt-cache policy linux-image-5.19.0-42-generic
linux-image-5.19.0-42-generic:
  Installed: 5.19.0-42.43~22.04.1
  Candidate: 5.19.0-42.43~22.04.1
  Version table:
 *** 5.19.0-42.43~22.04.1 500
        500 http://se.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages
        100 /var/lib/dpkg/status

What I expected to happen: No crash.
What happened instead: The crash in the attached log.

Arvid Norlander (vorpalblade) wrote :

Attached refcount / OOM log

description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-signed-hwe-5.19 (Ubuntu):
status: New → Confirmed
Mateusz Mikuła (mati865) wrote :

The same issue with VirtualBox running on Windows (stupid corporate requirements).

Joe (joejgarcia) wrote :

It's affecting me as well. Had to roll back to 5.19.0-41

James Byrne (jbyrne-ffe) wrote :

Same issue for me using VMware Player 17 with 5.19.0-42 and 5.19.0-43. I have rolled back to 5.19.0-41.

Erik Larsson (catacombae) wrote :

Confirmed with 5.19.0-46.47~22.04.1 running on VMware Fusion 13.0.2.

Vlad (vlad2017) wrote :

I have the same issue with a Kubuntu VM running on VMware Workstation 17.0.2.

In the case of Kubuntu (Ubuntu with the KDE desktop), the error appears immediately after login, so the VM hangs or crashes right after login (black screen).

As a workaround I have to boot the 5.19.0-41 kernel. All newer kernels, including 5.19.0-50, hang or crash with the following messages:
Aug 1 10:52:58 vladimir-vm kernel: [ 0.000000] Linux version 5.19.0-50-generic (buildd@lcy02-amd64-030) (x86_64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #50-Ubuntu SMP PREEMPT_DYNAMIC Mon Jul 10 18:24:29 UTC 2023 (Ubuntu 5.19.0-50.50-generic 5.19.17)
--
Aug 1 10:53:02 vladimir-vm kernel: [ 5.531458] ------------[ cut here ]------------
Aug 1 10:53:02 vladimir-vm kernel: [ 5.531460] refcount_t: addition on 0; use-after-free.
Aug 1 10:53:02 vladimir-vm kernel: [ 5.531466] WARNING: CPU: 1 PID: 1827 at lib/refcount.c:25 refcount_warn_saturate+0xa3/0x150

Mike Bressem (mbr-75) wrote :

I had the same issue with kernel >= 5.19.0-42 (VMware Workstation 17.0.2).

Seems to be fixed in current 6.2.0-26... no problems so far.

Vlad (vlad2017) wrote :

Indeed, with the recent kernel update to 6.2.0-26 my VM starts and runs without any problems.

However, there is still a warning on each system start:
WARNING: CPU: 2 PID: 1374 at lib/refcount.c:28 refcount_warn_saturate+0xfb/0x150

But now it is logged only once. With the previous kernels (5.19.0-42 to 5.19.0-50) the warning was logged three times.
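
The kernel versions reported in this thread can be condensed into a small lookup. The following is a minimal sketch in POSIX shell, not an official list of affected versions: the function name kernel_status and the labels reported-good, reported-bad and unknown are made up for illustration, and the ranges only reflect what commenters here reported (5.19.0-41 stable, 5.19.0-42 through 5.19.0-50 crashing, 6.2.0-26 working).

```shell
#!/bin/sh
# Classify a kernel release string using only the reports in this bug thread.
# Assumption: this mapping is hand-written from the comments, not from a changelog.
kernel_status() {
    # POSIX sh has no "local", so these variables are global to the script.
    release=$1
    case $release in
        5.19.0-[0-9]*)
            abi=${release#5.19.0-}   # strip the "5.19.0-" prefix
            abi=${abi%%[!0-9]*}      # keep only the leading digits (the ABI number)
            if [ "$abi" -eq 41 ]; then
                echo "reported-good"
            elif [ "$abi" -ge 42 ] && [ "$abi" -le 50 ]; then
                echo "reported-bad"
            else
                echo "unknown"
            fi
            ;;
        6.2.0-26|6.2.0-26-*)
            echo "reported-good"
            ;;
        *)
            echo "unknown"
            ;;
    esac
}

# Classify the kernel that is currently running.
kernel_status "$(uname -r)"
```

Anything not mentioned in this thread, including 5.19.0 builds newer than -50, is classified as unknown rather than guessed.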
