Comment 11 for bug 1837810

Revision history for this message
Matthew Ruffell (mruffell) wrote :

Verification steps for focal:

Again, I made sure I can reproduce on the existing 5.4.0-42-generic kernel.

I copied ksm_refcnt_overflow.sh and zero_page_refcount.c to the VM, and built the kernel module, and inserted it into the kernel:

$ sudo insmod zero_page_refcount.ko
$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1 or 1

From there, I started running the ksm_refcnt_script.sh in another terminal. I checked to ensure VMs were running:

$ virsh list
 Id Name State
----------------------------
 1 instance-0 running
 2 instance-1 running
 3 instance-2 running

From there, we can see the reference counter increment:

$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1bd9 or 7129
$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1f9e or 8094
$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1fb0 or 8112

From there, I set the reference counter in an attempt to make it overflow:

$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x7fffff15 or 2147483413
$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x80000000 or -2147483648

From there, all vms became paused:

$ virsh list
 Id Name State
----------------------------
 137 instance-0 paused
 138 instance-1 paused
 139 instance-2 paused

We see the following oops in dmesg:

https://paste.ubuntu.com/p/3Dc73k9VYy/

I then rebooted the machine, enabled -proposed and installed 5.4.0-46-generic.

$ uname -rv
5.4.0-46-generic #50-Ubuntu SMP Fri Aug 28 15:33:36 UTC 2020

I rebooted, and built a new kernel module with the new headers, and inserted it into the running kernel:

$ sudo insmod zero_page_refcount.ko
[sudo] password for ubuntu:
ubuntu@ubuntu:~/module$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1 or 1

Again, I started the ksm_refcnt_overflow.sh script in another terminal,
and checked to see that VMs were being created:

$ virsh list
 Id Name State
----------------------------
 1 instance-0 running
 2 instance-1 running

When we check the value of the reference counter, it is still 1 and not incrementing:

$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1 or 1
$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x1 or 1

When I attempt to trigger overflow:

$ cat /proc/zero_page_refcount_set
Zero Page Refcount set to 0x1FFFFFFFFF000

$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x7fffff00 or 2147483392
$ cat /proc/zero_page_refcount
Zero Page Refcount: 0x7fffff00 or 2147483392

We never overflow. The problem is fixed. Marking the bug as verified for focal.