Xen 32bit dom0 on 64bit hypervisor: bad page flags

Bug #1576564 reported by Stefan Bader
This bug affects 1 person
Affects           Status        Importance  Assigned to  Milestone
linux (Ubuntu)    Confirmed     High        Unassigned
  Wily            Fix Released  High        Unassigned
  Xenial          Fix Released  High        Unassigned
xen (Ubuntu)      Invalid       High        Unassigned
  Wily            Invalid       Undecided   Unassigned
  Xenial          Invalid       Undecided   Unassigned

Bug Description

This problem is a combination of running certain versions of a 32-bit Linux kernel as dom0 on certain versions of a 64-bit Xen hypervisor, together with certain memory clamping settings (dom0_mem=xM, without setting the max limit).

Xen 4.4.2 + Linux 3.13.x
Xen 4.5.0 + linux 3.19.x
Xen 4.6.0 + linux 4.0.x
Xen 4.6.0 + linux 4.1.x
 -> all boot without messages
Xen 4.5.1 + Linux 4.2.x
Xen 4.6.0 + Linux 4.2.x
Xen 4.6.0 + Linux 4.3.x
 * dom0_mem 512M, 4096M, or unlimited
   -> boot without messages
 * dom0_mem between 1024M and 3072M (inclusive)
   -> bad page messages (but finishes boot)
Xen 4.6.0 + Linux 4.4.x
Xen 4.6.0 + Linux 4.5.x
Xen 4.6.0 + Linux 4.6-rc6
 The boot for 512M, 4096M, and unlimited looks good as well. Though trying to
 start a domU without dom0_mem set caused a crash when ballooning (but I
 think this should be a separate bug).
 Using a dom0_mem range between 1G and 3G still produces the bad page flags
 messages and additionally panics + reboots.

The bad page bug generally looks like this (the pfn numbers seem to be towards the end of the allocated range):

[ 8.980150] BUG: Bad page state in process swapper/0 pfn:7fc22
[ 8.980238] page:f4566550 count:0 mapcount:0 mapping: (null) index:0x0
[ 8.980328] flags: 0x7000400(reserved)
[ 8.980486] page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
[ 8.980575] bad because of flags:
[ 8.980688] flags: 0x400(reserved)
[ 8.980844] Modules linked in:
[ 8.980960] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 4.2.0-19-generic #23-Ubuntu
[ 8.981084] Hardware name: Supermicro H8SGL/H8SGL, BIOS 3.0 08/31/2012
[ 8.981177] c1a649a7 23e07668 00000000 e9cafce4 c175e501 f4566550 e9cafd08 c1166897
[ 8.981608] c19750a4 e9d183ec 0007fc22 007fffff c1975630 c1978e86 00000001 e9cafd74
[ 8.982074] c1169f83 00000002 00000141 0004a872 c1af3644 00000000 ee44bce4 ee44bce4
[ 8.982506] Call Trace:
[ 8.982582] [<c175e501>] dump_stack+0x41/0x52
[ 8.982666] [<c1166897>] bad_page+0xb7/0x110
[ 8.982749] [<c1169f83>] get_page_from_freelist+0x2d3/0x610
[ 8.982838] [<c116a4f3>] __alloc_pages_nodemask+0x153/0x910
[ 8.982926] [<c122ee62>] ? find_entry.isra.13+0x52/0x90
[ 8.983013] [<c11b0f75>] ? kmem_cache_alloc_trace+0x175/0x1e0
[ 8.983102] [<c10b1c96>] ? __raw_callee_save___pv_queued_spin_unlock+0x6/0x10
[ 8.983223] [<c11b0ddd>] ? __kmalloc+0x21d/0x240
[ 8.983308] [<c119cc2e>] __vmalloc_node_range+0x10e/0x210
[ 8.983433] [<c1148fa7>] ? bpf_prog_alloc+0x37/0xa0
[ 8.983518] [<c119cd96>] __vmalloc_node+0x66/0x70
[ 8.983604] [<c1148fa7>] ? bpf_prog_alloc+0x37/0xa0
[ 8.983689] [<c119cdd4>] __vmalloc+0x34/0x40
[ 8.983773] [<c1148fa7>] ? bpf_prog_alloc+0x37/0xa0
[ 8.983859] [<c1148fa7>] bpf_prog_alloc+0x37/0xa0
[ 8.983944] [<c167cc8c>] bpf_prog_create+0x2c/0x90
[ 8.984034] [<c1b6741e>] ? bsp_pm_check_init+0x11/0x11
[ 8.984121] [<c1b68401>] ptp_classifier_init+0x2b/0x44
[ 8.984207] [<c1b6749a>] sock_init+0x7c/0x83
[ 8.984291] [<c100211a>] do_one_initcall+0xaa/0x200
[ 8.984376] [<c1b6741e>] ? bsp_pm_check_init+0x11/0x11
[ 8.984463] [<c1b1654c>] ? repair_env_string+0x12/0x54
[ 8.984551] [<c1b16cf6>] ? kernel_init_freeable+0x126/0x1d9
[ 8.984726] [<c1755fb0>] kernel_init+0x10/0xe0
[ 8.984846] [<c10929b1>] ? schedule_tail+0x11/0x50
[ 8.984932] [<c1764141>] ret_from_kernel_thread+0x21/0x30
[ 8.985019] [<c1755fa0>] ? rest_init+0x70/0x70

break-fix: 92923ca3aacef63c92dc297a75ad0c6dfe4eab37 4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5

Revision history for this message
Stefan Bader (smb) wrote :

I marked this as affecting both Xen and the Linux kernel because of the interaction between the two. I am not even sure whether the panic/reboot is exactly due to the bad page problem or something else. On the other hand, setting dom0_mem to certain values avoids that panic, so maybe it is just leading to more serious issues with newer kernels.

Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Stefan Bader (smb)
Revision history for this message
Stefan Bader (smb) wrote :

Reformatted description and mention that the panic happens on newer kernels, too (tested with 4.5.2 and 4.6-rc5).

description: updated
description: updated
Revision history for this message
Stefan Bader (smb) wrote :

Today I had one lucky case (but was unable to repeat it) of booting a 4.4 kernel with 1024M dom0 memory. That was after cold booting into 3.19 with dom0_mem=1024M,max:1024M, which was ok, and then rebooting into 4.4 with the same argument. But no matter what I tried later, this seemed to be a one-time lucky thing.

description: updated
description: updated
Revision history for this message
Stefan Bader (smb) wrote :

For reference some memory info from an Intel box with 8G physical memory, booted with 1024M dom0 memory on a 4.2 kernel. The range of bad page pfn from syslog was 3fc1e to 3fc3b.
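As a sanity check, the reported pfn range can be converted to physical addresses to confirm it sits just below the 1024M dom0 boundary. This is a small illustrative calculation (not part of the original report); the `pfn_to_addr` helper and the constants are mine, derived from the numbers above:

```python
# Convert the bad-page pfns from syslog into physical addresses and
# compare them against the 1024M dom0 allocation boundary.
PAGE_SIZE = 4096

def pfn_to_addr(pfn: int) -> int:
    """Physical byte address of a page frame number (4K pages)."""
    return pfn * PAGE_SIZE

dom0_mem = 1024 * 1024 * 1024      # dom0_mem=1024M
last_pfn = dom0_mem // PAGE_SIZE   # first pfn past the allocation: 0x40000

bad_first, bad_last = 0x3fc1e, 0x3fc3b  # range observed in syslog

print(hex(last_pfn))                                    # 0x40000
print((last_pfn - bad_first) * PAGE_SIZE // (1 << 20))  # 3 (MiB below the top)
```

So the bad pages sit roughly 3 MiB below the top of the 1024M allocation, matching the observation that the pfns are towards the end of the allocated range.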

(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 000000000009a400 (usable)
(XEN) 000000000009a400 - 00000000000a0000 (reserved)
(XEN) 00000000000e0000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 0000000030a48000 (usable)
(XEN) 0000000030a48000 - 0000000030a49000 (reserved)
(XEN) 0000000030a49000 - 00000000a27f4000 (usable)
(XEN) 00000000a27f4000 - 00000000a2ab4000 (reserved)
(XEN) 00000000a2ab4000 - 00000000a2fb4000 (ACPI NVS)
(XEN) 00000000a2fb4000 - 00000000a2feb000 (ACPI data)
(XEN) 00000000a2feb000 - 00000000a3000000 (usable)
(XEN) 00000000a3000000 - 00000000afa00000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fec00000 - 00000000fec01000 (reserved)
(XEN) 00000000fed00000 - 00000000fed04000 (reserved)
(XEN) 00000000fed10000 - 00000000fed1a000 (reserved)
(XEN) 00000000fed1c000 - 00000000fed20000 (reserved)
(XEN) 00000000fed84000 - 00000000fed85000 (reserved)
(XEN) 00000000fee00000 - 00000000fee01000 (reserved)
(XEN) 00000000ffc00000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 000000024e600000 (usable)
...
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 32-bit, PAE, lsb, paddr 0x1000000 -> 0x1ecc000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 0000000240000000->0000000244000000 (238639 pages to be allocated)
(XEN) Init. ramdisk: 000000024ca2f000->000000024e5ffbd4
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: 00000000c1000000->00000000c1ecc000
(XEN) Init. ramdisk: 0000000000000000->0000000000000000
(XEN) Phys-Mach map: 00000000c1ecc000->00000000c1fcc000
(XEN) Start info: 00000000c1fcc000->00000000c1fcc4b4
(XEN) Page tables: 00000000c1fcd000->00000000c1fe4000
(XEN) Boot stack: 00000000c1fe4000->00000000c1fe5000
(XEN) TOTAL: 00000000c0000000->00000000c2400000
(XEN) ENTRY ADDRESS: 00000000c1ae8254

Revision history for this message
Stefan Bader (smb) wrote :

Darn, I was chasing my own stupidity. It seems that when I added the memory limit I got the syntax wrong (which seems to cause fewer issues for 64-bit than it does for 32-bit). I was using "dom0_mem=xM:max=xM", which is wrong. The correct syntax is "dom0_mem=xM,max:xM". The most likely result of the bad syntax is that dom0 gets xM of memory active but the full memory as possible spare.
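For reference, the correct option goes on the Xen (not the Linux) command line. A minimal sketch for a GRUB-based Ubuntu setup; the file path and variable name are assumptions and vary by release, so check your distro's Xen packaging:

```shell
# /etc/default/grub.d/xen.cfg (path is an assumption; adjust to your setup)
# Comma-separated options; "max:" is a prefix, not a key=value pair.
GRUB_CMDLINE_XEN="dom0_mem=1024M,max:1024M"
# Wrong (silently misparsed): dom0_mem=1024M:max=1024M

# Then regenerate the boot configuration:
# sudo update-grub
```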

Stefan Bader (smb)
description: updated
Revision history for this message
Stefan Bader (smb) wrote :

While not restricting the maximum dom0 memory may not be the optimal configuration, it would still be a valid one. So I tested more kernel versions. The bad page bug messages started to appear with kernel 4.2, and the panic starts with 4.4.

Using sync_console and minicom capture mode, I finally got a complete log of the boot messages using a 4.6-rc6 kernel.

Revision history for this message
Stefan Bader (smb) wrote :

Bisected the problem and found that the first bad commit is:

92923ca "mm: meminit: only set page reserved in the memblock region"

Further debugging showed that the problem is due to the arguments of the new reserve_bootmem_region() function. Those are start and end addresses of memory ranges. With PAE there can be ranges above 4G even for 32bit i386. Which is just what happens if dom0 memory is initially limited but dom0 is allowed to balloon for more memory.

The patch below fixes the bad page errors for me on 4.2 and 4.4 (and resolves the crash on 4.4 as well).

Changed in xen (Ubuntu):
status: Triaged → Invalid
Revision history for this message
Stefan Bader (smb) wrote :

Marked Xen as invalid as this is caused by the dom0 kernel not Xen.

tags: added: patch
Stefan Bader (smb)
description: updated
Changed in xen (Ubuntu Wily):
status: New → Invalid
Changed in xen (Ubuntu Xenial):
status: New → Invalid
tags: added: kernel-bug-break-fix
Revision history for this message
Stefan Bader (smb) wrote :

Fix released for Xenial in Ubuntu-4.4.0-25.44~37.

Changed in linux (Ubuntu Xenial):
status: New → Fix Released
Changed in xen (Ubuntu):
assignee: Stefan Bader (smb) → nobody
Changed in linux (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
Stefan Bader (smb) wrote :

Fix released for Wily in Ubuntu-4.2.0-42.49~99.

Changed in linux (Ubuntu Wily):
importance: Undecided → High
status: New → Fix Released
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
Changed in linux (Ubuntu):
assignee: Stefan Bader (smb) → nobody
Andy Whitcroft (apw)
Changed in linux (Ubuntu Wily):
status: Fix Released → Confirmed
Changed in linux (Ubuntu Xenial):
status: Fix Released → Confirmed
Changed in linux (Ubuntu):
status: Fix Released → Confirmed
Stefan Bader (smb)
description: updated
Andy Whitcroft (apw)
Changed in linux (Ubuntu Wily):
status: Confirmed → Fix Released
Changed in linux (Ubuntu Xenial):
status: Confirmed → Fix Released