Bionic Server ISO soft lockup on Dell C6420 in swapper - Xenial appears OK
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Confirmed | High | Unassigned |
Bionic | Confirmed | High | Unassigned |
Bug Description
Hello!
We are experiencing a soft lockup when booting the Bionic Server installer amd64 ISO (kernel 4.15.0-20). The installer seems to hang on boot: after GRUB, nothing ever appears on the display, and we only managed to get kernel logs through Serial-Over-LAN. The 16.04.4 Server installer ISO (kernel 4.4.0-16) does not seem to have this problem.
Our hardware is a Dell C6420. The C6420 is one node in a four-node, 2U chassis (the C6400 chassis). The node has two Intel Xeon Gold 6134 CPUs (8 cores @ 3.2 GHz/core) and 96 GB of RAM, as twelve 8 GB DIMMs. We have tested with Hyperthreading on and off, and in both UEFI and BIOS boot modes, and we get the same results in every case.
Right now, we have Hyperthreading on, and we are booting in UEFI mode. If you need us to change that for testing, let us know!
Here are the first 25 lines of the soft lockup:
[ 40.544002] watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [swapper/0:1]
[ 40.628002] Modules linked in:
[ 40.664000] CPU: 24 PID: 1 Comm: swapper/0 Not tainted 4.15.0-20-generic #21-Ubuntu
[ 40.756002] Hardware name: Dell Inc. PowerEdge C6420/0K2TT6, BIOS 1.3.7 02/09/2018
[ 40.844002] RIP: 0010:smp_
[ 40.908002] RSP: 0000:ffffaa6100
[ 41.000000] RAX: 0000000000000004 RBX: ffff8eba9f5238c0 RCX: 0000000000000001
[ 41.084002] RDX: ffff8eba9f2a8f60 RSI: 0000000000000000 RDI: ffff8eba96754de0
[ 41.168002] RBP: ffffaa61000eb8a0 R08: fffffffffffffff0 R09: 00000000feffffff
[ 41.252003] R10: ffffecb61f56b380 R11: 0000000000000004 R12: 0000000000000100
[ 41.340002] R13: 0000000000023880 R14: ffffffffb4035030 R15: 0000000000000000
[ 41.424002] FS: 000000000000000
[ 41.520002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 41.588002] CR2: 0000000000000000 CR3: 000000018c00a001 CR4: 00000000007606e0
[ 41.676002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 41.760004] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 41.844002] PKRU: 00000000
[ 41.876002] Call Trace:
[ 41.908002] ? nsio_rw_
[ 41.952002] ? cpumask_
[ 42.000002] ? nsio_rw_
[ 42.044003] ? quirk_intel_
[ 42.112002] on_each_
[ 42.152000] ? nsio_rw_
[ 42.196003] text_poke_
The full output is attached as "kernel 4.15.0-20 boot.txt". We used the following kernel command line:
BOOT_IMAGE=
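For anyone sifting through a long Serial-Over-LAN capture like ours, the soft lockup reports can be picked out with a short script. This is only a sketch: the function name `find_soft_lockups` and the regular expression are illustrative assumptions based on the format of the watchdog line shown above, and are not part of any Ubuntu or kernel tooling.

```python
import re

# Assumed format, taken from the watchdog line in this report:
# [ 40.544002] watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [swapper/0:1]
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft lockup report found in log_text."""
    reports = []
    for line in log_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            reports.append({
                "time": float(match.group("time")),       # seconds since boot
                "cpu": int(match.group("cpu")),           # CPU that was stuck
                "stuck_seconds": int(match.group("secs")),
                "comm": match.group("comm"),              # task name
                "pid": int(match.group("pid")),
            })
    return reports


sample = "[ 40.544002] watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [swapper/0:1]"
for report in find_soft_lockups(sample):
    print(report)
```

Running it over the whole attached log would show whether the lockup always lands on the same CPU and task, which may be a useful data point when comparing kernels.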
As I mentioned up top, we have started installing from the Ubuntu Server 16.04.4 ISO. Thus far the kernel is booting, and we have gotten through to the text-based installer.
I'm sorry that I couldn't use the normal kernel bug-reporting process, but since we're hitting this problem in the installer, I'm not sure what else we can do.
I also apologize that I don't have any more info, since it's after midnight local time, but I wanted to make sure I got this information through to you ASAP.
At this point, we're probably going to move forward with the 16.04.4 Server installer for now, but we do have an identical C6420 node (in a different chassis), and we will probably be able to use that for testing for at least a little while! For example, we were thinking of trying the 16.04 HWE kernel, as an additional data point.
So, please let us know what additional information you would like, and what else we can try. Thanks very much!
affects: linux-meta (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: cscc
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1773100
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.