NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [docker:28587]

Bug #1483770 reported by Michael Salinger
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux-lts-vivid (Ubuntu)
Confirmed
Undecided
Unassigned

Bug Description

With 14.04.2 and the vivid backported kernel, we are seeing occasional crashes (soft lockups) when using Docker.

kernel: [631576.200053] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [docker:28587]
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Modules linked in: veth xt_nat xt_tcpudp tcp_diag inet_diag xt_addrtype xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge stp llc overlay dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c dm_crypt ppdev xen_fbfront syscopyarea sysfillrect sysimgblt fb_sys_fops serio_raw parport_pc parport isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] CPU: 0 PID: 28587 Comm: docker Not tainted 3.19.0-25-generic #26~14.04.1-Ubuntu
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] task: ffff8800e88a3ae0 ti: ffff88002ec84000 task.ti: ffff88002ec84000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RIP: 0010:[<ffffffff813af79d>] [<ffffffff813af79d>] format_decode+0x21d/0x3e0
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RSP: 0018:ffff88002ec87ce8 EFLAGS: 00000246
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RAX: ffffffff81ab03aa RBX: ffffffff813b02bc RCX: 0000000000000075
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RDX: 0000000000000075 RSI: 0000000000000075 RDI: ffffffff81ab03ab
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] RBP: ffff88002ec87d08 R08: 000000000000000a R09: 000000000000fffe
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] R10: 0000000000000000 R11: ffff88002ec87c8e R12: ffff88002ecba000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] R13: ffff88002ecb9004 R14: ffffffff813af362 R15: ffff88002ec87c58
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] FS: 00000000015fa880(0063) GS:ffff8800efc00000(0000) knlGS:0000000000000000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] CR2: 000000c208109000 CR3: 00000000290e6000 CR4: 00000000001406f0
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Stack:
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] ffff88002ecba000 ffffffff81ab03ab ffff88002ecb900a ffff88002ecba000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] ffff88002ec87d88 ffffffff813b20d8 ffff88002ec8ffff 0000000000000000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] 000000000000fffd 0000000000000001 ffff88002ecb9000 0000000000001000
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Call Trace:
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff813b20d8>] vsnprintf+0x58/0x590
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff81210725>] seq_vprintf+0x35/0x70
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff8121079d>] seq_printf+0x3d/0x40
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff81210ea8>] ? seq_path_root+0x68/0xf0
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff8122a35b>] show_mountinfo+0x5b/0x290
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff8120adf6>] m_show+0x16/0x20
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff8121005a>] seq_read+0xea/0x370
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff811ecc38>] __vfs_read+0x18/0x50
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff811eccf6>] vfs_read+0x86/0x140
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff811ecdf6>] SyS_read+0x46/0xb0
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] [<ffffffff817b668d>] system_call_fastpath+0x16/0x1b
Aug 5 22:49:13 <hostname redacted> kernel: [631576.200053] Code: 01 08 e9 57 ff ff ff 0f 1f 80 00 00 00 00 ba ff ff ff ff 66 89 53 04 0f b6
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [docker:28587]
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Modules linked in: veth xt_nat xt_tcpudp tcp_diag inet_diag xt_addrtype xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge stp llc overlay dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c dm_crypt ppdev xen_fbfront syscopyarea sysfillrect sysimgblt fb_sys_fops serio_raw parport_pc parport isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] CPU: 0 PID: 28587 Comm: docker Tainted: G L 3.19.0-25-generic #26~14.04.1-Ubuntu
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/06/2015
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] task: ffff8800e88a3ae0 ti: ffff88002ec84000 task.ti: ffff88002ec84000
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RIP: 0010:[<ffffffff813b24c0>] [<ffffffff813b24c0>] vsnprintf+0x440/0x590
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RSP: 0018:ffff88002ec87d18 EFLAGS: 00000293
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RAX: 0000000000000002 RBX: 0000000000000031 RCX: 0000000000000002
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RDX: 0000000000000008 RSI: 00000000ffffffff RDI: ffffffff81ab03a5
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] RBP: ffff88002ec87d88 R08: 000000000000000a R09: 000000000000fffc
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] R10: 0000000000000000 R11: ffff88002ec87c8e R12: 0000000000000000
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] R13: ffff88002ec87c8e R14: 0000000a0000ffff R15: ffff88002ec87c8e
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] FS: 00000000015fa880(0063) GS:ffff8800efc00000(0000) knlGS:0000000000000000
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] CR2: 000000c208109000 CR3: 00000000290e6000 CR4: 00000000001406f0
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Stack:
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] ffff88002ecbffff ffff88002ec87d7c 000000002ec87d80 0000000000000001
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] ffff88002ecb9000 0000000000001000 ffff88002ec87dc0 ffffffffff0a0210
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] 00000fef2ec87db8 ffff88002ecfcd80 ffff88002ecfcd80 ffff88000d55e6c8
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Call Trace:
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff81210725>] seq_vprintf+0x35/0x70
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff8121079d>] seq_printf+0x3d/0x40
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff81210ea8>] ? seq_path_root+0x68/0xf0
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff8122a35b>] show_mountinfo+0x5b/0x290
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff8120adf6>] m_show+0x16/0x20
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff8121005a>] seq_read+0xea/0x370
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff811ecc38>] __vfs_read+0x18/0x50
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff811eccf6>] vfs_read+0x86/0x140
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff811ecdf6>] SyS_read+0x46/0xb0
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] [<ffffffff817b668d>] system_call_fastpath+0x16/0x1b
Aug 5 22:49:41 <hostname redacted> kernel: [631604.200054] Code: b6 4d c9 eb 8b 41 8b 55 00 0f b6 4d c9 83 fa 30 0f 83 2b 01 00 00 89 d0 49 03 45 10 83 c2 08 41 89 55 00 48 8b 10 e9 65 ff ff ff <41> 8b 55 00 83 fa 30 0f 83 e7 00 00 00 89 d0 49 03 45 10 83 c2
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] INFO: rcu_sched self-detected stall on CPU { 0} (t=15000 jiffies g=15980250 c=15980249 q=0)
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] Task dump for CPU 0:
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] docker R running task 0 28587 7322 0x00000008
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] ffffffff81c55fc0 ffff8800efc03be8 ffffffff810a00f6 0000000000000000
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] ffffffff81c55fc0 ffff8800efc03c08 ffffffff810a371d 0000000000000087
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] 0000000000000001 ffff8800efc03c38 ffffffff810d3db0 ffff8800efc14bc0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] Call Trace:
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] <IRQ> [<ffffffff810a00f6>] sched_show_task+0xb6/0x130
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810a371d>] dump_cpu_task+0x3d/0x50
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810d3db0>] rcu_dump_cpu_stacks+0x90/0xd0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810d7c6c>] rcu_check_callbacks+0x42c/0x670
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810a4751>] ? account_process_tick+0x61/0x180
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810dcba9>] update_process_times+0x39/0x60
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810ec0d5>] tick_sched_handle.isra.16+0x25/0x60
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810ec154>] tick_sched_timer+0x44/0x80
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810dd847>] __run_hrtimer+0x77/0x1d0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810ec110>] ? tick_sched_handle.isra.16+0x60/0x60
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810ddc27>] hrtimer_interrupt+0xe7/0x220
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8100a7de>] xen_timer_interrupt+0x2e/0x150
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff814c588a>] ? add_interrupt_randomness+0x3a/0x1e0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810cbf6e>] handle_irq_event_percpu+0x3e/0x1a0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810cf5ed>] handle_percpu_irq+0x3d/0x60
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff810cb55b>] generic_handle_irq+0x2b/0x40
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8147c08f>] evtchn_2l_handle_events+0x24f/0x260
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff81078f6c>] ? __do_softirq+0x16c/0x270
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8147985f>] __xen_evtchn_do_upcall+0x4f/0x90
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8147b5e4>] xen_evtchn_do_upcall+0x34/0x50
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff817b858d>] xen_hvm_callback_vector+0x6d/0x80
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] <EOI> [<ffffffff813b02bc>] ? number.isra.2+0x35c/0x3a0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff813af6e2>] ? format_decode+0x162/0x3e0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff813b20d8>] vsnprintf+0x58/0x590
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff81210725>] seq_vprintf+0x35/0x70
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8121079d>] seq_printf+0x3d/0x40
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff81210ea8>] ? seq_path_root+0x68/0xf0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8122a35b>] show_mountinfo+0x5b/0x290
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8120adf6>] m_show+0x16/0x20
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff8121005a>] seq_read+0xea/0x370
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff811ecc38>] __vfs_read+0x18/0x50
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff811eccf6>] vfs_read+0x86/0x140
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff811ecdf6>] SyS_read+0x46/0xb0
Aug 5 22:49:48 <hostname redacted> kernel: [631611.188053] [<ffffffff817b668d>] system_call_fastpath+0x16/0x1b

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.19.0-25-generic 3.19.0-25.26~14.04.1
ProcVersionSignature: Ubuntu 3.19.0-25.26~14.04.1-generic 3.19.8-ckt2
Uname: Linux 3.19.0-25-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.11
Architecture: amd64
Date: Tue Aug 11 14:29:36 2015
Ec2AMI: ami-cc3b3ea4
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1a
Ec2InstanceType: c3.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-lts-vivid
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Michael Salinger (mjsalinger) wrote :
Revision history for this message
Michael Salinger (mjsalinger) wrote :

Anyone have a chance to look at this?

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-lts-vivid (Ubuntu):
status: New → Confirmed
Revision history for this message
Akihiro Suda (suda-kyoto) wrote :

Recently this was also reported in Docker's issue tracker: https://github.com/docker/docker/issues/19758
This seems to be a bug of overlay.

I can easily reproduce the bug by running `for f in $(seq 1 1000);do docker run -it --rm ubuntu echo $f; done` concurrently in 3 terminals.

(Still not sure whether this is related to concurrency, but I could not reproduce when ran 10k times in a single terminal)

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.