Comment 0 for bug 1596866

Benjamin Kaehne (ben-kaehne) wrote :

I am getting fairly regular hard lockups in python (2.7) on xenial:

Linux rts-os-s-03 4.4.0-28-generic #47-Ubuntu SMP Fri Jun 24 10:09:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Ubuntu 16.04 LTS \n \l

Python 2.7:
ii python 2.7.11-1 amd64 interactive high-level object-oriented language (default version)
ii python2.7 2.7.11-7ubuntu1 amd64 Interactive high-level object-oriented language (version 2.7)

Python 3:
ii python3 3.5.1-3 amd64 interactive high-level object-oriented language (default python3 version)

Jun 28 06:52:42 XXXX kernel: [ 1634.052991] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0
Jun 28 06:52:42 XXXX kernel: [ 1634.059516] Modules linked in: iptable_raw kvm_intel ebtable_filter ebtables ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat xt_tcpudp iptable_filter ip_tables x_tables veth bridge stp llc bonding dcdbas intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass shpchp lpc_ich ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi openvswitch nf_defrag_ipv6 nf_conntrack autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd bnx2x ahci libahci tg3 megaraid_sas vxlan ip6_udp_tunnel udp_tunnel ptp pps_core mdio libcrc32c [last unloaded: kvm_intel]
Jun 28 06:52:42 XXXX kernel: [ 1634.059790] CPU: 0 PID: 52914 Comm: python Not tainted 4.4.0-28-generic #47-Ubuntu
Jun 28 06:52:42 XXXX kernel: [ 1634.059791] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
Jun 28 06:52:42 XXXX kernel: [ 1634.059792] 0000000000000086 000000002732bfd7 ffff887b254bbbd0 ffffffff813eb1a3
Jun 28 06:52:42 XXXX kernel: [ 1634.059794] 0000000000000000 0000000000000000 ffff887b254bbbe8 ffffffff8113b3bd
Jun 28 06:52:42 XXXX kernel: [ 1634.059796] ffff887e4da1a000 ffff887b254bbc20 ffffffff81183e4c 0000000000000001
Jun 28 06:52:42 XXXX kernel: [ 1634.059797] Call Trace:
Jun 28 06:52:42 XXXX kernel: [ 1634.059804] [<ffffffff813eb1a3>] dump_stack+0x63/0x90
Jun 28 06:52:42 XXXX kernel: [ 1634.059807] [<ffffffff8113b3bd>] watchdog_overflow_callback+0xbd/0xd0
Jun 28 06:52:42 XXXX kernel: [ 1634.059810] [<ffffffff81183e4c>] __perf_event_overflow+0x8c/0x1d0
Jun 28 06:52:42 XXXX kernel: [ 1634.059811] [<ffffffff81184a24>] perf_event_overflow+0x14/0x20
Jun 28 06:52:42 XXXX kernel: [ 1634.059814] [<ffffffff8100c4d1>] intel_pmu_handle_irq+0x1e1/0x4a0
Jun 28 06:52:42 XXXX kernel: [ 1634.059817] [<ffffffff81197001>] ? __alloc_pages_nodemask+0x1b1/0xb60
Jun 28 06:52:42 XXXX kernel: [ 1634.059821] [<ffffffff811fc3f4>] ? try_charge+0xd4/0x640
Jun 28 06:52:42 XXXX kernel: [ 1634.059823] [<ffffffff81200b4b>] ? mem_cgroup_try_charge+0x6b/0x1b0
Jun 28 06:52:42 XXXX kernel: [ 1634.059826] [<ffffffff8119e667>] ? lru_cache_add_active_or_unevictable+0x27/0xa0
Jun 28 06:52:42 XXXX kernel: [ 1634.059830] [<ffffffff811bfffa>] ? handle_mm_fault+0xcaa/0x1820
Jun 28 06:52:42 XXXX kernel: [ 1634.059831] [<ffffffff811c5fbe>] ? vma_merge+0x22e/0x330
Jun 28 06:52:42 XXXX kernel: [ 1634.059834] [<ffffffff810056dd>] perf_event_nmi_handler+0x2d/0x50
Jun 28 06:52:42 XXXX kernel: [ 1634.059837] [<ffffffff810323c9>] nmi_handle+0x69/0x120
Jun 28 06:52:42 XXXX kernel: [ 1634.059839] [<ffffffff81032900>] default_do_nmi+0x40/0x100
Jun 28 06:52:42 XXXX kernel: [ 1634.059841] [<ffffffff81032aa2>] do_nmi+0xe2/0x130
Jun 28 06:52:42 XXXX kernel: [ 1634.059844] [<ffffffff81829ac6>] nmi+0x56/0xa5

As the trace suggests, this is causing hard lockups and/or pauses.
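
For triage: frames marked with "?" in the call trace are addresses the kernel found on the stack but could not verify as part of the call chain, so they are less reliable than the unmarked ones. A minimal sketch, assuming the log line format shown above, that separates the two kinds of frame (parse_frames, FRAME_RE and SAMPLE are hypothetical names used only for illustration):

```python
import re

# Matches frames such as "[<ffffffff813eb1a3>] ? dump_stack+0x63/0x90".
# Group 1: address, group 2: optional "?" marker, group 3: symbol name.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_frames(log_text):
    """Return (symbol, reliable) pairs; reliable is False for '?' frames."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group(3), match.group(2) is None))
    return frames


# Three lines copied from the trace above, hostname prefix omitted.
SAMPLE = """\
kernel: [ 1634.059814] [<ffffffff8100c4d1>] intel_pmu_handle_irq+0x1e1/0x4a0
kernel: [ 1634.059817] [<ffffffff81197001>] ? __alloc_pages_nodemask+0x1b1/0xb60
kernel: [ 1634.059821] [<ffffffff811fc3f4>] ? try_charge+0xd4/0x640
"""

if __name__ == "__main__":
    for symbol, reliable in parse_frames(SAMPLE):
        # Prefix unreliable frames with "?" as the kernel does.
        print(("  " if reliable else "? ") + symbol)
```

Run against the full trace, this would list the NMI/perf frames as verified and the memory-allocation frames (__alloc_pages_nodemask, try_charge, handle_mm_fault and so on) as unverified leftovers of what the python process was doing when the watchdog fired.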