nfsd hangs
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
NFS-Utils | New | Undecided | Unassigned |
nfs-utils (Ubuntu) | Incomplete | Critical | Unassigned |
Trusty | Incomplete | Critical | Unassigned |
Bug Description
On a relatively busy NFS server, the system hung on us with the following messages:
May 4 07:53:36 wol-nfs kernel: [487678.715589] INFO: task nfsd:2793 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.715653] Not tainted 3.13.0-24-generic #46-Ubuntu
May 4 07:53:36 wol-nfs kernel: [487678.715695] "echo 0 > /proc/sys/
May 4 07:53:36 wol-nfs kernel: [487678.715790] nfsd D ffff88023fc14440 0 2793 2 0x00000000
May 4 07:53:36 wol-nfs kernel: [487678.715800] ffff88023317fca0 0000000000000002 ffff880233268000 ffff88023317ffd8
May 4 07:53:36 wol-nfs kernel: [487678.715807] 0000000000014440 0000000000014440 ffff880233268000 ffffffffa03520a0
May 4 07:53:36 wol-nfs kernel: [487678.715811] ffffffffa03520a4 ffff880233268000 00000000ffffffff ffffffffa03520a8
May 4 07:53:36 wol-nfs kernel: [487678.715818] Call Trace:
May 4 07:53:36 wol-nfs kernel: [487678.715860] [<ffffffff8171a
May 4 07:53:36 wol-nfs kernel: [487678.715865] [<ffffffff8171c
May 4 07:53:36 wol-nfs kernel: [487678.715870] [<ffffffff8171c
May 4 07:53:36 wol-nfs kernel: [487678.715905] [<ffffffffa033b
May 4 07:53:36 wol-nfs kernel: [487678.715917] [<ffffffffa032e
May 4 07:53:36 wol-nfs kernel: [487678.715928] [<ffffffffa032f
May 4 07:53:36 wol-nfs kernel: [487678.715937] [<ffffffffa031b
May 4 07:53:36 wol-nfs kernel: [487678.715961] [<ffffffffa026a
May 4 07:53:36 wol-nfs kernel: [487678.715977] [<ffffffffa026a
May 4 07:53:36 wol-nfs kernel: [487678.715986] [<ffffffffa031b
May 4 07:53:36 wol-nfs kernel: [487678.715995] [<ffffffffa031b
May 4 07:53:36 wol-nfs kernel: [487678.716004] [<ffffffff8108b
May 4 07:53:36 wol-nfs kernel: [487678.716009] [<ffffffff8108b
May 4 07:53:36 wol-nfs kernel: [487678.716016] [<ffffffff81726
May 4 07:53:36 wol-nfs kernel: [487678.716020] [<ffffffff8108b
And many more with the exact same stack trace:
May 4 07:53:36 wol-nfs kernel: [487678.716025] INFO: task nfsd:2794 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.716500] INFO: task nfsd:2795 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.717166] INFO: task nfsd:2796 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.717657] INFO: task nfsd:2797 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.718150] INFO: task nfsd:2798 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.718743] INFO: task nfsd:2799 blocked for more than 120 seconds.
Except this one, which has a different stack trace:
May 4 07:53:36 wol-nfs kernel: [487678.719229] INFO: task nfsd:2800 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.719347] Not tainted 3.13.0-24-generic #46-Ubuntu
May 4 07:53:36 wol-nfs kernel: [487678.719605] "echo 0 > /proc/sys/
May 4 07:53:36 wol-nfs kernel: [487678.719741] nfsd D ffff88023fd94440 0 2800 2 0x00000000
May 4 07:53:36 wol-nfs kernel: [487678.719746] ffff8800b81f1b40 0000000000000002 ffff88022f96c7d0 ffff8800b81f1fd8
May 4 07:53:36 wol-nfs kernel: [487678.719751] 0000000000014440 0000000000014440 ffff88022f96c7d0 ffff8800b81f1ca8
May 4 07:53:36 wol-nfs kernel: [487678.719755] ffff8800b81f1cb0 7fffffffffffffff ffff88022f96c7d0 ffff8800b81f1c90
May 4 07:53:36 wol-nfs kernel: [487678.719760] Call Trace:
May 4 07:53:36 wol-nfs kernel: [487678.719766] [<ffffffff81719
May 4 07:53:36 wol-nfs kernel: [487678.719770] [<ffffffff81719
May 4 07:53:36 wol-nfs kernel: [487678.719775] [<ffffffff81719
May 4 07:53:36 wol-nfs kernel: [487678.719781] [<ffffffff8101b
May 4 07:53:36 wol-nfs kernel: [487678.719786] [<ffffffff8101b
May 4 07:53:36 wol-nfs kernel: [487678.719791] [<ffffffff8171a
May 4 07:53:36 wol-nfs kernel: [487678.719798] [<ffffffff8109a
May 4 07:53:36 wol-nfs kernel: [487678.719804] [<ffffffff81082
May 4 07:53:36 wol-nfs kernel: [487678.719818] [<ffffffffa0346
May 4 07:53:36 wol-nfs kernel: [487678.719829] [<ffffffffa033d
May 4 07:53:36 wol-nfs kernel: [487678.719840] [<ffffffffa033e
May 4 07:53:36 wol-nfs kernel: [487678.719849] [<ffffffffa032f
May 4 07:53:36 wol-nfs kernel: [487678.719857] [<ffffffffa031b
May 4 07:53:36 wol-nfs kernel: [487678.719872] [<ffffffffa026a
May 4 07:53:36 wol-nfs kernel: [487678.719885] [<ffffffffa026a
May 4 07:53:36 wol-nfs kernel: [487678.719893] [<ffffffffa031b
May 4 07:53:36 wol-nfs kernel: [487678.719901] [<ffffffffa031b
May 4 07:53:36 wol-nfs kernel: [487678.719905] [<ffffffff8108b
May 4 07:53:36 wol-nfs kernel: [487678.719909] [<ffffffff8108b
May 4 07:53:36 wol-nfs kernel: [487678.719914] [<ffffffff81726
May 4 07:53:36 wol-nfs kernel: [487678.719918] [<ffffffff8108b
It looks like the last thread hung while holding a lock, blocking every other nfsd thread.
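For anyone triaging a similar hang, a minimal sketch for listing which tasks the kernel reported as blocked is shown here. It assumes the hung-task line format seen in the excerpts above; the function name `blocked_tasks` and the regular expression are illustrative, not part of any existing tool.

```python
import re
from collections import defaultdict

# Matches hung-task warnings in the format shown in the log excerpts, e.g.
# "... INFO: task nfsd:2793 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def blocked_tasks(lines):
    """Return {task name: sorted list of PIDs} for every hung-task warning."""
    found = defaultdict(set)
    for line in lines:
        m = HUNG_TASK.search(line)
        if m:
            found[m.group("name")].add(int(m.group("pid")))
    return {name: sorted(pids) for name, pids in found.items()}

if __name__ == "__main__":
    # Sample lines copied from the report; a real run would iterate over kern.log.
    sample = [
        "May 4 07:53:36 wol-nfs kernel: [487678.715589] INFO: task nfsd:2793 blocked for more than 120 seconds.",
        "May 4 07:53:36 wol-nfs kernel: [487678.715818] Call Trace:",
        "May 4 07:53:36 wol-nfs kernel: [487678.716025] INFO: task nfsd:2794 blocked for more than 120 seconds.",
    ]
    print(blocked_tasks(sample))
```

Passing an open file object for the log instead of the sample list works the same way, since the function only iterates over lines.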
Before the hang, there were a few suspicious messages about a CPU soft lockup, with the following stack trace. This may or may not be related. It happened days earlier, though, so it's probably nothing.
Apr 30 12:45:41 wol-nfs kernel: [159283.910727] BUG: soft lockup - CPU#2 stuck for 22s! [chown:6108]
Apr 30 12:45:41 wol-nfs kernel: [159283.910928] Call Trace:
Apr 30 12:45:41 wol-nfs kernel: [159283.910934] [<ffffffff81208
Apr 30 12:45:41 wol-nfs kernel: [159283.910937] [<ffffffff81209
Apr 30 12:45:41 wol-nfs kernel: [159283.910940] [<ffffffff811d5
Apr 30 12:45:41 wol-nfs kernel: [159283.910943] [<ffffffff811b6
Apr 30 12:45:41 wol-nfs kernel: [159283.910945] [<ffffffff811b8
Apr 30 12:45:41 wol-nfs kernel: [159283.910948] [<ffffffff81726
Apr 30 12:45:41 wol-nfs kernel: [159283.910949] Code: 39 d0 75 ea b8 01 00 00 00 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e9 06 00 00 00 66 83 07 02 c3 90 8b 37 f0 66 83 07 02 <f6> 47 02 01 74 f1 55 48 89 e5 e8 31 1b ff ff 5d c3 0f 1f 84 00
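To check whether soft lockups like this one recur closer to the hang, the log can be scanned for the same message. This is a rough sketch that assumes the "BUG: soft lockup" line format shown above; `soft_lockups` is a hypothetical helper name.

```python
import re

# Matches lines such as:
# "BUG: soft lockup - CPU#2 stuck for 22s! [chown:6108]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+?):(?P<pid>\d+)\]"
)

def soft_lockups(lines):
    """Return a list of (cpu, seconds stuck, command, pid) tuples."""
    hits = []
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            hits.append(
                (
                    int(m.group("cpu")),
                    int(m.group("secs")),
                    m.group("comm"),
                    int(m.group("pid")),
                )
            )
    return hits

if __name__ == "__main__":
    # Sample line copied from the report.
    sample = [
        "Apr 30 12:45:41 wol-nfs kernel: [159283.910727] BUG: soft lockup - CPU#2 stuck for 22s! [chown:6108]",
        "Apr 30 12:45:41 wol-nfs kernel: [159283.910928] Call Trace:",
    ]
    for cpu, secs, comm, pid in soft_lockups(sample):
        print(f"CPU {cpu} stuck for {secs}s in {comm} (pid {pid})")
```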
The relevant sections of kern.log are in a separate attachment.
ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-generic 3.13.0.24.29
ProcVersionSign
Uname: Linux 3.13.0-24-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 May 4 23:41 seq
crw-rw---- 1 root audio 116, 33 May 4 23:41 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
CurrentDmesg:
[ 5.274819] NFSD: Using /var/lib/
[ 5.279871] NFSD: starting 90-second grace period (net ffffffff81cd9b00)
[ 5.518836] init: plymouth-
[ 12.233348] [UFW BLOCK] IN=eth0 OUT= MAC=00:
Date: Mon May 5 00:29:12 2014
HibernationDevice: RESUME=
InstallationDate: Installed on 2014-04-20 (14 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
IwConfig:
eth0 no wireless extensions.
lo no wireless extensions.
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
MachineType: VMware, Inc. VMware Virtual Platform
PciMultimedia:
ProcFB: 0 svgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware 1.127
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/30/2013
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 6.00
dmi.board.name: 440BX Desktop Reference Platform
dmi.board.vendor: Intel Corporation
dmi.board.version: None
dmi.chassis.
dmi.chassis.type: 1
dmi.chassis.vendor: No Enclosure
dmi.chassis.
dmi.modalias: dmi:bvnPhoenixT
dmi.product.name: VMware Virtual Platform
dmi.product.
dmi.sys.vendor: VMware, Inc.
Changed in nfs-utils (Ubuntu):
importance: Undecided → Critical
Changed in nfs-utils (Ubuntu Trusty):
status: New → Incomplete
importance: Undecided → Critical
Status changed to 'Confirmed' because the bug affects multiple users.