Well, today this issue is back on one of my machines here at home:
[616201.460064] INFO: task kswapd0:52 blocked for more than 120 seconds.
[616201.460072] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[616201.460079] kswapd0 D 0000000000000000 0 52 2 0x00000000
[616201.460090] ffff880128d2f720 0000000000000046 0000000000015bc0 0000000000015bc0
[616201.460100] ffff88012af8df80 ffff880128d2ffd8 0000000000015bc0 ffff88012af8dbc0
[616201.460108] 0000000000015bc0 ffff880128d2ffd8 0000000000015bc0 ffff88012af8df80
[616201.460117] Call Trace:
[616201.460153] [<ffffffffa03a62b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[616201.460166] [<ffffffff8153eb87>] io_schedule+0x47/0x70
[616201.460192] [<ffffffffa03a62be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[616201.460201] [<ffffffff8153f3df>] __wait_on_bit+0x5f/0x90
[616201.460211] [<ffffffff811346a6>] ? __slab_free+0x96/0x120
[616201.460235] [<ffffffffa03a62b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[616201.460243] [<ffffffff8153f488>] out_of_line_wait_on_bit+0x78/0x90
[616201.460252] [<ffffffff81085360>] ? wake_bit_function+0x0/0x40
[616201.460277] [<ffffffffa03a629f>] nfs_wait_on_request+0x2f/0x40 [nfs]
[616201.460302] [<ffffffffa03aa6af>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
[616201.460329] [<ffffffffa03abaee>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
[616201.460354] [<ffffffffa03abc71>] nfs_wb_page+0x81/0xe0 [nfs]
[616201.460376] [<ffffffffa039ab2f>] nfs_release_page+0x5f/0x80 [nfs]
[616201.460384] [<ffffffff810f2bb2>] try_to_release_page+0x32/0x50
[616201.460392] [<ffffffff81101833>] shrink_page_list+0x453/0x5f0
[616201.460402] [<ffffffff8113b419>] ? mem_cgroup_del_lru+0x39/0x40
[616201.460409] [<ffffffff81100517>] ? isolate_lru_pages+0x227/0x260
[616201.460417] [<ffffffff81101cdd>] shrink_inactive_list+0x30d/0x7e0
[616201.460426] [<ffffffff810116c0>] ? __switch_to+0xd0/0x320
[616201.460434] [<ffffffff81076e2c>] ? lock_timer_base+0x3c/0x70
[616201.460441] [<ffffffff810778b5>] ? try_to_del_timer_sync+0x75/0xd0
[616201.460449] [<ffffffff81102241>] shrink_list+0x91/0xf0
[616201.460455] [<ffffffff81102437>] shrink_zone+0x197/0x240
[616201.460463] [<ffffffff811034c9>] balance_pgdat+0x659/0x6d0
[616201.460470] [<ffffffff81100550>] ? isolate_pages_global+0x0/0x50
[616201.460477] [<ffffffff8110363e>] kswapd+0xfe/0x150
[616201.460485] [<ffffffff81085320>] ? autoremove_wake_function+0x0/0x40
[616201.460492] [<ffffffff81103540>] ? kswapd+0x0/0x150
[616201.460498] [<ffffffff81084fa6>] kthread+0x96/0xa0
[616201.460506] [<ffffffff810141ea>] child_rip+0xa/0x20
[616201.460513] [<ffffffff81084f10>] ? kthread+0x0/0xa0
[616201.460520] [<ffffffff810141e0>] ? child_rip+0x0/0x20
This was an extract job for a rather small archive (just ~400 MiB), from the NAS to the NAS (I know this is bad practice).
What also came up again is that KDE4 completely freezes until the lock is released:
[616201.460551] INFO: task plasma-desktop:7429 blocked for more than 120 seconds.
[616201.460556] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[616201.460561] plasma-deskto D 0000000000000000 0 7429 1 0x00000000
[616201.460570] ffff88012766b158 0000000000000086 0000000000015bc0 0000000000015bc0
[616201.460579] ffff880127acdf80 ffff88012766bfd8 0000000000015bc0 ffff880127acdbc0
[616201.460587] 0000000000015bc0 ffff88012766bfd8 0000000000015bc0 ffff880127acdf80
[616201.460595] Call Trace:
[616201.460619] [<ffffffffa03a62b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[616201.460628] [<ffffffff8153eb87>] io_schedule+0x47/0x70
[616201.460651] [<ffffffffa03a62be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[...]
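For anyone who wants to check their own logs for the same pattern, here is a rough sketch that lists which tasks the kernel reported as blocked. It only assumes the "INFO: task NAME:PID blocked for more than N seconds" wording seen in the messages above; the function name `blocked_tasks` and the sample list are made up for illustration.

```python
import re

# Matches the hung-task header line as it appears in the dmesg output above.
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def blocked_tasks(lines):
    """Return (task name, pid, seconds) for every hung-task report in lines."""
    tasks = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            tasks.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return tasks


# Sample lines copied from the kernel log in this comment.
sample = [
    "[616201.460064] INFO: task kswapd0:52 blocked for more than 120 seconds.",
    "[616201.460117] Call Trace:",
    "[616201.460551] INFO: task plasma-desktop:7429 blocked for more than 120 seconds.",
]

for name, pid, secs in blocked_tasks(sample):
    print(f"{name} (pid {pid}) blocked for more than {secs}s")
```

The same function should work on saved `dmesg` output read line by line, as long as the message format matches.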
When the lock is released, all apps go back to normal operation. I wouldn't mind if the NFS transfer hung for a while waiting for the target to complete some work, but it blocks any interaction with the machine, so this is a real show-stopper.
However, there was a difference in how those NFS exports got mounted. The machine that is still fine mounts like this:
rw,rsize=32768,wsize=32768,hard,intr,noatime
...and the machine that just showed this issue again mounts like this:
rw,hard,intr,noatime
I wouldn't expect that setting the sizes changes anything, but please give it a try if you haven't done so before.
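To make the difference between the two mounts explicit, here is a small sketch that compares two option strings. The strings are the ones quoted above; the helper names `parse_mount_options` and `diff_mount_options` are just for illustration. Note that it compares only the options as configured, while the values the client actually ends up using are listed in /proc/mounts.

```python
def parse_mount_options(options):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (e.g. "hard") map to True.
    """
    parsed = {}
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def diff_mount_options(first, second):
    """Return options only in first, only in second, and with differing values."""
    a = parse_mount_options(first)
    b = parse_mount_options(second)
    only_first = {k: a[k] for k in a if k not in b}
    only_second = {k: b[k] for k in b if k not in a}
    changed = {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}
    return only_first, only_second, changed


working = "rw,rsize=32768,wsize=32768,hard,intr,noatime"  # machine that is still fine
hanging = "rw,hard,intr,noatime"                          # machine that showed the hang

only_working, only_hanging, changed = diff_mount_options(working, hanging)
print("only on the working machine:", only_working)
print("only on the hanging machine:", only_hanging)
print("different values:", changed)
```

For the two strings above, the only difference is the explicit rsize and wsize on the machine that is still fine.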