Comment 16 for bug 561210

Ancoron Luziferis (ancoron) wrote :

Well, today this issue is back on one of my machines here at home:

[616201.460064] INFO: task kswapd0:52 blocked for more than 120 seconds.
[616201.460072] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[616201.460079] kswapd0 D 0000000000000000 0 52 2 0x00000000
[616201.460090] ffff880128d2f720 0000000000000046 0000000000015bc0 0000000000015bc0
[616201.460100] ffff88012af8df80 ffff880128d2ffd8 0000000000015bc0 ffff88012af8dbc0
[616201.460108] 0000000000015bc0 ffff880128d2ffd8 0000000000015bc0 ffff88012af8df80
[616201.460117] Call Trace:
[616201.460153] [<ffffffffa03a62b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[616201.460166] [<ffffffff8153eb87>] io_schedule+0x47/0x70
[616201.460192] [<ffffffffa03a62be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[616201.460201] [<ffffffff8153f3df>] __wait_on_bit+0x5f/0x90
[616201.460211] [<ffffffff811346a6>] ? __slab_free+0x96/0x120
[616201.460235] [<ffffffffa03a62b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[616201.460243] [<ffffffff8153f488>] out_of_line_wait_on_bit+0x78/0x90
[616201.460252] [<ffffffff81085360>] ? wake_bit_function+0x0/0x40
[616201.460277] [<ffffffffa03a629f>] nfs_wait_on_request+0x2f/0x40 [nfs]
[616201.460302] [<ffffffffa03aa6af>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
[616201.460329] [<ffffffffa03abaee>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
[616201.460354] [<ffffffffa03abc71>] nfs_wb_page+0x81/0xe0 [nfs]
[616201.460376] [<ffffffffa039ab2f>] nfs_release_page+0x5f/0x80 [nfs]
[616201.460384] [<ffffffff810f2bb2>] try_to_release_page+0x32/0x50
[616201.460392] [<ffffffff81101833>] shrink_page_list+0x453/0x5f0
[616201.460402] [<ffffffff8113b419>] ? mem_cgroup_del_lru+0x39/0x40
[616201.460409] [<ffffffff81100517>] ? isolate_lru_pages+0x227/0x260
[616201.460417] [<ffffffff81101cdd>] shrink_inactive_list+0x30d/0x7e0
[616201.460426] [<ffffffff810116c0>] ? __switch_to+0xd0/0x320
[616201.460434] [<ffffffff81076e2c>] ? lock_timer_base+0x3c/0x70
[616201.460441] [<ffffffff810778b5>] ? try_to_del_timer_sync+0x75/0xd0
[616201.460449] [<ffffffff81102241>] shrink_list+0x91/0xf0
[616201.460455] [<ffffffff81102437>] shrink_zone+0x197/0x240
[616201.460463] [<ffffffff811034c9>] balance_pgdat+0x659/0x6d0
[616201.460470] [<ffffffff81100550>] ? isolate_pages_global+0x0/0x50
[616201.460477] [<ffffffff8110363e>] kswapd+0xfe/0x150
[616201.460485] [<ffffffff81085320>] ? autoremove_wake_function+0x0/0x40
[616201.460492] [<ffffffff81103540>] ? kswapd+0x0/0x150
[616201.460498] [<ffffffff81084fa6>] kthread+0x96/0xa0
[616201.460506] [<ffffffff810141ea>] child_rip+0xa/0x20
[616201.460513] [<ffffffff81084f10>] ? kthread+0x0/0xa0
[616201.460520] [<ffffffff810141e0>] ? child_rip+0x0/0x20

This was an extract job for a rather small archive (just ~400 MiB), read from the NAS and written back to the NAS (I know this is bad practice).

What also came up again is that KDE4 freezes completely until the lock is released:

[616201.460551] INFO: task plasma-desktop:7429 blocked for more than 120 seconds.
[616201.460556] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[616201.460561] plasma-deskto D 0000000000000000 0 7429 1 0x00000000
[616201.460570] ffff88012766b158 0000000000000086 0000000000015bc0 0000000000015bc0
[616201.460579] ffff880127acdf80 ffff88012766bfd8 0000000000015bc0 ffff880127acdbc0
[616201.460587] 0000000000015bc0 ffff88012766bfd8 0000000000015bc0 ffff880127acdf80
[616201.460595] Call Trace:
[616201.460619] [<ffffffffa03a62b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[616201.460628] [<ffffffff8153eb87>] io_schedule+0x47/0x70
[616201.460651] [<ffffffffa03a62be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[...]

When the lock is released, all apps go back to working as usual. I wouldn't mind if the NFS transfer hung for some time while waiting for the target to complete some work, but it blocks any interaction with the machine, so this is a real show-stopper.
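
For anyone who wants to check quickly which tasks were reported as blocked, here is a minimal sketch that pulls the task name and PID out of the "blocked for more than N seconds" lines. It assumes the log format shown in the traces above; the names `HUNG_TASK` and `find_hung_tasks` are made up for this example, and the sample text is just the two report lines from this comment.

```python
import re

# Matches lines such as:
# [616201.460064] INFO: task kswapd0:52 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, timestamp) for every hung-task report line."""
    tasks = []
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            tasks.append(
                (match.group("comm"), int(match.group("pid")), float(match.group("ts")))
            )
    return tasks


# The two report lines quoted in this comment.
sample = """\
[616201.460064] INFO: task kswapd0:52 blocked for more than 120 seconds.
[616201.460551] INFO: task plasma-desktop:7429 blocked for more than 120 seconds.
"""

hung = find_hung_tasks(sample)
for comm, pid, ts in hung:
    print(f"{comm} (pid {pid}) reported blocked at {ts}")
```

To run it against a real log, pass the saved kernel log text to `find_hung_tasks` instead of `sample`.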

However, there was a difference in how those NFS exports were mounted. The machine that is still fine mounts like this:

rw,rsize=32768,wsize=32768,hard,intr,noatime

...and the machine that just showed this issue again mounts like this:

rw,hard,intr,noatime

I wouldn't expect setting rsize and wsize to change anything, but please give it a try if you haven't done so before.
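
To compare the option strings of two machines without eyeballing them, here is a minimal sketch that diffs two comma-separated mount option lists. The helper names `parse_mount_options` and `diff_mount_options` are made up for this example; the two strings are the ones quoted above.

```python
def parse_mount_options(option_string):
    """Split a comma-separated mount option string into a dict.

    Flags such as "hard" map to None; "key=value" options map to the value.
    """
    options = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        options[key] = value if sep else None
    return options


def diff_mount_options(left, right):
    """Return (only in left, only in right, present in both but different)."""
    a = parse_mount_options(left)
    b = parse_mount_options(right)
    only_left = {k: a[k] for k in a if k not in b}
    only_right = {k: b[k] for k in b if k not in a}
    changed = {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}
    return only_left, only_right, changed


working = "rw,rsize=32768,wsize=32768,hard,intr,noatime"
affected = "rw,hard,intr,noatime"

only_working, only_affected, changed = diff_mount_options(working, affected)
print("only on the working machine:", only_working)
print("only on the affected machine:", only_affected)
print("different values:", changed)
```

For these two strings the only difference is the explicit rsize and wsize on the machine that is still fine. Note that this compares only the options as written; the values the client actually negotiates when no sizes are given are not visible here.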