cinder volume unavailable due to iSCSI target hangup

Bug #1750835 reported by Roman Safonov
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Incomplete
Importance: Critical
Assigned to: Roman Safonov
Milestone: (none)

Bug Description

Environment: MOS 7.0

On a hypervisor node (which also serves as a cinder-volume node), the instance becomes unavailable. The cinder-volume and nova-compute services stop writing logs, and a large number of blkid processes accumulate on the system. The qemu processes get stuck, with the following messages in kern.log:

<3>Feb 23 18:43:16 node-10 kernel: [9726568.185235] INFO: task qemu-system-x86:32990 blocked for more than 120 seconds.
<3>Feb 23 18:43:16 node-10 kernel: [9726568.189275] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:43:16 node-10 kernel: [9726568.191890] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198070] ffff880071569a88 0000000000000082 ffff881396a51800 ffff880071569fd8
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198083] 0000000000013180 0000000000013180 ffff881396a51800 ffff8827df833a18
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198089] 0000000000000000 ffff881363955000 0000000000000000 ffff881396a51800
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198096] Call Trace:
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198105] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198111] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198116] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198120] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198124] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198128] [<ffffffff811f7876>] blkdev_direct_IO+0x56/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198131] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198136] [<ffffffff81150cd1>] generic_file_direct_write+0xc1/0x180
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198139] [<ffffffff81151095>] __generic_file_aio_write+0x305/0x3d0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198143] [<ffffffff811f8156>] blkdev_aio_write+0x46/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198149] [<ffffffff811bdc9a>] do_sync_write+0x5a/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198154] [<ffffffff811be424>] vfs_write+0xb4/0x1f0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198158] [<ffffffff811befd2>] SyS_pwrite64+0x72/0xb0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.198162] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:43:16 node-10 kernel: [9726568.198166] INFO: task qemu-system-x86:38844 blocked for more than 120 seconds.
<3>Feb 23 18:43:16 node-10 kernel: [9726568.205364] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:43:16 node-10 kernel: [9726568.209515] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218573] ffff880056dd1ad8 0000000000000082 ffff880033bd6000 ffff880056dd1fd8
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218580] 0000000000013180 0000000000013180 ffff880033bd6000 ffff8827df833a18
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218586] 0000000000000000 ffff881363956180 0000000000000000 ffff880033bd6000
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218592] Call Trace:
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218597] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218601] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218605] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218608] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218611] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218614] [<ffffffff811f7876>] blkdev_direct_IO+0x56/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218617] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218621] [<ffffffff811520ab>] generic_file_aio_read+0x69b/0x700
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218626] [<ffffffff8108e720>] ? hrtimer_get_res+0x50/0x50
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218630] [<ffffffff811f7cfb>] blkdev_aio_read+0x4b/0x70
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218634] [<ffffffff811bdc0a>] do_sync_read+0x5a/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218637] [<ffffffff811be2a5>] vfs_read+0x95/0x160
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218641] [<ffffffff811bef22>] SyS_pread64+0x72/0xb0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.218644] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:43:16 node-10 kernel: [9726568.218647] INFO: task qemu-system-x86:48268 blocked for more than 120 seconds.
<3>Feb 23 18:43:16 node-10 kernel: [9726568.229094] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:43:16 node-10 kernel: [9726568.234764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:43:16 node-10 kernel: [9726568.247962] ffff88002fca59d0 0000000000000082 ffff8813993f9800 ffff88002fca5fd8
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248175] 0000000000013180 0000000000013180 ffff8813993f9800 ffff8827df833a18
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248181] 0000000000000000 ffff881363954d80 0000000000000000 ffff8813993f9800
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248187] Call Trace:
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248191] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248195] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248199] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248203] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248206] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248209] [<ffffffff811f7876>] blkdev_direct_IO+0x56/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248212] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248216] [<ffffffff81150cd1>] generic_file_direct_write+0xc1/0x180
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248219] [<ffffffff81151095>] __generic_file_aio_write+0x305/0x3d0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248222] [<ffffffff811f8156>] blkdev_aio_write+0x46/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248226] [<ffffffff811bdd1c>] do_sync_readv_writev+0x4c/0x80
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248230] [<ffffffff811bf1e0>] do_readv_writev+0xb0/0x220
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248235] [<ffffffff810db67e>] ? do_futex+0xde/0x760
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248239] [<ffffffff811bf3d0>] vfs_writev+0x30/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248242] [<ffffffff811bf6f2>] SyS_pwritev+0xa2/0xd0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.248246] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:43:16 node-10 kernel: [9726568.248249] INFO: task qemu-system-x86:22338 blocked for more than 120 seconds.
<3>Feb 23 18:43:16 node-10 kernel: [9726568.262056] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:43:16 node-10 kernel: [9726568.269222] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284607] ffff880049a65d80 0000000000000082 ffff88003ab58000 ffff880049a65fd8
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284626] 0000000000013180 0000000000013180 ffff88003ab58000 ffff880049a65ea8
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284632] ffff880049a65eb0 7fffffffffffffff ffff88003ab58000 00007f2b6bbd7700
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284638] Call Trace:
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284642] [<ffffffff81728389>] schedule+0x29/0x70
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284646] [<ffffffff817275d9>] schedule_timeout+0x239/0x2d0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284658] [<ffffffff81339b43>] ? __blk_run_queue+0x33/0x40
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284661] [<ffffffff8133d8b3>] ? blk_queue_bio+0x273/0x360
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284665] [<ffffffff81728ea6>] wait_for_completion+0xa6/0x160
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284671] [<ffffffff8109ac90>] ? wake_up_state+0x20/0x20
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284675] [<ffffffff811f4bee>] submit_bio_wait+0x5e/0x70
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284679] [<ffffffff8133f5ca>] blkdev_issue_flush+0x5a/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284682] [<ffffffff811f7425>] blkdev_fsync+0x35/0x50
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284687] [<ffffffff811ee551>] do_fsync+0x51/0x80
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284691] [<ffffffff811ee803>] SyS_fdatasync+0x13/0x20
<4>Feb 23 18:43:16 node-10 kernel: [9726568.284694] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:43:16 node-10 kernel: [9726568.284708] INFO: task vgs:73235 blocked for more than 120 seconds.
<3>Feb 23 18:43:16 node-10 kernel: [9726568.292842] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:43:16 node-10 kernel: [9726568.301781] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319496] ffff88005d37dae0 0000000000000082 ffff88007135e000 ffff88005d37dfd8
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319505] 0000000000013180 0000000000013180 ffff88007135e000 ffff884fdf253a18
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319510] 0000000000000000 ffff883b9a4e9b80 0000000000000000 ffff88007135e000
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319520] Call Trace:
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319525] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319528] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319534] [<ffffffff8116ecaf>] ? bdi_lock_two+0x2f/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319538] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319541] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319544] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319547] [<ffffffff811f7876>] blkdev_direct_IO+0x56/0x60
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319550] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319553] [<ffffffff811520ab>] generic_file_aio_read+0x69b/0x700
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319575] [<ffffffff811ce322>] ? final_putname+0x22/0x50
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319586] [<ffffffff810f44d2>] ? from_kgid_munged+0x12/0x20
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319589] [<ffffffff811f7cfb>] blkdev_aio_read+0x4b/0x70
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319592] [<ffffffff811bdc0a>] do_sync_read+0x5a/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319596] [<ffffffff811be2a5>] vfs_read+0x95/0x160
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319599] [<ffffffff811bedb9>] SyS_read+0x49/0xa0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.319603] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:43:16 node-10 kernel: [9726568.319606] INFO: task blkid:73404 blocked for more than 120 seconds.
<3>Feb 23 18:43:16 node-10 kernel: [9726568.329084] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:43:16 node-10 kernel: [9726568.338860] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358864] ffff8800631e5960 0000000000000082 ffff883b9c1fc800 ffff8800631e5fd8
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358871] 0000000000013180 0000000000013180 ffff883b9c1fc800 ffff8800631e5aa0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358877] ffff8800631e5aa8 7fffffffffffffff ffff883b9c1fc800 ffff883b9c1fc800
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358883] Call Trace:
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358888] [<ffffffff81728389>] schedule+0x29/0x70
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358891] [<ffffffff817275d9>] schedule_timeout+0x239/0x2d0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358895] [<ffffffff810985ed>] ? ttwu_do_activate.constprop.74+0x5d/0x70
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358899] [<ffffffff8109ab6a>] ? try_to_wake_up+0x1fa/0x2c0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358903] [<ffffffff8137649a>] ? sg_init_table+0x1a/0x40
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358906] [<ffffffff81728ea6>] wait_for_completion+0xa6/0x160
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358910] [<ffffffff8109ac90>] ? wake_up_state+0x20/0x20
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358914] [<ffffffff81084dbd>] flush_work+0xed/0x1b0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358918] [<ffffffff81081060>] ? wake_up_worker+0x30/0x30
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358921] [<ffffffff81084f82>] __cancel_work_timer+0x92/0x1a0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358926] [<ffffffff8149c972>] ? kobj_lookup+0x112/0x170
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358931] [<ffffffff81348400>] ? disk_map_sector_rcu+0x80/0x80
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358935] [<ffffffff810850c3>] cancel_delayed_work_sync+0x13/0x20
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358938] [<ffffffff8134a2f0>] disk_block_events+0x80/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358941] [<ffffffff811f898b>] __blkdev_get+0x5b/0x4c0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358944] [<ffffffff811f8fb5>] blkdev_get+0x1c5/0x340
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358947] [<ffffffff811f91db>] blkdev_open+0x5b/0x80
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358951] [<ffffffff811bb8f3>] do_dentry_open+0x233/0x2e0
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358954] [<ffffffff811f9180>] ? blkdev_get_by_dev+0x50/0x50
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358957] [<ffffffff811bbc29>] vfs_open+0x49/0x50
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358960] [<ffffffff811ccfd4>] do_last+0x564/0x1240
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358963] [<ffffffff811caa91>] ? link_path_walk+0x71/0x880
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358968] [<ffffffff8131618b>] ? apparmor_file_alloc_security+0x5b/0x180
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358972] [<ffffffff811cdd6b>] path_openat+0xbb/0x650
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358975] [<ffffffff811ce322>] ? final_putname+0x22/0x50
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358978] [<ffffffff811ce529>] ? putname+0x29/0x40
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358981] [<ffffffff811cf07f>] ? user_path_at_empty+0x5f/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358985] [<ffffffff811cf16a>] do_filp_open+0x3a/0x90
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358988] [<ffffffff811dbfe7>] ? __alloc_fd+0xa7/0x130
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358992] [<ffffffff811bd749>] do_sys_open+0x129/0x280
<4>Feb 23 18:43:16 node-10 kernel: [9726568.358995] [<ffffffff811bd8be>] SyS_open+0x1e/0x20
<4>Feb 23 18:43:16 node-10 kernel: [9726568.359009] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:45:16 node-10 kernel: [9726688.428529] INFO: task qemu-system-x86:32990 blocked for more than 120 seconds.
<3>Feb 23 18:45:16 node-10 kernel: [9726688.450582] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:45:16 node-10 kernel: [9726688.461652] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484149] ffff880071569a88 0000000000000082 ffff881396a51800 ffff880071569fd8
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484159] 0000000000013180 0000000000013180 ffff881396a51800 ffff8827df833a18
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484166] 0000000000000000 ffff881363955000 0000000000000000 ffff881396a51800
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484172] Call Trace:
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484181] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484188] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484192] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484197] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484200] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484204] [<ffffffff811f7876>] blkdev_direct_IO+0x56/0x60
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484207] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484211] [<ffffffff81150cd1>] generic_file_direct_write+0xc1/0x180
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484215] [<ffffffff81151095>] __generic_file_aio_write+0x305/0x3d0
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484219] [<ffffffff811f8156>] blkdev_aio_write+0x46/0x90
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484225] [<ffffffff811bdc9a>] do_sync_write+0x5a/0x90
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484229] [<ffffffff811be424>] vfs_write+0xb4/0x1f0
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484233] [<ffffffff811befd2>] SyS_pwrite64+0x72/0xb0
<4>Feb 23 18:45:16 node-10 kernel: [9726688.484237] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:45:16 node-10 kernel: [9726688.484241] INFO: task qemu-system-x86:38844 blocked for more than 120 seconds.
<3>Feb 23 18:45:16 node-10 kernel: [9726688.504971] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:45:16 node-10 kernel: [9726688.515485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536411] ffff880056dd1ad8 0000000000000082 ffff880033bd6000 ffff880056dd1fd8
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536419] 0000000000013180 0000000000013180 ffff880033bd6000 ffff8827df833a18
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536428] 0000000000000000 ffff881363956180 0000000000000000 ffff880033bd6000
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536434] Call Trace:
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536440] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536451] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536456] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536460] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536468] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536473] [<ffffffff811f7876>] blkdev_direct_IO+0x56/0x60
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536476] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536479] [<ffffffff811520ab>] generic_file_aio_read+0x69b/0x700
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536486] [<ffffffff8108e720>] ? hrtimer_get_res+0x50/0x50
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536489] [<ffffffff811f7cfb>] blkdev_aio_read+0x4b/0x70
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536493] [<ffffffff811bdc0a>] do_sync_read+0x5a/0x90
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536497] [<ffffffff811be2a5>] vfs_read+0x95/0x160
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536501] [<ffffffff811bef22>] SyS_pread64+0x72/0xb0
<4>Feb 23 18:45:16 node-10 kernel: [9726688.536505] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:45:16 node-10 kernel: [9726688.536508] INFO: task qemu-system-x86:48268 blocked for more than 120 seconds.
<3>Feb 23 18:45:16 node-10 kernel: [9726688.557487] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:45:16 node-10 kernel: [9726688.568196] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589147] ffff88002fca59d0 0000000000000082 ffff8813993f9800 ffff88002fca5fd8
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589154] 0000000000013180 0000000000013180 ffff8813993f9800 ffff8827df833a18
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589160] 0000000000000000 ffff881363954d80 0000000000000000 ffff8813993f9800
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589167] Call Trace:
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589172] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589176] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589181] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589184] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589187] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589190] [<ffffffff811f7876>] blkdev_direct_IO+0x56/0x60
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589193] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589197] [<ffffffff81150cd1>] generic_file_direct_write+0xc1/0x180
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589200] [<ffffffff81151095>] __generic_file_aio_write+0x305/0x3d0
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589204] [<ffffffff811f8156>] blkdev_aio_write+0x46/0x90
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589209] [<ffffffff811bdd1c>] do_sync_readv_writev+0x4c/0x80
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589212] [<ffffffff811bf1e0>] do_readv_writev+0xb0/0x220
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589217] [<ffffffff810db67e>] ? do_futex+0xde/0x760
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589221] [<ffffffff811bf3d0>] vfs_writev+0x30/0x60
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589225] [<ffffffff811bf6f2>] SyS_pwritev+0xa2/0xd0
<4>Feb 23 18:45:16 node-10 kernel: [9726688.589229] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<3>Feb 23 18:45:16 node-10 kernel: [9726688.589232] INFO: task qemu-system-x86:22338 blocked for more than 120 seconds.
<3>Feb 23 18:45:16 node-10 kernel: [9726688.610442] Tainted: G OX 3.13.0-65-generic #105-Ubuntu
<3>Feb 23 18:45:16 node-10 kernel: [9726688.621348] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642030] ffff880049a65d80 0000000000000082 ffff88003ab58000 ffff880049a65fd8
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642037] 0000000000013180 0000000000013180 ffff88003ab58000 ffff880049a65ea8
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642043] ffff880049a65eb0 7fffffffffffffff ffff88003ab58000 00007f2b6bbd7700
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642049] Call Trace:
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642055] [<ffffffff81728389>] schedule+0x29/0x70
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642058] [<ffffffff817275d9>] schedule_timeout+0x239/0x2d0
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642064] [<ffffffff81339b43>] ? __blk_run_queue+0x33/0x40
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642067] [<ffffffff8133d8b3>] ? blk_queue_bio+0x273/0x360
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642071] [<ffffffff81728ea6>] wait_for_completion+0xa6/0x160
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642076] [<ffffffff8109ac90>] ? wake_up_state+0x20/0x20
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642080] [<ffffffff811f4bee>] submit_bio_wait+0x5e/0x70
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642083] [<ffffffff8133f5ca>] blkdev_issue_flush+0x5a/0x90
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642087] [<ffffffff811f7425>] blkdev_fsync+0x35/0x50
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642092] [<ffffffff811ee551>] do_fsync+0x51/0x80
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642095] [<ffffffff811ee803>] SyS_fdatasync+0x13/0x20
<4>Feb 23 18:45:16 node-10 kernel: [9726688.642098] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
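
The same tasks are reported again at each 120-second interval, so it can help to reduce a log like the one above to the distinct blocked tasks. The following is a minimal sketch, assuming the log uses the "INFO: task NAME:PID blocked for more than N seconds" header shown above; the function name summarize_hung_tasks is illustrative, not part of any existing tool.

```python
import re
from collections import defaultdict

# Matches the hung-task header line printed by the kernel, as seen in kern.log.
HUNG_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(lines):
    """Return {task_name: sorted list of distinct PIDs} for hung-task reports."""
    pids_by_task = defaultdict(set)
    for line in lines:
        match = HUNG_RE.search(line)
        if match:
            pids_by_task[match.group("name")].add(int(match.group("pid")))
    return {name: sorted(pids) for name, pids in pids_by_task.items()}
```

Passing an open file handle for kern.log as `lines` works, since the function only iterates over lines. Repeated reports for the same PID collapse into a single entry.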

Device /dev/sdb is unreadable (dd if=/dev/sdb ... hangs).
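
A read probe like this can be run without hanging the checking script itself. The sketch below is one possible approach, not something taken from the affected environment: the helper finishes_within and the exact dd arguments in PROBE_CMD are assumptions. It polls the child instead of waiting on it, because a process stuck in uninterruptible sleep (state D) does not act on SIGKILL until the I/O completes.

```python
import subprocess
import time


def finishes_within(cmd, timeout_s):
    """Run cmd and report whether it exits within timeout_s seconds.

    The child is polled rather than waited on: a blocking wait after kill()
    would hang the caller too if the child is stuck in state D.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    deadline = time.monotonic() + timeout_s
    while proc.poll() is None:
        if time.monotonic() >= deadline:
            proc.kill()  # best effort; has no effect while the child is in state D
            return False
        time.sleep(0.05)
    return True


# Hypothetical probe: read a single 4 KiB block with O_DIRECT (GNU dd),
# so the read goes to the device instead of the page cache.
PROBE_CMD = ["dd", "if=/dev/sdb", "of=/dev/null", "bs=4096", "count=1", "iflag=direct"]
```

Calling `finishes_within(PROBE_CMD, 10)` as root would return False on a node in the state described here, while leaving the caller responsive.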

If the cinder-volume service is restarted, the qemu process becomes a zombie.

If the tgt service is restarted, the qemu process is killed, but the sdb device (attached via iSCSI) disappears and cannot be reattached via iscsiadm.

lsmod reports a use count of 1 for the scsi_tgt module, but lists no module as using it.
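
A nonzero use count with an empty "Used by" list generally means the reference is held by something other than a dependent module. The sketch below shows how such modules could be picked out of lsmod output; it assumes the usual "Module Size Used by" column layout, and the function name unnamed_holders is illustrative.

```python
def unnamed_holders(lsmod_text):
    """Return {module: use_count} for modules with a nonzero use count
    and no module names listed in the "Used by" column."""
    result = {}
    for line in lsmod_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        # Expected columns: name, size, use count, optional comma-separated users.
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        use_count = int(fields[2])
        has_named_users = len(fields) > 3
        if use_count > 0 and not has_named_users:
            result[fields[0]] = use_count
    return result
```

Feeding it the output of `lsmod` from an affected node would list scsi_tgt if the situation described above applies.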

The issue occurs randomly, on different environments that were all built from the same template.

Changed in fuel:
importance: Undecided → Critical
tags: added: customer-found support
Dmitry Sutyagin (dsutyagin) wrote :

The issue has already happened multiple times and causes a (partial) workload failure/outage, so I've bumped up the importance.

Denis Meltsaykin (dmeltsaykin) wrote :

Please provide steps to reproduce.

Changed in fuel:
assignee: nobody → Roman Safonov (rsafonov)
status: New → Incomplete
Roman Safonov (rsafonov) wrote :

I have added logs from the most recent failure to the bug description; please take a look.

description: updated
Roman Safonov (rsafonov) wrote :

The most recent failure differs slightly from the earlier ones. In the previous cases, all errors in /var/log/messages were related to the qemu process:

<6>Jan 16 01:13:32 node-7 kernel: [7064341.531377] qemu-system-x86 D ffff8827df833180 0 46940 1 0x00000000
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531554] ffff880021f0fa88 0000000000000082 ffff88139c8b3000 ffff880021f0ffd8
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531736] 0000000000013180 0000000000013180 ffff88139c8b3000 ffff8827df833a18
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531742] 0000000000000000 ffff88132b196180 0000000000000000 ffff88139c8b3000
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531749] Call Trace:
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531760] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531767] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531772] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531775] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531778] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531782] [<ffffffff811f7876>] blkdev_direct_IO+0x56/0x60
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531785] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531790] [<ffffffff81150cd1>] generic_file_direct_write+0xc1/0x180
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531796] [<ffffffff8106b882>] ? current_fs_time+0x12/0x60
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531800] [<ffffffff81151095>] __generic_file_aio_write+0x305/0x3d0
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531804] [<ffffffff811f8156>] blkdev_aio_write+0x46/0x90
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531811] [<ffffffff811bdc9a>] do_sync_write+0x5a/0x90
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531816] [<ffffffff811be424>] vfs_write+0xb4/0x1f0
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531819] [<ffffffff811befd2>] SyS_pwrite64+0x72/0xb0
<4>Jan 16 01:13:32 node-7 kernel: [7064341.531824] [<ffffffff81734b5d>] system_call_fastpath+0x1a/0x1f
<6>Jan 16 01:13:32 node-7 kernel: [7064341.558032] qemu-system-x86 D ffff8827df833180 0 26526 1 0x00000000
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558036] ffff88356cf77a88 0000000000000082 ffff8813985f4800 ffff88356cf77fd8
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558044] 0000000000013180 0000000000013180 ffff8813985f4800 ffff8827df833a18
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558050] 0000000000000000 ffff88132b192300 0000000000000000 ffff8813985f4800
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558056] Call Trace:
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558062] [<ffffffff817286ad>] io_schedule+0x9d/0x140
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558066] [<ffffffff811fc314>] do_blockdev_direct_IO+0x1ce4/0x2910
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558071] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558074] [<ffffffff811fcf95>] __blockdev_direct_IO+0x55/0x60
<4>Jan 16 01:13:32 node-7 kernel: [7064341.558078] [<ffffffff811f7180>] ? I_BDEV+0x10/0x10
<4>J...
