I can reproduce this problem with qemu.git/master, so it still exists there. I found that when an I/O completes in a worker thread and the thread calls aio_notify to wake up the main loop, it can find that ctx->notify_me has already been cleared to 0 by the main loop in aio_ctx_check, via atomic_and(&ctx->notify_me, ~1). In that case the worker thread does not write the eventfd to notify the main loop. If the following sequence happens, the main loop will hang:
main loop                          worker thread1                           worker thread2
------------------------------------------------------------------------------------------------------------------
qemu_poll_ns                       aio_worker
                                   qemu_bh_schedule(pool->completion_bh)
glib_pollfds_poll
g_main_context_check
aio_ctx_check                                                               aio_worker
atomic_and(&ctx->notify_me, ~1)                                             qemu_bh_schedule(pool->completion_bh)
/* do something for event */
qemu_poll_ns
/* hangs !!!*/
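To make the ordering above concrete, here is a minimal single-threaded sketch that replays it step by step. It is not QEMU code: ModelCtx, the model_* helpers and second_poll_blocks are hypothetical names, the eventfd is reduced to a boolean, and only the steps shown in the diagram are modelled (nothing else the main loop may do between the two polls). It only shows that a notification issued after notify_me has been cleared produces no eventfd write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two pieces of state in the diagram. */
typedef struct {
    atomic_uint notify_me;   /* bit 0 set while the main loop asks to be woken */
    bool eventfd_written;    /* stands in for a pending eventfd write */
} ModelCtx;

static void model_init(ModelCtx *ctx)
{
    atomic_init(&ctx->notify_me, 0);
    ctx->eventfd_written = false;
}

/* Main loop, before blocking: ask workers for a wakeup. */
static void model_prepare(ModelCtx *ctx)
{
    atomic_fetch_or(&ctx->notify_me, 1u);
}

/* Main loop, after poll returns: same operation as
 * atomic_and(&ctx->notify_me, ~1) in the diagram. */
static void model_check(ModelCtx *ctx)
{
    atomic_fetch_and(&ctx->notify_me, ~1u);
}

/* Worker: write the (modelled) eventfd only if the main loop asked for it. */
static void model_notify(ModelCtx *ctx)
{
    if (atomic_load(&ctx->notify_me) != 0) {
        ctx->eventfd_written = true;
    }
}

/* Main loop: consume a pending wakeup; false means poll would block. */
static bool model_poll_wakes(ModelCtx *ctx)
{
    bool woke = ctx->eventfd_written;
    ctx->eventfd_written = false;
    return woke;
}

/* Replays the diagram in program order.
 * Returns true if the second poll would block with no wakeup pending. */
static bool second_poll_blocks(bool worker2_after_clear)
{
    ModelCtx ctx;
    model_init(&ctx);

    model_prepare(&ctx);          /* main loop: about to poll */
    model_notify(&ctx);           /* worker thread1 */
    (void)model_poll_wakes(&ctx); /* first qemu_poll_ns returns */

    if (worker2_after_clear) {
        model_check(&ctx);        /* notify_me cleared first ... */
        model_notify(&ctx);       /* ... so worker thread2 skips the write */
    } else {
        model_notify(&ctx);       /* worker thread2 still sees notify_me set */
        model_check(&ctx);
    }

    return !model_poll_wakes(&ctx); /* second qemu_poll_ns */
}
```

Calling second_poll_blocks(true) replays the order in the diagram and reports that the second poll has nothing to wake it. With second_poll_blocks(false), worker thread2 runs before the clear and the wakeup is delivered.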
As we know, ctx->notify_me is accessed by both the worker threads and the main loop. I think we should add lock protection for ctx->notify_me to prevent this from happening. What do you think?