Hello Fred,
Based on Dann's feedback on testing, I fail to see where your patch fixes the "root" cause (even though it mitigates the issue by changing the aio notification mechanism).
I think the root cause is best described in these two emails from the thread:
https://lore.kernel.org/qemu-devel/20191009080220.GA2905@hc/
and
https://<email address hidden>/
So, by adding ctx->notify_for_convert, it is very likely you worked around the issue by doing what Jan already described: moving the two variables (ctx->list_lock and, in the old case, ctx->notify_me; in your case, ctx->notify_for_convert) out of the same cacheline, which makes the issue "disappear" (as we would eventually do in a workaround patch).
What about the aarch64 issue when both ctx->list_lock and ctx->notify_for_convert are synchronized by the primitives QEMU uses and sit in the same cache line?
Any "workaround" here would try to dodge the same-cacheline situation, but, for upstream, I suppose Paolo wants something else regarding aarch64 ATOMIC_SEQ_CST, as described in this part of the discussion:
https://<email address hidden>/
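To make the "same cacheline" point concrete, here is a minimal sketch of what dodging it looks like at the struct-layout level. It is not QEMU's actual AioContext: the struct names, the plain unsigned field types, and the 64-byte line size are all assumptions for illustration (some aarch64 parts use a different line size). It only shows the kind of layout change a workaround would rely on and says nothing about the memory-ordering question itself.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines; the real value is CPU-specific. */
#define CACHELINE_SIZE 64

/* Hypothetical, simplified stand-in: both fields end up adjacent,
 * so they typically land in the same cache line. */
struct ctx_shared {
    unsigned list_lock;
    unsigned notify_me;
};

/* Hypothetical workaround layout: each field is forced onto its own
 * cache line with C11 alignas. */
struct ctx_split {
    alignas(CACHELINE_SIZE) unsigned list_lock;
    alignas(CACHELINE_SIZE) unsigned notify_me;
};

/* Returns 1 if two byte offsets fall into the same cache-line-sized
 * slot, measured from the start of the struct. */
static inline int same_cacheline(size_t a, size_t b)
{
    return a / CACHELINE_SIZE == b / CACHELINE_SIZE;
}

/* Compile-time checks of the two layouts. */
static_assert(offsetof(struct ctx_shared, notify_me) < CACHELINE_SIZE,
              "adjacent fields sit within the first line-sized slot");
static_assert(offsetof(struct ctx_split, notify_me) == CACHELINE_SIZE,
              "aligned field starts on the next line-sized slot");
```

Calling same_cacheline() on the offsetof() values of the two fields gives 1 for ctx_shared and 0 for ctx_split. If the hang goes away only with the second layout, that would suggest the problem has been hidden rather than fixed.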
Unless I'm missing something. Am I?
Thank you!