I've looked into this on ThunderX2. The arm64 code generated for the
atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
memory barriers. It is just plain ldaxr/stlxr.
From my understanding this is not sufficient for SMP sync.
If I read this comment correct:
void aio_notify(AioContext *ctx)
{
/* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
* with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
*/
smp_mb();
if (ctx->notify_me) {
it points out that the smp_mb() should be paired. But as
I said the used atomics don't generate any barriers at all.
I've tried to verify me theory with this patch and didn't run into the
issue for ~500 iterations (usually I would trigger the issue ~20 iterations).
I've looked into this on ThunderX2. The arm64 code generated for the
atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
memory barriers. It is just plain ldaxr/stlxr.
From my understanding this is not sufficient for SMP sync.
If I read this comment correct:
void aio_notify( AioContext *ctx)
{
/* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
* with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
*/
smp_mb();
if (ctx->notify_me) {
it points out that the smp_mb() should be paired. But as
I said the used atomics don't generate any barriers at all.
I've tried to verify me theory with this patch and didn't run into the
issue for ~500 iterations (usually I would trigger the issue ~20 iterations).
--Jan
diff --git a/util/aio-posix.c b/util/aio-posix.c .d07dcd4e9993 100644
atomic_ add(&ctx- >notify_ me, 2);
index d8f0cb4af8dd.
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -591,6 +591,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
*/
if (blocking) {
+ smp_mb();
}
qemu_ lockcnt_ inc(&ctx- >list_lock) ;
@@ -632,6 +633,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
if (blocking) {
atomic_ sub(&ctx- >notify_ me, 2);
+ smp_mb();
}
/* Adjust polling time */ .92ac209c4615 100644 prepare( GSource *source, gint *timeout)
diff --git a/util/async.c b/util/async.c
index 4dd9d95a9e73.
--- a/util/async.c
+++ b/util/async.c
@@ -222,6 +222,7 @@ aio_ctx_
AioContext *ctx = (AioContext *) source;
atomic_ or(&ctx- >notify_ me, 1);
+ smp_mb();
/* We assume there is no timeout already supplied */ ns_to_ms( aio_compute_ timeout( ctx)); check(GSource *source)
*timeout = qemu_timeout_
@@ -240,6 +241,7 @@ aio_ctx_
QEMUBH *bh;
atomic_ and(&ctx- >notify_ me, ~1); notify_ accept( ctx);
+ smp_mb();
aio_
for (bh = ctx->first_bh; bh; bh = bh->next) {