On Mon, Oct 07, 2019 at 04:58:30PM +0200, Paolo Bonzini wrote:
> On 07/10/19 16:44, dann frazier wrote:
> > On Mon, Oct 07, 2019 at 01:06:20PM +0200, Paolo Bonzini wrote:
> >> On 02/10/19 11:23, Jan Glauber wrote:
> >>> I've looked into this on ThunderX2. The arm64 code generated for the
> >>> atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
> >>> memory barriers. It is just plain ldaxr/stlxr.
> >>>
> >>> From my understanding this is not sufficient for SMP sync.
> >>>
> >>> If I read this comment correctly:
> >>>
> >>> void aio_notify(AioContext *ctx)
> >>> {
> >>> /* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
> >>> * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
> >>> */
> >>> smp_mb();
> >>> if (ctx->notify_me) {
> >>>
> >>> it points out that the smp_mb() should be paired. But as I said,
> >>> the atomics that are used don't generate any barriers at all.
> >>
> >> Based on the rest of the thread, this patch should also fix the bug:
> >>
> >> diff --git a/util/async.c b/util/async.c
> >> index 47dcbfa..721ea53 100644
> >> --- a/util/async.c
> >> +++ b/util/async.c
> >> @@ -249,7 +249,7 @@ aio_ctx_check(GSource *source)
> >> aio_notify_accept(ctx);
> >>
> >> for (bh = ctx->first_bh; bh; bh = bh->next) {
> >> - if (bh->scheduled) {
> >> + if (atomic_mb_read(&bh->scheduled)) {
> >> return true;
> >> }
> >> }
> >>
> >>
> >> And also the memory barrier in aio_notify can actually be replaced
> >> with a SEQ_CST load:
> >>
> >> diff --git a/util/async.c b/util/async.c
> >> index 47dcbfa..721ea53 100644
> >> --- a/util/async.c
> >> +++ b/util/async.c
> >> @@ -349,11 +349,11 @@ LinuxAioState *aio_get_linux_aio(AioContext *ctx)
> >>
> >> void aio_notify(AioContext *ctx)
> >> {
> >> - /* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
> >> - * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
> >> + /* Using atomic_mb_read ensures that e.g. bh->scheduled is written before
> >> + * ctx->notify_me is read. Pairs with atomic_or in aio_ctx_prepare or
> >> + * atomic_add in aio_poll.
> >> */
> >> - smp_mb();
> >> - if (ctx->notify_me) {
> >> + if (atomic_mb_read(&ctx->notify_me)) {
> >> event_notifier_set(&ctx->notifier);
> >> atomic_mb_set(&ctx->notified, true);
> >> }
> >>
> >>
> >> Would you be able to test these (one by one possibly)?
> >
> > Paolo,
> > I tried them both separately and together on a Hi1620 system, each
> > time it hung in the first iteration. Here's a backtrace of a run with
> > both patches applied:
>
> Ok, not a great start... I'll find myself an aarch64 machine and look
> at it more closely. I'd like the patch to be something we can
> understand and document, since this is probably the second most-used
> memory barrier idiom in QEMU.
>
> Paolo
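To make the pairing concrete, here is a minimal C11 sketch of the
store-buffering idiom the comment in aio_notify describes. The names
and the fetch-add value are illustrative and this is not QEMU's actual
code (QEMU's atomic_* wrappers differ from C11's); it only shows why
both sides need sequentially consistent accesses:

    #include <stdatomic.h>
    #include <stdbool.h>

    static atomic_int notify_me;   /* stands in for ctx->notify_me */
    static atomic_int scheduled;   /* stands in for bh->scheduled  */

    /* waiter side, as in aio_ctx_prepare/aio_poll */
    static bool prepare_to_wait(void)
    {
        /* seq_cst read-modify-write, globally ordered before the
         * load below */
        atomic_fetch_add(&notify_me, 2);
        /* if a notifier already scheduled work, do not block */
        return atomic_load(&scheduled) == 0;
    }

    /* notifier side, as in aio_notify */
    static void notify(void)
    {
        atomic_store(&scheduled, 1);
        /* seq_cst load; with weaker (acquire/release) ordering both
         * threads could miss each other's store, and the waiter
         * would block with work pending */
        if (atomic_load(&notify_me)) {
            /* kick the waiter, e.g. event_notifier_set() */
        }
    }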
I'm still not sure what the actual issue is here, but could it be some
bad interaction between notify_me and list_lock? They are both 4 bytes
and side by side:

address notify_me: 0xaaaadb528aa0 sizeof notify_me: 4
address list_lock: 0xaaaadb528aa4 sizeof list_lock: 4

AFAICS the generated code looks OK (all load/store exclusives are done
with 32-bit size):

   e6c:   885ffc01   ldaxr   w1, [x0]
   e70:   11000821   add     w1, w1, #0x2
   e74:   8802fc01   stlxr   w2, w1, [x0]

...but if I bump notify_me size to uint64_t the issue goes away.
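If the adjacency really matters, a completely untested alternative to
widening the type would be to keep notify_me 32-bit and separate the
two fields by alignment instead. This is only a layout sketch, not a
proposed patch; the 64-byte value is a guess at the cache line size,
and the exclusive reservation granule is implementation-defined and
may be larger:

    #include <stdalign.h>
    #include <stdint.h>

    /* layout sketch only; field names mirror AioContext */
    struct layout_sketch {
        uint32_t notify_me;
        /* push list_lock onto its own cache line */
        alignas(64) uint32_t list_lock;
    };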
BTW, the image file I convert in the testcase is ~20 GB.
--Jan
diff --git a/include/block/aio.h b/include/block/aio.h
index a1d6b9e24939..e8a5ea3860bb 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -83,7 +83,7 @@ struct AioContext {
* Instead, the aio_poll calls include both the prepare and the
* dispatch phase, hence a simple counter is enough for them.
*/
- uint32_t notify_me;
+ uint64_t notify_me;
/* A lock to protect between QEMUBH and AioHandler adders and deleter,
* and to ensure that no callbacks are removed while we're walking and