Comment 4 for bug 803865

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

I believe this is an issue with 64-bit CAS implementation on IA32 systems. Here is the relevant code from include/atomic/x86_gcc.h:

  asm volatile ("push %%ebx;"
                "movl (%%ecx), %%ebx;"
                "movl 4(%%ecx), %%ecx;"
                LOCK_prefix "; cmpxchg8b %0;"
                "setz %2; pop %%ebx"
                : "=m" (*a), "+A" (*cmp), "=c" (ret)
                : "c" (&set), "m" (*a)
                : "memory", "esp")

There are following issues that make it easy for GCC to break this code by optimization:
- The zero-th operand is marked as output-only, but it is an input-output one: CMPXCHG8B reads the memory pointed to it, and also sets EDX:EAX if the operand and EDX:EAX are non-equal. It is also an output operand: the instruction sets it to ECX:EBX.
- The code probably tried to say that zero-th operand is also an input one by specifying it as the 4th operand ("m" (*a)), but this fails, because GCC does not guarantee putting the input and output operands referenced by the same C expression at the same location. Also this operand is not referenced in the assembly at all.

While we are at it, there are following issues that make the resulting code suboptimal:
- "volatile" is not needed as all the output operands are specified fully (i.e. no undescribed side-effects in the assembly);
- Saving EBX in the code is not required, just tell GCC that the code clobbers EBX and GCC will handle this;
- No need to force using ECX for input &set and output ret;
- Clobber "memory" is redundant as the code does not access memory in a way undescribed by constraints;
- Clobber "esp", I fail to understand its point at all.

The rewritten code:

    int32 val_for_ebx= (set & 0xFFFFFFFF);
    int32 val_for_ecx= set >> 32;
    asm (LOCK_prefix "; cmpxchg8b %0;"
         "setz %1"
         : "+m" (*a), "=g" (ret), "+A" (*cmp)
         : "b" (val_for_ebx), "c" (val_for_ecx)
         : "flags")

Note that this is similar to
http://bugs.mysql.com/bug.php?id=52419 and
http://lists.mysql.com/commits/118512

but in fact the original bug there was that ebx operand had a wrong constraint (memory, failing because it is stack pointer-relative) when it should have been directly EBX register.

Before and after dissasambly (note the new version is shorter):

   0x08231070 <+219>: mov %esi,-0x30(%ebp)
   0x08231073 <+222>: mov %ebx,-0x2c(%ebp)
   0x08231076 <+225>: mov %edi,%ebx
   0x08231078 <+227>: mov (%edi),%eax
   0x0823107a <+229>: mov 0x4(%edi),%edx
   0x0823107d <+232>: lea -0x20(%ebp),%ecx
   0x08231080 <+235>: mov %ecx,-0x3c(%ebp)
   0x08231083 <+238>: mov -0x30(%ebp),%esi
   0x08231086 <+241>: mov -0x2c(%ebp),%edi
   0x08231089 <+244>: add %eax,%esi
   0x0823108b <+246>: adc %edx,%edi
   0x0823108d <+248>: mov %esi,-0x20(%ebp)
   0x08231090 <+251>: mov %edi,-0x1c(%ebp)
   0x08231093 <+254>: mov -0x3c(%ebp),%ecx
   0x08231096 <+257>: push %ebx
   0x08231097 <+258>: mov (%ecx),%ebx
   0x08231099 <+260>: mov 0x4(%ecx),%ecx
=> 0x0823109c <+263>: lock cmpxchg8b (%ebx)
   0x082310a0 <+267>: sete %cl
   0x082310a3 <+270>: pop %ebx

After:

   0x08231062 <+205>: mov %esi,-0x28(%ebp)
   0x08231065 <+208>: mov %ebx,-0x24(%ebp) ; Save EBX
   0x08231068 <+211>: mov %edi,%esi
   0x0823106a <+213>: mov (%edi),%eax
   0x0823106c <+215>: mov 0x4(%edi),%edx
   0x0823106f <+218>: mov -0x28(%ebp),%ecx
   0x08231072 <+221>: mov -0x24(%ebp),%ebx ; Restore EBX
   0x08231075 <+224>: add %eax,%ecx
   0x08231077 <+226>: adc %edx,%ebx
   0x08231079 <+228>: mov %ecx,-0x20(%ebp)
   0x0823107c <+231>: mov %ebx,-0x1c(%ebp)
   0x0823107f <+234>: mov %ecx,%ebx
   0x08231081 <+236>: mov -0x1c(%ebp),%ecx
   0x08231084 <+239>: lock cmpxchg8b (%esi)
   0x08231088 <+243>: sete %bl