Comment 9 for bug 803865

Laurynas Biveinis (laurynas-biveinis) wrote :

Davi -

The GCC manual (http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Extended-Asm.html#Extended-Asm) describes the "memory" clobber as follows: "If your assembler instructions access memory in an unpredictable fashion, add `memory' to the list of clobbered registers. This will cause GCC to not keep memory values cached in registers across the assembler instruction and not optimize stores or loads to that memory." This implies a memory barrier, but the key phrase is "unpredictable fashion". When the inputs and outputs are completely described (i.e. no unpredictable pointer dereferences) and the variables in question are marked volatile, I don't think anything extra is achieved by adding the "memory" clobber.
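To illustrate the distinction, here is a minimal sketch (not the code from this bug) of an asm whose only memory access is fully described by a "+m" operand, so no "memory" clobber is listed. The names add_described and demo_add_described are hypothetical, and the non-x86 fallback exists only so the sketch compiles on other targets.

```c
#include <assert.h>

/* Hypothetical helper: adds v to *p. The memory that the instruction
   touches is listed as a read-write operand ("+m"), so the compiler
   knows exactly what changes and no "memory" clobber is listed. */
static int add_described(volatile int *p, int v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("addl %1, %0"
             : "+m" (*p)      /* operand 0: memory, read and written */
             : "ir" (v)       /* operand 1: register or immediate */
             : "cc");         /* addl modifies the flags */
#else
    *p += v;                  /* fallback for non-x86 targets (assumption) */
#endif
    return *p;
}

/* Small self-check: the update must be visible after the asm. */
static int demo_add_described(void)
{
    volatile int counter = 3;
    int result = add_described(&counter, 5);
    assert(counter == 8);
    return result;
}
```

Because the modified location appears in the operand list, GCC has no reason to keep a stale copy of it in a register across the asm.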

Re. volatile, the next sentence in the manual reads "You will also want to add the volatile keyword if the memory affected is not listed in the inputs or outputs of the asm, as the `memory' clobber does not count as a side-effect of the asm." Again, this matters if the memory is not listed in the inputs/outputs, and not if it is. Moreover, the manual says "If an asm has output operands, GCC assumes for optimization purposes the instruction has no side effects except to change the output operands. This does not mean instructions with a side effect cannot be used, but you must be careful, because the compiler may eliminate them if the output operands aren't used, or move them out of loops, or replace two with one if they constitute a common subexpression. [...] You can prevent an asm instruction from being deleted by writing the keyword volatile after the asm." This implies that volatile feeds into data flow analysis (DF) decisions, which again shouldn't matter if the asm inputs and outputs are fully described, since the effect of the asm is then visible to DF anyway.
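For contrast, a minimal sketch of the case the manual is warning about: the asm writes through a pointer that is passed only as a register input, so the affected memory is not listed in the operands. Here both volatile and the "memory" clobber are written out. The names store_via_pointer and demo_store_via_pointer are hypothetical, and the non-x86 fallback is an assumption so that the sketch compiles elsewhere. As far as I know GCC treats an asm with no output operands as implicitly volatile, so the keyword is spelled out here mainly for clarity.

```c
#include <assert.h>

/* Hypothetical helper: stores v to *p, but only the pointer value is
   passed to the asm. The written memory is NOT an operand, so the
   "memory" clobber tells GCC that memory may have changed. */
static void store_via_pointer(int *p, int v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile ("movl %1, (%0)"
                      :                    /* no outputs */
                      : "r" (p), "r" (v)   /* address and value in registers */
                      : "memory");         /* memory written behind GCC's back */
#else
    *p = v;                                /* fallback for non-x86 targets */
#endif
}

/* Small self-check: the store must be observed after the asm. */
static int demo_store_via_pointer(void)
{
    int slot = 0;
    store_via_pointer(&slot, 42);
    assert(slot == 42);
    return slot;
}
```

If the store were instead described with an "=m" (*p) output, the situation would match the fully described case discussed above.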

In any case, "memory" and "volatile" do not hurt even when they are not necessary; what actually hurts are the description errors for the 0th and 4th operands covered in my analysis above.

Regarding my other points above, I was wrong about EBX: specifying that it is clobbered will not stop GCC from using it, and quite often it will cause GCC errors about impossible reloads. I now also see the point about specifying the ESP clobber, although after the EBX experience I'm not sure whether GCC honors it or whether it just works silently because GCC (as opposed to ICC) does not use ESP much for addressing anyway.

The final code we are going with is in the SO answer. I will submit a bug report upstream once we test it some more and are confident in it. GCC inline asm is tricky and even after "best efforts" I do not claim to have a full understanding of it.