GCC ICE: in create_fix_barrier, at config/arm/arm.c:17891

Bug #1953128 reported by Simon Chopin
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
gcc
Confirmed
Medium
gcc-11 (Ubuntu)
New
Undecided
Unassigned
postgresql-14 (Ubuntu)
Fix Released
High
Christian Ehrhardt 

Bug Description

This ICE was hit when building Postgresql-14 on GCC 11.2 on armhf

A similar issue has been found in Fedora when building qemu:
https://<email address hidden>/thread/GD3ABSWD6HHTNEKV2EJY4PXABQ245UCZ/

The issue is tracked upstream at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103395

Tags: patch
Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Since GCC 5 (r208990 still works fine, r215478 ICEs already up to current trunk)
the following testcase ICEs on armv7hl-linux-gnueabi with -O2:
void
foo (void)
{
  __asm__ ("\n\n\n\n\n\n\n\n\n\n"
    "\n\n\n\n\n\n\n\n\n\n"
    "\n\n\n\n\n\n\n\n\n\n"
    "\n\n\n\n\n\n\n\n\n\n"
    "\n\n\n\n\n\n\n\n\n\n"
    "\n\n\n\n\n\n\n\n\n\n"
           "\n" : : "nor" (0), "nor" (0));
}
Removing just one newline makes the ICE go away. get_attr_length for such inline asm is 248 bytes (estimation) and the arm minipool code is apparently upset about it.

Revision history for this message
In , Richard Jones (rjones-redhat) wrote :

Nice reproducer!

Here's the original thread where the bug was reported when compiling qemu on Fedora Rawhide for armv7: https://<email address hidden>/thread/GD3ABSWD6HHTNEKV2EJY4PXABQ245UCZ/

Revision history for this message
In , Pinskia (pinskia) wrote :

Confirmed.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Note, in qemu it isn't just those newlines but simply a long inline asm.
It can very well be something that is empty or very short in the .text section,
e.g. can .pushsection; .section whatever; put lots of stuff in there; .popsection.
But as inline asm is a black box to gcc, we have only very rough estimation for the inline asm text length, which just counts how many newlines and typically ; characters there are and multiple that by some insn size.

Revision history for this message
In , Rearnsha (rearnsha) wrote :

I suspect the best we're likely to be able to do is to downgrade the ICE into an error if there's a long inline ASM in the code, much like the way we handle invalid constraints in inline ASMs.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

That will still mean qemu will not compile.
I admit I know next to nothing about the arm minipool stuff, but can't whatever needs to be inserted be inserted either before or after the inline asm, if it is needed for the asm inputs then likely before that?

Revision history for this message
In , Rearnsha (rearnsha) wrote :

It depends on the the reference. Some minipool references can only be forwards due to the nature of the instruction making the reference. I'll need to take a look, and I might also need a real testcase, not just the reduced one below.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Created attachment 51867
qemuice.i.xz

Unreduced preprocessed source.

Revision history for this message
In , Rearnsha (rearnsha) wrote :

OK, so the real problem in the test case is the constraint ("nor") is meaningless on Arm (because there is no instruction in the architecture that can accept both a constant and a memref) and this confuses the minipool code because it exploits this restriction to detect insns that need to be reworked by the md_reorg pass.

When processing an ASM we allow only a forward literal pool reference and it must be less than 250 bytes beyond the /start/ of the pattern (because we don't know where in the asm it gets used). So we have to deduct from that range 4 bytes for every asm statement: add too many lines to the asm and we reach the point where it is impossible to place the literal pool even directly after the asm.

So I think really this is an ICE on invalid, because the constraint is not meaningful on Arm.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

CCing Frank as this is systemtap sys/sdt.h
which has:
# ifndef STAP_SDT_ARG_CONSTRAINT
# if defined __powerpc__
# define STAP_SDT_ARG_CONSTRAINT nZr
# else
# define STAP_SDT_ARG_CONSTRAINT nor
# endif
# endif

All of n, o and r are generic constraints and const0_rtx is valid for the "n" constraint, so why is the backend trying to put it into memory at all?
What is systemtap trying to do is not use those operands in any instruction, but note for the debugger how to find out the value of the asm input operand (read some register, some memory or the immediate constant).

Revision history for this message
In , Rearnsha (rearnsha) wrote :

It's been this way now for over 20 years. A single instruction cannot take a constant and a memory for the same operand, so this has been used in the backend to trigger the minipool generation: a constant in an operand that takes a memory triggers a minipool spill to be pushed.

If we changed this now we risk breaking existing inline asms that exploit this feature (good or bad) and expect a constant to be pushed into a minipool entry.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Inline asm that only works with memory but in constraints says it accepts both immediate constant and memory is IMNSHO just broken, it is just fine if the compiler makes a different choice.
If "nor" with constant input on arm has meant actually just "or", then sure, systemtap could be changed and after a couple of years it will propagate to all stap copies used in the wild, but it is quite severe misoptimization of one of the most common cases. The systemtap macros don't really know what argument will be passed to them, whether a constant, something that lives in memory, something that lives in a register and ideally wants as few actual instructions before those macros as possible to arrange the arguments so that debugger can inspect them.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Note, sys/sdt.h is using the "nor" constraints for 11 years already.

Revision history for this message
In , Fw-6 (fw-6) wrote :

Maybe it's possible to provide specific, architecture-independent constraints for Systemtap-like use cases?

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

If it can be proven that all gcc versions until now treat "nor" constraint as ignoring the n in there and pushing all constants into constant pool, I think it could be changed into "or" for arm32. But it would be IMNSHO unnecessary pessimization (but it could e.g. be done for GCC < 12 or whenever this would be fixed).
Another option is to tweak whatever generates those large inline asms.
In the qemu case it is created with
/usr/bin/python3 ../scripts/tracetool.py --backend=dtrace --group=util --format=h /builddir/build/BUILD/qemu-6.1.0/util/trace-events trace/trace-util.h
whatever that is (but that means I haven't actually seen what it generates).
Note, apparently several other packages are affected, so not sure what changed recently in systemtap-sdt-devel or whatever else that adds up to this.
In the preprocessed source I got I see several blocks of
   ".ascii \"\\x20\""
   "\n"
   "_SDT_SIGN %n[_SDT_S4]"
   "\n"
   "_SDT_SIZE %n[_SDT_S4]"
   "\n"
   "_SDT_TYPE %n[_SDT_S4]"
   "\n"
   ".ascii \"%[_SDT_A4]\""
   "\n"
It already uses assembler macros, perhaps adding a macro to do all 3 at once or perhaps with something extra could bring the number of newlines down...

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Apparently the change on the systemtap side was:
https://sourceware.org/git/?p=systemtap.git;a=commit;f=includes/sys/sdt.h;h=eaa15b047688175a94e3ae796529785a3a0af208
which indeed adds a lot of newlines to the inline asms.
But when already using gas macros, I wonder if all that
#define _SDT_ASM_TEMPLATE_1 _SDT_ARGFMT(1)
#define _SDT_ASM_TEMPLATE_2 _SDT_ASM_TEMPLATE_1 _SDT_ASM_BLANK _SDT_ARGFMT(2)
#define _SDT_ASM_TEMPLATE_3 _SDT_ASM_TEMPLATE_2 _SDT_ASM_BLANK _SDT_ARGFMT(3)
#define _SDT_ASM_TEMPLATE_4 _SDT_ASM_TEMPLATE_3 _SDT_ASM_BLANK _SDT_ARGFMT(4)
#define _SDT_ASM_TEMPLATE_5 _SDT_ASM_TEMPLATE_4 _SDT_ASM_BLANK _SDT_ARGFMT(5)
#define _SDT_ASM_TEMPLATE_6 _SDT_ASM_TEMPLATE_5 _SDT_ASM_BLANK _SDT_ARGFMT(6)
#define _SDT_ASM_TEMPLATE_7 _SDT_ASM_TEMPLATE_6 _SDT_ASM_BLANK _SDT_ARGFMT(7)
#define _SDT_ASM_TEMPLATE_8 _SDT_ASM_TEMPLATE_7 _SDT_ASM_BLANK _SDT_ARGFMT(8)
#define _SDT_ASM_TEMPLATE_9 _SDT_ASM_TEMPLATE_8 _SDT_ASM_BLANK _SDT_ARGFMT(9)
#define _SDT_ASM_TEMPLATE_10 _SDT_ASM_TEMPLATE_9 _SDT_ASM_BLANK _SDT_ARGFMT(10)
#define _SDT_ASM_TEMPLATE_11 _SDT_ASM_TEMPLATE_10 _SDT_ASM_BLANK _SDT_ARGFMT(11)
#define _SDT_ASM_TEMPLATE_12 _SDT_ASM_TEMPLATE_11 _SDT_ASM_BLANK _SDT_ARGFMT(12)
couldn't be rewritten into use of another .macro _SDT_ASM_TEMPLATE that just takes emits it all.
See the
             .macro sum from=0, to=5
             .long \from
             .if \to-\from
             sum "(\from+1)",\to
             .endif
             .endm
macro from gas documentation for inspiration.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Note, the %n[_SDT_S##no] in there need to stay (dunno about the
_SDT_ASM_SUBSTR(_SDT_ARGTMPL(_SDT_A##no)) stuff), but that could be achieved
by giving the macro from, to, arg, args:vararg arguments and use it like:
  _SDT_ASM_TEMPLATE 1, 4, %n[_SDT_S1], %n[_SDT_S2], %n[_SDT_S3], %n[_SDT_S4]

Revision history for this message
In , Daniel Berrange (berrange) wrote :

FYI, I've opened a bug against systemtap in Fedora to track this problem on that side too https://bugzilla.redhat.com/show_bug.cgi?id=2026858

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Note that we document how the size of asm is estimated:
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
and unfortunately asm inline ("..." ...) makes the size estimation 0 only for inlining purposes and not for others too.
So, for systemtap it is still desirable to use as few newlines/; characters in the pattern as possible. If one macro is used to handle 1-12 operands through recursions, one way to save a few newlines would be avoid all those other 5 macros, each one is used just once for each parameter and avoiding their .macro, .endm and .purgem lines will get rid of 15 lines...
Similarly, not doing .pushsection/.popsection 3 times for each argument but just once would help etc.

Revision history for this message
Simon Chopin (schopin) wrote :

Workaround for the failure in postgresql-14

tags: added: patch
Revision history for this message
Bryce Harrington (bryce) wrote :

Thanks simon, I've put this on the server team's priority list so that one of us will get to it soonish.

tags: added: server-next
Changed in postgresql-14 (Ubuntu):
importance: Undecided → High
status: New → Triaged
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

https://launchpad.net/ubuntu/+source/postgresql-14/14.1-1ubuntu1 is already in proposed and fine in regard to the build - so I guess this can be "Fix committed".

Furthermore I found that Debian is affected by the very same. The build structure changed a bit, but I submitted it there (to be able to make it a sync again) via:
https://salsa.debian.org/postgresql/postgresql-common/-/merge_requests/15

Due to a centralization of the build structure this then needs the new postgresql-common followed by making postgresql-14 a sync again afterwards.

Changed in postgresql-14 (Ubuntu):
status: Triaged → Fix Committed
assignee: nobody → Christian Ehrhardt  (paelzer)
Revision history for this message
Simon Chopin (schopin) wrote :

Oh yes sorry, I should have notified the bug manually. Thanks for forwarding it to Debian!

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package postgresql-14 - 14.1-1ubuntu1

---------------
postgresql-14 (14.1-1ubuntu1) jammy; urgency=medium

  * d/rules: Work around an ICE in latest GCC for armhf, see
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103395 (LP: #1953128)

 -- Simon Chopin <email address hidden> Thu, 02 Dec 2021 13:37:16 +0100

Changed in postgresql-14 (Ubuntu):
status: Fix Committed → Fix Released
tags: removed: server-next
Revision history for this message
In , Rguenth (rguenth) wrote :

GCC 9 branch is being closed

Revision history for this message
Lucas Kanashiro (lucaskanashiro) wrote :

I do not see a needed fix for gcc as part of this bug work. I am setting the gcc task to invalid. Please, correct me if I am wrong.

Changed in gcc-11 (Ubuntu):
status: New → Invalid
Revision history for this message
Lucas Kanashiro (lucaskanashiro) wrote :

Sergio checked this out and the gcc task is still valid, so reverting what I did, getting it back to New.

Changed in gcc-11 (Ubuntu):
status: Invalid → New
Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

GCC 10.4 is being released, retargeting bugs to GCC 10.5.

Revision history for this message
In , Rguenth (rguenth) wrote :

GCC 10 branch is being closed.

Changed in gcc:
importance: Unknown → Medium
status: Unknown → Confirmed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.