Ubuntu
mumax3 package

mumax3 test suite fails against glibc 2.38

Bug #2032624 reported by Simon Chopin on 2023-08-22

This bug affects 1 person

Affects	Status	Importance	Assigned to
GLibC	New	Medium	sourceware-bugs #30909
cbmc (Ubuntu)	New	Undecided	Unassigned
cxref (Ubuntu)	New	Undecided	Unassigned
gauche-c-wrapper (Ubuntu)	New	Undecided	Unassigned
glibc (Ubuntu)	In Progress	Medium	Unassigned
mumax3 (Ubuntu)	New	Critical	Unassigned
nvidia-nccl (Ubuntu)	New	Undecided	Unassigned
rocm-hipamd (Ubuntu)	New	Undecided	Unassigned
stdgpu-contrib (Ubuntu)	New	Undecided	Unassigned

Bug Description

The autopkgtests fail with the following error:

921s nvcc -std=c++03 -ccbin=/usr/bin/cuda-gcc --compiler-options -Werror --compiler-options -Wall -Xptxas -O3 -ptx -arch=compute_50 -code=sm_50 copypadmul2.cu -o copypadmul2_50.ptx
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(30): error: identifier "__Float32x4_t" is undefined
922s
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(31): error: identifier "__Float64x2_t" is undefined
922s
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(40): error: identifier "__SVFloat32_t" is undefined
922s
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(41): error: identifier "__SVFloat64_t" is undefined
922s
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(42): error: identifier "__SVBool_t" is undefined

Marking as critical as this blocks the glibc transition.

Simon Chopin (schopin) on 2023-08-22

Changed in glibc (Ubuntu):
importance:	Undecided → Critical
tags:	added: update-excuse
tags:	added: foundations-todo

Mitchell Dzurick (mitchdz) wrote on 2023-08-23:

This commit https://sourceware.org/git/?p=glibc.git;a=commit;h=cd94326a1326c4e3f1ee7a8d0a161cc0bdcaf07e added the file `sysdeps/aarch64/fpu/bits/math-vector.h`.

On a mantic system, the header file is installed at /usr/include/aarch64-linux-gnu/bits/math-vector.h. Before the commit, the aarch64 version did only one thing:
#include <bits/libm-simd-decl-stubs.h>

After the commit, it also defines a few types, such as:

#if __GNUC_PREREQ(9, 0)
# define __ADVSIMD_VEC_MATH_SUPPORTED
typedef __Float32x4_t __f32x4_t;
typedef __Float64x2_t __f64x2_t;
...

Simply commenting out the new types is enough to fix this issue, but completely removing the newly added support for libmvec is not a great idea.

Perhaps nvidia-cuda-toolkit-gcc needs to be rebuilt with support for these types?
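
The guard quoted above can be sketched in isolation to show why a front end that merely claims to be GCC 9 or later trips over it. This is only an illustration, not the glibc header: every `demo_` and `DEMO_` name is hypothetical, and the version comparison is an assumed stand-in for what `__GNUC_PREREQ` does.

```c
#include <assert.h>

/* Hypothetical stand-in for __GNUC_PREREQ(maj, min): true when the
   compiler's *claimed* GCC version is at least want_major.want_minor. */
int demo_gnuc_prereq(int have_major, int have_minor,
                     int want_major, int want_minor)
{
    assert(have_minor >= 0 && want_minor >= 0);
    return ((have_major << 16) + have_minor)
        >= ((want_major << 16) + want_minor);
}

/* Same shape as the guard in the header, using demo_ names so it cannot
   clash with the real one. The branch is chosen from version macros only;
   nothing checks that the built-in types actually exist. */
#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__) \
    && __GNUC__ >= 9
typedef __Float32x4_t demo_f32x4_t;  /* compiler built-in, declared by no header */
typedef __Float64x2_t demo_f64x2_t;
# define DEMO_VEC_TYPES_ENABLED 1
#else
# define DEMO_VEC_TYPES_ENABLED 0
#endif

/* Reports which branch of the guard this translation unit took. */
int demo_vec_types_enabled(void)
{
    return DEMO_VEC_TYPES_ENABLED;
}
```

With real GCC 9 or later on arm64 the typedefs compile, because the compiler provides the types itself. A tool that defines the same version macros without providing those built-ins takes the same branch and stops with "identifier is undefined", as in the log above.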

Graham Inggs (ginggs) wrote on 2023-08-23:

The nvidia-cuda-toolkit-gcc package only contains the /usr/bin/cuda-g++ and /usr/bin/cuda-gcc wrappers and has a dependency on the highest supported g++, currently g++-12.

See: https://packages.ubuntu.com/mantic/devel/nvidia-cuda-toolkit-gcc

Mitchell Dzurick (mitchdz) wrote on 2023-08-23:

Tried a no-change rebuild of nvidia-cuda-toolkit (https://launchpad.net/~mitchdz/+archive/ubuntu/nvidia-cuda-toolkit-mantic-merge) using the proposed archive and that did not solve the problem.

Mitchell Dzurick (mitchdz) wrote on 2023-08-23:

Ah, I posted my comment right after yours, ginggs. Thanks for the pointer! You're right, on my system cuda-gcc just points to gcc-12:

$ ll $(which /usr/bin/cuda-gcc)
lrwxrwxrwx 1 root root 6 Aug 23 14:17 /usr/bin/cuda-gcc -> gcc-12*

I tried using gcc-13 instead as I would hope that version would see these new types, but I'm still seeing __Float32x4_t undefined, in addition to some new types being undefined

nvcc -std=c++11 -ccbin=/usr/bin/g++-13 --allow-unsupported-compiler --compiler-options -Werror --compiler-options -Wall -Xptxas -O3 -ptx -arch=compute_50 -code=sm_50 copypadmul2.cu -o copypadmul2_50.ptx
...
/usr/include/stdlib.h(147): error: identifier "_Float64" is undefined
/usr/include/stdlib.h(153): error: identifier "_Float128" is undefined
/usr/include/stdlib.h(159): error: identifier "_Float32x" is undefined
/usr/include/stdlib.h(165): error: identifier "_Float64x" is undefined
...

Another note: these particular CUDA code snippets don't really need these types, so finding a way not to include them would work (maybe by patching libc6-dev to add another preprocessor directive). Ultimately I think that's a bad idea, though, because someone could want a .cu file that uses Arm SIMD extensions alongside the CUDA code.

Graham Inggs (ginggs) wrote on 2023-08-25:

Some similar reports I found (although from some years ago):

https://forums.developer.nvidia.com/t/nvcc-compilation-errors-on-24-2-l4t-platform-tx1/45937

https://github.com/InsightSoftwareConsortium/ITK/issues/1959

"The user space in R23.x is 32-bit. NEON is also from the 32-bit compatibility mode that makes ARMv8 able to execute armhf. The errors tend to imply that some 32-bit compatibility mode library for NEON is missing."

Seems to imply some mismatch between NEON (32-bit) and arm64?

Graham Inggs (ginggs) wrote on 2023-08-25:

We'll ignore this failure and allow glibc to migrate, and that does not preclude further investigation.

Note that mumax3/arm64 is not built in Debian, and did not build in jammy, so we may end up removing the arm64 binary.

Matthias Klose (doko) wrote on 2023-08-28:

Removing packages from mantic:
mumax3 3.10-8 in mantic arm64
Comment: LP: #2032624, remove mumax3 binary on arm64
1 package successfully removed.

Daniel van Vugt (vanvugt) wrote on 2023-09-04:

This seems to be causing bug 2033747 too.

Graham Inggs (ginggs) wrote on 2023-09-20:

This also seems to cause nvidia-nccl to FTBFS on arm64 in the test rebuild

https://people.canonical.com/~ginggs/ftbfs-report/test-rebuild-20230830-mantic-mantic.html

tags:	added: ftbfs

Graham Inggs (ginggs) wrote on 2023-09-20:

#10

cxref also FTBFS on arm64 in the test rebuild

Graham Inggs (ginggs) wrote on 2023-09-20:

#11

Also gauche-c-wrapper, rocm-hipamd and stdgpu-contrib

Heinrich Schuchardt (xypron) wrote on 2023-09-21:

#12

cbmc fails to build from source on arm64 with LTO disabled as reported in LP 2036745:

Failed test: fmod1
CBMC version 5.89.0 (cbmc-5.89.0) 64-bit arm64 linux
Parsing main.c
file /usr/include/aarch64-linux-gnu/bits/math-vector.h line 30: syntax error before '__f32x4_t'
PARSING ERROR

https://launchpadlibrarian.net/688275364/buildlog_ubuntu-mantic-arm64.cbmc_5.89.0-2ubuntu1~ppa1_BUILDING.txt.gz

In Sourceware.org Bugzilla #30909, Simon Chopin (schopin) wrote on 2023-09-27:

#14

The use of vector types such as __Float32x4_t in the aarch64 math-vector.h header breaks quite a few programs that parse C code themselves but use GCC as their preprocessor. Because the preprocessor identifies itself as GCC, the header takes the code path that uses GCC's own built-in types, which the consuming programs don't implement.

I'm not sure if this qualifies as a bug in glibc, as it seems reasonable to rely on those types, but we've seen this happen in quite a few instances in Ubuntu:

https://bugs.launchpad.net/ubuntu/+source/mumax3/+bug/2032624

Simon Chopin (schopin) wrote on 2023-09-27 (last edit on 2023-09-27):

#13

Reported upstream at https://sourceware.org/bugzilla/show_bug.cgi?id=30909

Bug Watch Updater (bug-watch-updater) on 2023-09-27

Changed in glibc:
importance:	Unknown → Medium
status:	Unknown → New

In Sourceware.org Bugzilla #30909, Simon Chopin (schopin) wrote on 2023-09-27:

#16

I posted a tentative patch adding a way to work around those types at https://sourceware.org/pipermail/libc-alpha/2023-September/151770.html

I'll ship it in my next Ubuntu upload for Mantic as a way to unblock us due to our fairly tight schedule, but I'm hoping we can come up with a better long-term solution.

Simon Chopin (schopin) wrote on 2023-09-27:

#15

I'll be shipping a temporary workaround patch that disables the vec types if __ARM_VEC_MATH_DISABLED is defined. We still need to patch each failure individually to add that flag to the preprocessor step (not at build time but at runtime!), but at least the patching should be easier and quicker than providing proper support for the various vector types.
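
For a tool that shells out to the preprocessor at runtime, the per-package patch described above amounts to adding one `-D` option to that invocation. A minimal sketch, assuming the tool assembles its command as a single string: the helper name `demo_build_cpp_command` is hypothetical, and `__ARM_VEC_MATH_DISABLED` only has an effect with a glibc that carries the Ubuntu workaround patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "<cc> -E -D__ARM_VEC_MATH_DISABLED <src>" into buf.
   Returns 0 on success, -1 if the command does not fit in size bytes. */
int demo_build_cpp_command(char *buf, size_t size,
                           const char *cc, const char *src)
{
    assert(buf != NULL && cc != NULL && src != NULL);
    assert(strlen(cc) > 0);

    /* -E stops after preprocessing; -D defines the macro before any
       header is read, so math-vector.h sees it. */
    int n = snprintf(buf, size, "%s -E -D__ARM_VEC_MATH_DISABLED %s",
                     cc, src);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

For example, `demo_build_cpp_command(buf, sizeof buf, "gcc", "main.c")` fills `buf` with `gcc -E -D__ARM_VEC_MATH_DISABLED main.c`. A real tool would more likely pass an argument vector than a shell string, to avoid quoting problems with file names.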

We shouldn't bother upstreaming those fixes to Debian, as I'm pretty sure the final glibc part of the solution will look fairly different from my current patch, but at least we can get those packages working in the meantime.

Launchpad Janitor (janitor) wrote on 2023-09-30:

#17

This bug was fixed in the package glibc - 2.38-1ubuntu5

---------------
glibc (2.38-1ubuntu5) mantic; urgency=medium

  * Update from upstream release branch:
    - CVE-2023-4527: Stack read overflow with large TCP responses in
      no-aaaa mode
    - CVE-2023-4806: use after free in getcanonname
    - LP: #2031909: Fix oversized __io_vtables
  * d/p/u/0001-Fix-leak-in-getaddrinfo-introduced-by-the-fix-for-CV:
    Cherry-picked to fix a regression in one of the previous CVE fixes
    (LP: #2037516, CVE-2023-5156)
  * d/p/lp2032624.patch: add an escape hatch in arm64 math-vector.h.
    This should help fixing multiple FTBFS (LP: #2032624)

-- Simon Chopin <email address hidden> Wed, 27 Sep 2023 16:38:18 +0200

Changed in glibc (Ubuntu):
status:	New → Fix Released

Simon Chopin (schopin) wrote on 2023-10-05:

#18

Reopening in glibc as I had some upstream feedback that basically means my workaround is not a good idea. I agree with them, so we should drop it, both in upcoming releases and in the upcoming Mantic SRU, to avoid users starting to depend on it, however unlikely that would be.

Changed in glibc (Ubuntu):
importance:	Critical → Medium
status:	Fix Released → In Progress

In Sourceware.org Bugzilla #30909, Connor-baker (connor-baker) wrote on 2023-11-02:

#19

Adding some additional context:

We're running into this issue in Nixpkgs: https://github.com/NixOS/nixpkgs/pull/264599#pullrequestreview-1707381631.

The GLIBC 2.38 update introduces intrinsics for `aarch64-linux` in `math.h`.

NVCC (NVIDIA's CUDA Compiler) declares itself to be the same compiler as its host compiler. This causes inclusion of unsupported `aarch64-linux` intrinsics. NVCC is now unable to compile any CUDA file for `aarch64-linux` because it does not support these intrinsics: https://forums.developer.nvidia.com/t/nvcc-fails-to-build-with-arm-neon-instructions-cpp-vs-cu/248355/2.
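
The identity problem described above can be made visible with a small probe of the predefined macros. This is a sketch only: the struct and function names are hypothetical, and the "at risk" rule is an assumption that simply restates the failing combination from this report (CUDA front end, arm64 target, claimed GCC 9 or later).

```c
#include <assert.h>

/* What the current translation unit's compiler says about itself. */
struct demo_compiler_id {
    int is_cuda_frontend;  /* 1 when __CUDACC__ is defined (nvcc on CUDA sources) */
    int claims_gnuc;       /* 1 when __GNUC__ is defined */
    int gnuc_major;        /* value of __GNUC__, or 0 when undefined */
    int targets_aarch64;   /* 1 when __aarch64__ is defined */
};

struct demo_compiler_id demo_probe_compiler(void)
{
    struct demo_compiler_id id = {0, 0, 0, 0};
#ifdef __CUDACC__
    id.is_cuda_frontend = 1;
#endif
#ifdef __GNUC__
    id.claims_gnuc = 1;
    id.gnuc_major = __GNUC__;
#endif
#ifdef __aarch64__
    id.targets_aarch64 = 1;
#endif
    return id;
}

/* Assumed rule: the header's GCC >= 9 branch is taken on arm64 while a
   CUDA front end, rather than GCC itself, is parsing the result. */
int demo_at_risk(struct demo_compiler_id id)
{
    assert(id.gnuc_major >= 0);
    return id.is_cuda_frontend && id.targets_aarch64 && id.gnuc_major >= 9;
}
```

Calling `demo_probe_compiler()` from a `.cu` file and from a plain `.c` file on the same arm64 machine would be expected to differ only in `is_cuda_frontend`, which is the mismatch the comment describes.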

I'll be submitting the same patch I've made for Nixpkgs.
