Comment 8 for bug 1926379

Michael Hudson-Doyle (mwhudson) wrote :

So I don't know exactly what was going on with sarnold's machine but I now finally understand why the 0ubuntu9.3 update caused problems:

The TLS accounting patch added a glibc tunable (https://www.gnu.org/software/libc/manual/html_node/Tunables.html). A tunable is defined internally as a name and a type (and some other data) but during the build it also gets assigned an ID, and unfortunately the tunable added by the TLS accounting patch ends up changing the ID of the glibc.pthread.mutex_spin_count tunable. The problems occur when you have a new dynamic linker / ld.so but an old libpthread.so: libpthread.so's _init function calls get_tunable with the ID for glibc.pthread.mutex_spin_count, but get_tunable is implemented in ld.so, where this ID now corresponds to the new glibc.rtld.nns tunable. The type for glibc.pthread.mutex_spin_count is int32 and the type for glibc.rtld.nns is size_t, so when get_tunable writes the value into the pointer it is passed, it writes more bytes than the caller allocated (wherever size_t is wider than int32) and does indeed smash the stack. Even if this doesn't happen, libpthread might well misbehave in all sorts of ways if it gets back values appropriate for glibc.rtld.nns when it's expecting values for glibc.pthread.mutex_spin_count.
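
To make the mismatch concrete, here is a rough model of it in plain C. This is not glibc's actual code: the enum values, the table layout, the stored values and the names model_get_tunable, model_tunable_name and old_libpthread_overrun are all made up for illustration. Only the two tunable names and their types come from the description above. The caller's stack slot is replaced by a 16-byte scratch buffer so the oversized write can be measured without actually corrupting anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ID baked into a libpthread.so built BEFORE the patch. */
enum old_tunable_id { OLD_ID_MUTEX_SPIN_COUNT = 0 };

/* Hypothetical IDs in an ld.so built AFTER the patch: the new tunable
   takes the slot and pushes mutex_spin_count to a different ID. */
enum new_tunable_id {
  NEW_ID_RTLD_NNS = 0,
  NEW_ID_MUTEX_SPIN_COUNT = 1,
  NEW_ID_COUNT
};

typedef enum { TYPE_INT_32, TYPE_SIZE_T } tunable_type;

struct tunable {
  const char *name;
  tunable_type type;
  union { int32_t i32; size_t sz; } val;
};

/* The table as the new ld.so sees it. The values are placeholders. */
static const struct tunable new_ldso_table[NEW_ID_COUNT] = {
  [NEW_ID_RTLD_NNS] = { "glibc.rtld.nns", TYPE_SIZE_T, { .sz = 4 } },
  [NEW_ID_MUTEX_SPIN_COUNT] =
      { "glibc.pthread.mutex_spin_count", TYPE_INT_32, { .i32 = 100 } },
};

/* Which tunable does the new ld.so think this ID refers to? */
const char *model_tunable_name(int id) {
  assert(id >= 0 && id < NEW_ID_COUNT);
  return new_ldso_table[id].name;
}

/* Stand-in for get_tunable: copies the value out using the size of the
   tunable's OWN type, not the size the caller expects.
   Returns the number of bytes written through `out`. */
size_t model_get_tunable(int id, void *out) {
  assert(id >= 0 && id < NEW_ID_COUNT);
  const struct tunable *t = &new_ldso_table[id];
  if (t->type == TYPE_SIZE_T) {
    memcpy(out, &t->val.sz, sizeof(size_t));
    return sizeof(size_t);
  }
  memcpy(out, &t->val.i32, sizeof(int32_t));
  return sizeof(int32_t);
}

/* What the old libpthread effectively does: ask for "its" ID while only
   having room for an int32_t. Returns how many bytes were written past
   that int32_t. */
size_t old_libpthread_overrun(void) {
  unsigned char slot[16]; /* oversized scratch area instead of a real int32_t */
  memset(slot, 0xAA, sizeof slot);
  size_t written = model_get_tunable(OLD_ID_MUTEX_SPIN_COUNT, slot);
  return written > sizeof(int32_t) ? written - sizeof(int32_t) : 0;
}
```

In this model, model_tunable_name(OLD_ID_MUTEX_SPIN_COUNT) comes back as "glibc.rtld.nns", and old_libpthread_overrun() returns sizeof(size_t) - sizeof(int32_t): 4 bytes on a typical 64-bit target, 0 where size_t is 32 bits wide. The latter is the "even if this doesn't happen" case, where nothing is overwritten but the value returned is still for the wrong tunable.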

So this explains the behaviour seen in bug 1926355, completely. What I don't understand wrt this bug is that "new ld.so / old libpthread.so" should be a very temporary situation during an upgrade. I guess a process that has the old ld.so loaded might dlopen the new libpthread.so and experience a similar issue, although dlopening libpthread isn't really a thing that works aiui. But there could be a similar problem with some other library.

Unfortunately, this means that upgrades from 0ubuntu9.3 to 0ubuntu9.4 are vulnerable to the same issue.