I encountered this bug because mosh stopped working after Debian updated libc to 2.22 [1][2]. After a few tests I discovered that the problem is related to a strange combination of linker switches and libraries (see below).
The minimal test case to reproduce the problem is the following:
$ cat boom.c
extern void dofork();
int main() {
    dofork();
}
$ cat dofork.c
#include <unistd.h>
void dofork() {
    fork();
}
$ gcc -fPIC -c dofork.c
$ gcc -shared -Wl,-z,now -o libdofork.so dofork.o
$ gcc -o boom boom.c -lpthread -L$(pwd) -ldofork
$ LD_LIBRARY_PATH=$(pwd) ./boom
Segmentation fault
Expected result: the program should not crash
Actual result: the program crashes :-)
The fatal combination seems to be "-lpthread", "-Wl,-z,now", a call to fork(), and glibc 2.22. The crash happens near the fork.
The bug happened in mosh because:
- mosh is linked against libprotobuf and libutempter
- mosh uses the "-Wl,-z,now" switch
- libprotobuf, via pkg-config, suggests the -lpthread switch
- libutempter uses the fork() function
Together, these create the conditions for the bug.
Looking at the commits between 2.21 and 2.22 that touch nptl/pt-fork.c, I found the following one:
commit beff1d132c16aedd87a3f1bc7b572c8e69819015
Author: Roland McGrath <email address hidden>
Date: Fri Feb 6 10:53:07 2015 -0800
Clean up NPTL fork to be compat-only
After reverting it, the problem seems to disappear.
Florian Weimer investigated further:
(gdb) break dofork
Breakpoint 1 at 0x4005b0
(gdb) r
Starting program: /home/fweimer/boom
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Breakpoint 1, 0x00007ffff79bd6d4 in dofork () from /home/fweimer/libdofork.so
(gdb) disassemble
Dump of assembler code for function dofork:
   0x00007ffff79bd6d0 <+0>: push %rbp
   0x00007ffff79bd6d1 <+1>: mov %rsp,%rbp
=> 0x00007ffff79bd6d4 <+4>: callq 0x7ffff79bd5c0 <fork@plt>
   0x00007ffff79bd6d9 <+9>: nop
   0x00007ffff79bd6da <+10>: pop %rbp
   0x00007ffff79bd6db <+11>: retq
End of assembler dump.
(gdb) si
0x00007ffff79bd5c0 in fork@plt () from /home/fweimer/libdofork.so
(gdb) disassemble
Dump of assembler code for function fork@plt:
=> 0x00007ffff79bd5c0 <+0>: jmpq *0x200a0a(%rip) # 0x7ffff7bbdfd0 <email address hidden>
   0x00007ffff79bd5c6 <+6>: pushq $0x2
   0x00007ffff79bd5cb <+11>: jmpq 0x7ffff79bd590
End of assembler dump.
(gdb) print *(void **)0x7ffff7bbdfd0
$1 = (void *) 0x0
(gdb)
Commit beff1d132c16aedd87a3f1bc7b572c8e69819015
assumes that __libc_fork has been relocated before the IFUNC resolver
for the libpthread fork definition runs, which is not always true.
Florian
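For readers not familiar with IFUNC, here is a minimal, generic sketch of the mechanism Florian refers to. It is not the glibc code: the names real_impl, pick_impl and dispatched are hypothetical, and everything lives in one translation unit, so the ordering problem cannot occur here. It only shows that the address returned by the resolver is what later calls jump to. If the resolver read a pointer that had not been relocated yet, it would hand out a bogus address, which would be consistent with the 0x0 slot seen in the gdb session above.

```c
/* Illustration only: a generic GNU IFUNC, not taken from the glibc sources.
   All names (real_impl, pick_impl, dispatched) are hypothetical. */

/* The implementation the resolver hands out.  In the bug report, the
   analogous symbol (__libc_fork) lives in a different shared object, so
   its address is only usable once that reference has been relocated. */
static int real_impl(void)
{
    return 42;
}

/* The resolver.  It runs while relocations are being processed, and the
   address it returns becomes the target of every call to dispatched(). */
static int (*pick_impl(void))(void)
{
    return real_impl;
}

/* Declare dispatched() as an indirect function resolved by pick_impl(). */
int dispatched(void) __attribute__((ifunc("pick_impl")));
```

This assumes GCC or Clang on a GNU/Linux target with IFUNC support. There, a call to dispatched() should end up in real_impl().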
----------------------------------
[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=817929
[2] https://github.com/mobile-shell/mosh/issues/727