Hello,
I checked the backtrace of a crashed dhcpd running on 4.4.1-2.1ubuntu5.
(gdb) info threads
  Id Target Id Frame
* 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
  2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait (futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
  3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, processes_to_wake=1, futex_word=<optimized out>) at ../sysdeps/nptl/futex-internal.h:364
  4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at ../sysdeps/nptl/futex-internal.h:183
(gdb) frame 2
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
    cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
(gdb) bt
#1 0x00007fb4deaa7859 in __GI_abort () at abort.c:79
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
    cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
#4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
#5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
#6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
#7 0x00007fb4dea68609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8 0x00007fb4deba4103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) frame 3
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
4041 in ../../../../lib/isc/unix/socket.c
(gdb) p sock->pending_send
$2 = 1
The code is crashing on this assertion: https://gitlab.isc.org/isc-projects/bind9/-/blob/v9_11_3/lib/isc/unix/socket.c#L3364
This was already reported and marked as fixed in Debian (?) via [0]:
"""Now if a wakeup event occurres the socket would be dispatched for
processing regardless which kind of event (timer?) triggered the wakeup.
At least I did not find any sanity checks in process_fds() except
SOCK_DEAD(sock).
This leads to the following situation: The sock is not dead yet but it
is still pending when it is dispatched again.
I would now check sock->pending_send before calling dispatch_send(). This
would at least prevent the assertion failure - well knowing that the
situation described above ( not dead but still pending and alerting ) is
not a very pleasant one - until someone comes up with a better solution.
"""
[0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=430065#20
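
To make the workaround proposed in the quoted report concrete, here is a minimal, self-contained model of it. This is not the libisc code: model_sock_t, model_dispatch_send() and model_process_fd() are hypothetical stand-ins, and the assumption that dispatch_send() sets pending_send itself is mine. It only illustrates the idea of checking pending_send in addition to the SOCK_DEAD(sock) check before dispatching.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the real socket structure; only the state
 * relevant to the assertion is modelled. */
typedef struct {
    bool dead;          /* models SOCK_DEAD(sock) */
    int  pending_send;  /* non-zero while a send event is still queued */
    int  dispatched;    /* counts dispatches, for illustration only */
} model_sock_t;

/* Models dispatch_send(): the insist from the backtrace fires if a send
 * is already pending. Assumption: the function then marks the send as
 * pending itself. */
static void model_dispatch_send(model_sock_t *sock) {
    assert(!sock->pending_send);
    sock->pending_send = 1;
    sock->dispatched++;
}

/* Models the writeable branch of process_fd() with the proposed guard.
 * Returns true if a send was dispatched, false if it was skipped. */
static bool model_process_fd(model_sock_t *sock, bool writeable) {
    if (sock->dead)
        return false;               /* the existing sanity check */
    if (writeable && !sock->pending_send) {
        model_dispatch_send(sock);  /* safe: nothing pending */
        return true;
    }
    return false;                   /* still pending: skip, do not abort */
}
```

With this guard, a second wakeup for a socket whose send is still pending is skipped instead of hitting the assertion. As the quoted report already says, this only hides the symptom; why the socket is flagged writeable while a send is still pending is left unexplained.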
** Follow-up questions:
0) The reproducer doesn't seem to be consistent; the crash seems to be related to a race
condition associated with an internal timer/futex.
1) Can anyone confirm that a pristine upstream 4.4.1 doesn't reproduce the issue?