2020-04-10 18:19:24 |
Andrew Welham |
bug |
|
|
added bug |
2020-04-27 14:48:27 |
Launchpad Janitor |
isc-dhcp (Ubuntu): status |
New |
Confirmed |
|
2020-05-14 18:23:50 |
Robert Stroetgen |
bug |
|
|
added subscriber Robert Stroetgen |
2020-05-15 19:39:26 |
Stephan Spees |
bug |
|
|
added subscriber Stephan Spees |
2020-05-31 07:02:16 |
Ross |
bug |
|
|
added subscriber Ross |
2020-06-05 03:32:30 |
atom |
bug |
|
|
added subscriber atom |
2020-06-15 15:40:40 |
Brian Murray |
tags |
|
focal rls-ff-incoming |
|
2020-06-19 16:56:33 |
Konstantin |
bug |
|
|
added subscriber Konstantin |
2020-06-21 20:51:50 |
Peter A. Bigot |
bug |
|
|
added subscriber Peter A. Bigot |
2020-07-31 11:42:23 |
Richard Laager |
bug |
|
|
added subscriber Richard Laager |
2020-08-01 20:31:40 |
Richard Laager |
bug watch added |
|
https://gitlab.isc.org/isc-projects/dhcp/issues/128 |
|
2020-08-01 20:31:40 |
Richard Laager |
bug task added |
|
dhcp |
|
2020-08-01 23:17:52 |
Bug Watch Updater |
dhcp: status |
Unknown |
New |
|
2020-08-04 01:55:43 |
Jorge Niedbalski |
nominated for series |
|
Ubuntu Groovy |
|
2020-08-04 01:55:43 |
Jorge Niedbalski |
bug task added |
|
isc-dhcp (Ubuntu Groovy) |
|
2020-08-04 01:55:43 |
Jorge Niedbalski |
nominated for series |
|
Ubuntu Focal |
|
2020-08-04 01:55:43 |
Jorge Niedbalski |
bug task added |
|
isc-dhcp (Ubuntu Focal) |
|
2020-08-04 01:56:32 |
Jorge Niedbalski |
bug task added |
|
bind9-libs (Ubuntu) |
|
2020-08-04 03:01:05 |
Jorge Niedbalski |
bug watch added |
|
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=430065 |
|
2020-08-04 16:14:22 |
Jorge Niedbalski |
bug |
|
|
added subscriber Jorge Niedbalski |
2020-08-05 17:36:54 |
Andrew Welham |
attachment added |
|
_usr_sbin_dhcpd.0.crash https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1872118/+attachment/5398988/+files/_usr_sbin_dhcpd.0.crash |
|
2020-08-06 20:02:07 |
Richard Laager |
bug watch added |
|
https://gitlab.isc.org/isc-projects/dhcp/-/issues/121 |
|
2020-08-11 17:49:59 |
Jorge Niedbalski |
bind9-libs (Ubuntu Focal): status |
New |
In Progress |
|
2020-08-11 17:50:03 |
Jorge Niedbalski |
bind9-libs (Ubuntu Groovy): status |
New |
In Progress |
|
2020-08-11 17:50:08 |
Jorge Niedbalski |
isc-dhcp (Ubuntu Focal): status |
New |
In Progress |
|
2020-08-11 17:50:11 |
Jorge Niedbalski |
isc-dhcp (Ubuntu Groovy): status |
Confirmed |
In Progress |
|
2020-08-11 17:50:14 |
Jorge Niedbalski |
bind9-libs (Ubuntu Focal): assignee |
|
Jorge Niedbalski (niedbalski) |
|
2020-08-11 17:50:17 |
Jorge Niedbalski |
bind9-libs (Ubuntu Groovy): assignee |
|
Jorge Niedbalski (niedbalski) |
|
2020-08-11 17:50:20 |
Jorge Niedbalski |
isc-dhcp (Ubuntu Focal): assignee |
|
Jorge Niedbalski (niedbalski) |
|
2020-08-11 17:50:26 |
Jorge Niedbalski |
isc-dhcp (Ubuntu Groovy): assignee |
|
Jorge Niedbalski (niedbalski) |
|
2020-08-11 19:15:47 |
Jorge Niedbalski |
summary |
DHCP Cluster crashes after a few hours |
[SRU] DHCP Cluster crashes after a few hours |
|
2020-08-11 19:38:25 |
Jorge Niedbalski |
description |
I have a pair of DHCP serevrs running in a cluster on ubuntu 20.04, All worked perfectly until recently, when they started stopping with code=killed, status=6/ABRT.
This is being fixed by
https://bugs.launchpad.net/bugs/1870729
However now one stops after a few hours with the following errors. One can stay on line but not both.
Syslog shows
Apr 10 17:20:15 dhcp-primary sh[6828]: ../../../../lib/isc/unix/socket.c:3361: INSIST(!sock->pending_send) failed, back trace
Apr 10 17:20:15 dhcp-primary sh[6828]: #0 0x7fbe78702a4a in ??
Apr 10 17:20:15 dhcp-primary sh[6828]: #1 0x7fbe78702980 in ??
Apr 10 17:20:15 dhcp-primary sh[6828]: #2 0x7fbe7873e7e1 in ??
Apr 10 17:20:15 dhcp-primary sh[6828]: #3 0x7fbe784e5609 in ??
Apr 10 17:20:15 dhcp-primary sh[6828]: #4 0x7fbe78621103 in ??
nothing in kern.log
apport.log shows
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: called for pid 6828, signal 6, core limit 0, dump mode 2
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: not creating core for pid with dump mode of 2
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: executable: /usr/sbin/dhcpd (command line "dhcpd -user dhcpd -group dhcpd -f -4 -pf /run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf")
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: wrote report /var/crash/_usr_sbin_dhcpd.0.crash
/var/crash/_usr_sbin_dhcpd.0.crash shows
ProblemType: Crash
Architecture: amd64
CrashCounter: 1
Date: Fri Apr 10 17:20:15 2020
DistroRelease: Ubuntu 20.04
ExecutablePath: /usr/sbin/dhcpd
ExecutableTimestamp: 1586210315
ProcCmdline: dhcpd -user dhcpd -group dhcpd -f -4 -pf /run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf
ProcEnviron: Error: [Errno 13] Permission denied: 'environ'
ProcMaps: Error: [Errno 13] Permission denied: 'maps'
ProcStatus:
Name: dhcpd
Umask: 0022
State: D (disk sleep)
Tgid: 6828
Ngid: 0
Pid: 6828
PPid: 1
TracerPid: 0
Uid: 113 113 113 113
Gid: 118 118 118 118
FDSize: 128
Groups:
NStgid: 6828
NSpid: 6828
NSpgid: 6828
NSsid: 6828
VmPeak: 236244 kB
VmSize: 170764 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 12064 kB
VmRSS: 12064 kB
RssAnon: 5940 kB
RssFile: 6124 kB
RssShmem: 0 kB
VmData: 30792 kB
VmStk: 132 kB
VmExe: 592 kB
VmLib: 5424 kB
VmPTE: 76 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 1
THP_enabled: 1
Threads: 4
SigQ: 0/7609
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: 3
Cpus_allowed_list: 0-1
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000
0000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 111
nonvoluntary_ctxt_switches: 144
Signal: 6
Uname: Linux 5.4.0-21-generic x86_64
UserGroups: |
[Description]
isc-dhcp-server uses libisc-export (coming from bin9-libs package) for handling the socket event(s) when configured in peer mode (master/secondary). It's possible that a sequence of messages dispatched by the master that requires acknowledgment from its peers holds a socket
in a pending to send state, a timer or a subsequent write request can be scheduled into this socket and the !sock->pending_send assertion
will be raised when trying to write again at the time data hasn't been flushed entirely and the pending_send flag hasn't been reset to 0 state.
If this race condition happens, the following stacktrace will be
hit:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait (futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, processes_to_wake=1, futex_word=<optimized out>) at ../sysdeps/nptl/futex-internal.h:364
4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at ../sysdeps/nptl/futex-internal.h:183
(gdb) frame 2
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
(gdb) bt
#1 0x00007fb4deaa7859 in __GI_abort () at abort.c:79
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
#4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
#5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
#6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
#7 0x00007fb4dea68609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8 0x00007fb4deba4103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) frame 3
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
4041 in ../../../../lib/isc/unix/socket.c
(gdb) p sock->pending_send
$2 = 1
[TEST CASE]
1) Install isc-dhcp-server in 2 focal machine(s).
2) Configure peer/cluster mode as follows:
Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/
2) Run dhcpd as follows in both machine(s)
# dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4
3) Leave the cluster running for a long (2h) period until the crash/race condition is reproduced.
[REGRESSION POTENTIAL]
* The fix will prevent the assertion to happen in the dispatch_send
path, later versions of isch-dhcp upstream lack this logic and entirely removed the existence of this flag. Therefore, removing the need for this
assertion at process_fd shouldn't be problematic. |
|
2020-08-11 19:40:05 |
Jorge Niedbalski |
description |
[Description]
isc-dhcp-server uses libisc-export (coming from bin9-libs package) for handling the socket event(s) when configured in peer mode (master/secondary). It's possible that a sequence of messages dispatched by the master that requires acknowledgment from its peers holds a socket
in a pending to send state, a timer or a subsequent write request can be scheduled into this socket and the !sock->pending_send assertion
will be raised when trying to write again at the time data hasn't been flushed entirely and the pending_send flag hasn't been reset to 0 state.
If this race condition happens, the following stacktrace will be
hit:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait (futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, processes_to_wake=1, futex_word=<optimized out>) at ../sysdeps/nptl/futex-internal.h:364
4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at ../sysdeps/nptl/futex-internal.h:183
(gdb) frame 2
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
(gdb) bt
#1 0x00007fb4deaa7859 in __GI_abort () at abort.c:79
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
#4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
#5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
#6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
#7 0x00007fb4dea68609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8 0x00007fb4deba4103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) frame 3
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
4041 in ../../../../lib/isc/unix/socket.c
(gdb) p sock->pending_send
$2 = 1
[TEST CASE]
1) Install isc-dhcp-server in 2 focal machine(s).
2) Configure peer/cluster mode as follows:
Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/
2) Run dhcpd as follows in both machine(s)
# dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4
3) Leave the cluster running for a long (2h) period until the crash/race condition is reproduced.
[REGRESSION POTENTIAL]
* The fix will prevent the assertion to happen in the dispatch_send
path, later versions of isch-dhcp upstream lack this logic and entirely removed the existence of this flag. Therefore, removing the need for this
assertion at process_fd shouldn't be problematic. |
[Description]
isc-dhcp-server uses libisc-export (coming from bind9-libs package) for handling the socket event(s) when configured in peer mode (master/secondary). It's possible that a sequence of messages dispatched by the master that requires acknowledgment from its peers holds a socket
in a pending to send state, a timer or a subsequent write request can be scheduled into this socket and the !sock->pending_send assertion
will be raised when trying to write again at the time data hasn't been flushed entirely and the pending_send flag hasn't been reset to 0 state.
If this race condition happens, the following stacktrace will be
hit:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait (futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, processes_to_wake=1, futex_word=<optimized out>) at ../sysdeps/nptl/futex-internal.h:364
4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at ../sysdeps/nptl/futex-internal.h:183
(gdb) frame 2
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
(gdb) bt
#1 0x00007fb4deaa7859 in __GI_abort () at abort.c:79
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
#4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
#5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
#6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
#7 0x00007fb4dea68609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8 0x00007fb4deba4103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) frame 3
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
4041 in ../../../../lib/isc/unix/socket.c
(gdb) p sock->pending_send
$2 = 1
[TEST CASE]
1) Install isc-dhcp-server in 2 focal machine(s).
2) Configure peer/cluster mode as follows:
Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/
2) Run dhcpd as follows in both machine(s)
# dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4
3) Leave the cluster running for a long (2h) period until the crash/race condition is reproduced.
[REGRESSION POTENTIAL]
* The fix will prevent the assertion to happen in the dispatch_send
path, later versions of isch-dhcp upstream lack this logic and entirely removed the existence of this flag. Therefore, removing the need for this
assertion at process_fd shouldn't be problematic. |
|
2020-08-11 19:49:26 |
Jorge Niedbalski |
attachment added |
|
lp-1872118-groovy.debdiff https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1872118/+attachment/5400760/+files/lp-1872118-groovy.debdiff |
|
2020-08-11 19:49:46 |
Jorge Niedbalski |
attachment added |
|
lp-1872118-focal.debdiff https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1872118/+attachment/5400761/+files/lp-1872118-focal.debdiff |
|
2020-08-11 20:27:56 |
Ubuntu Foundations Team Bug Bot |
tags |
focal rls-ff-incoming |
focal patch rls-ff-incoming |
|
2020-08-11 20:28:03 |
Ubuntu Foundations Team Bug Bot |
bug |
|
|
added subscriber Ubuntu Sponsors Team |
2020-08-12 16:50:02 |
Gianfranco Costamagna |
isc-dhcp (Ubuntu Focal): status |
In Progress |
Invalid |
|
2020-08-12 16:50:05 |
Gianfranco Costamagna |
isc-dhcp (Ubuntu Groovy): status |
In Progress |
Incomplete |
|
2020-08-12 16:50:09 |
Gianfranco Costamagna |
isc-dhcp (Ubuntu Groovy): status |
Incomplete |
Invalid |
|
2020-08-12 22:39:13 |
Launchpad Janitor |
bind9-libs (Ubuntu Groovy): status |
In Progress |
Fix Released |
|
2020-08-13 11:34:57 |
Gianfranco Costamagna |
bind9-libs (Ubuntu Focal): importance |
Undecided |
High |
|
2020-08-14 09:57:27 |
Timo Aaltonen |
bind9-libs (Ubuntu Focal): status |
In Progress |
Fix Committed |
|
2020-08-14 09:57:30 |
Timo Aaltonen |
bug |
|
|
added subscriber Ubuntu Stable Release Updates Team |
2020-08-14 09:57:32 |
Timo Aaltonen |
bug |
|
|
added subscriber SRU Verification |
2020-08-14 09:57:38 |
Timo Aaltonen |
tags |
focal patch rls-ff-incoming |
focal patch rls-ff-incoming verification-needed verification-needed-focal |
|
2020-08-17 00:02:07 |
Richard Laager |
tags |
focal patch rls-ff-incoming verification-needed verification-needed-focal |
focal patch rls-ff-incoming verification-done verification-done-focal |
|
2020-08-17 20:28:09 |
Jorge Niedbalski |
bug task deleted |
isc-dhcp (Ubuntu) |
|
|
2020-08-17 20:28:16 |
Jorge Niedbalski |
bug task deleted |
isc-dhcp (Ubuntu Focal) |
|
|
2020-08-17 20:28:21 |
Jorge Niedbalski |
bug task deleted |
isc-dhcp (Ubuntu Groovy) |
|
|
2020-08-17 20:29:50 |
Jorge Niedbalski |
bind9-libs (Ubuntu Groovy): importance |
Undecided |
High |
|
2020-08-18 09:33:57 |
ghyster |
bug |
|
|
added subscriber ghyster |
2020-08-21 20:52:57 |
Mark Foster (ExtraHop) |
bug |
|
|
added subscriber Mark Foster (ExtraHop) |
2020-08-25 18:19:20 |
Launchpad Janitor |
bind9-libs (Ubuntu Focal): status |
Fix Committed |
Fix Released |
|
2020-08-25 18:19:26 |
Brian Murray |
removed subscriber Ubuntu Stable Release Updates Team |
|
|
|
2020-08-25 19:03:38 |
Andrew Benson |
removed subscriber Andrew Benson |
|
|
|
2020-10-27 18:12:17 |
Brian Murray |
description |
[Description]
isc-dhcp-server uses libisc-export (coming from bind9-libs package) for handling the socket event(s) when configured in peer mode (master/secondary). It's possible that a sequence of messages dispatched by the master that requires acknowledgment from its peers holds a socket
in a pending to send state, a timer or a subsequent write request can be scheduled into this socket and the !sock->pending_send assertion
will be raised when trying to write again at the time data hasn't been flushed entirely and the pending_send flag hasn't been reset to 0 state.
If this race condition happens, the following stacktrace will be
hit:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait (futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, processes_to_wake=1, futex_word=<optimized out>) at ../sysdeps/nptl/futex-internal.h:364
4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at ../sysdeps/nptl/futex-internal.h:183
(gdb) frame 2
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
(gdb) bt
#1 0x00007fb4deaa7859 in __GI_abort () at abort.c:79
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
#4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
#5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
#6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
#7 0x00007fb4dea68609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8 0x00007fb4deba4103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) frame 3
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
4041 in ../../../../lib/isc/unix/socket.c
(gdb) p sock->pending_send
$2 = 1
[TEST CASE]
1) Install isc-dhcp-server in 2 focal machine(s).
2) Configure peer/cluster mode as follows:
Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/
2) Run dhcpd as follows in both machine(s)
# dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4
3) Leave the cluster running for a long (2h) period until the crash/race condition is reproduced.
[REGRESSION POTENTIAL]
* The fix will prevent the assertion to happen in the dispatch_send
path, later versions of isch-dhcp upstream lack this logic and entirely removed the existence of this flag. Therefore, removing the need for this
assertion at process_fd shouldn't be problematic. |
[Description]
isc-dhcp-server uses libisc-export1105 (coming from bind9-libs package) for handling the socket event(s) when configured in peer mode (master/secondary). It's possible that a sequence of messages dispatched by the master that requires acknowledgment from its peers holds a socket
in a pending to send state, a timer or a subsequent write request can be scheduled into this socket and the !sock->pending_send assertion
will be raised when trying to write again at the time data hasn't been flushed entirely and the pending_send flag hasn't been reset to 0 state.
If this race condition happens, the following stacktrace will be
hit:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait (futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, processes_to_wake=1, futex_word=<optimized out>) at ../sysdeps/nptl/futex-internal.h:364
4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at ../sysdeps/nptl/futex-internal.h:183
(gdb) frame 2
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
(gdb) bt
#1 0x00007fb4deaa7859 in __GI_abort () at abort.c:79
#2 0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
#4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
#5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
#6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
#7 0x00007fb4dea68609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8 0x00007fb4deba4103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) frame 3
#3 0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
4041 in ../../../../lib/isc/unix/socket.c
(gdb) p sock->pending_send
$2 = 1
[TEST CASE]
1) Install isc-dhcp-server in 2 focal machine(s).
2) Configure peer/cluster mode as follows:
Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/
2) Run dhcpd as follows in both machine(s)
# dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4
3) Leave the cluster running for a long (2h) period until the crash/race condition is reproduced.
[REGRESSION POTENTIAL]
* The fix will prevent the assertion to happen in the dispatch_send
path, later versions of isch-dhcp upstream lack this logic and entirely removed the existence of this flag. Therefore, removing the need for this
assertion at process_fd shouldn't be problematic. |
|
2021-07-28 22:54:06 |
Launchpad Janitor |
merge proposal linked |
|
https://code.launchpad.net/~sergiodj/ubuntu/+source/bind9-libs/+git/bind9-libs/+merge/406332 |
|