haproxy crashes in __pool_get_first if unique-id-header is used
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
HAProxy | Fix Released | Unknown | |
haproxy (Debian) | Fix Released | Unknown | |
haproxy (Ubuntu) | Fix Released | High | Unassigned |
Bionic | Fix Released | Undecided | Christian Ehrhardt |
Bug Description
[Impact]
 * The handling of locks in haproxy could lead to a state where, between
   idle HTTP connections, a connection was flagged as destroyed. In that
   case the code went on and accessed the just-freed resource (a minimal
   illustrative sketch follows below). As upstream puts it: "It can have
   random implications between requests as it may lead a wrong
   connection's polling to be re-enabled or disabled for example,
   especially with threads."
 * Backport the fix from upstream's 1.8 stable branch.
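For illustration only, the following minimal C sketch shows the class of bug
described above; it is NOT haproxy source, and all names in it (conn_stub,
conn_destroy, wake_bad, wake_fixed) are hypothetical. The buggy path calls a
routine that may free the connection and then keeps dereferencing the
pointer; the fixed path stops touching the object once it may have been
released.
-------
/*
 * Illustrative sketch only -- NOT haproxy source code. All names here
 * (conn_stub, conn_destroy, wake_bad, wake_fixed) are hypothetical.
 * It demonstrates the "dereference after possible destroy" pattern and
 * the safe variant that stops using the object once it may be freed.
 */
#include <stdlib.h>

struct conn_stub {
    int flags;            /* e.g. error/shutdown bits */
    int polling_enabled;  /* state the caller wants to update afterwards */
};

#define CF_ERROR 0x1

/* May free the connection; returns 1 if it did, 0 otherwise. */
int conn_destroy(struct conn_stub *c)
{
    if (c->flags & CF_ERROR) {
        free(c);
        return 1;
    }
    return 0;
}

/* Buggy pattern: keeps using 'c' even if conn_destroy() freed it. */
void wake_bad(struct conn_stub *c)
{
    conn_destroy(c);
    c->polling_enabled = 0;   /* use-after-free when the conn was destroyed */
}

/* Fixed pattern: never touch 'c' again once it may have been released. */
void wake_fixed(struct conn_stub *c)
{
    if (conn_destroy(c))
        return;               /* connection is gone, nothing left to do */
    c->polling_enabled = 0;   /* still valid on this path */
}

int main(void)
{
    struct conn_stub *c = calloc(1, sizeof(*c));
    c->flags = CF_ERROR;
    wake_fixed(c);            /* safe; wake_bad(c) would touch freed memory */
    return 0;
}
-------
The backported patch (the mux_pt patch listed in the diff further below)
applies this same "dereference with care" idea to haproxy's own connection
wake-up code.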
[Test Case]
 * It is a race and might be hard to trigger.
   A haproxy config to sit in front of three webservers can be seen below.
   Setting up three Apaches locally didn't trigger the same bug, but we
   know it is timing sensitive (a hedged load-generation sketch follows
   this list).
 * Simon (anbox) has a setup which reliably triggers this and will run the
   tests there.
 * The bad case will trigger a crash as reported below.
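As a rough aid for reproducing such a timing-sensitive race, the sketch
below hammers the frontend with many concurrent plain-HTTP requests to
raise the load. It is an assumption, not part of the original report: the
target address, port and thread/request counts are made up and need
adjusting to the real setup. Build with, for example,
cc -pthread -o repro repro.c.
-------
/*
 * Hedged reproduction helper, not part of the original report. It only
 * generates concurrent HTTP load against the frontend to increase the
 * chance of hitting the timing-sensitive race. Host, port and the
 * thread/request counts are assumptions; adjust them to the real setup.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define THREADS  16
#define REQUESTS 1000

static const char *host = "127.0.0.1";  /* assumed haproxy frontend address */
static const int   port = 80;           /* assumed frontend port */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < REQUESTS; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            continue;
        struct sockaddr_in sa = { 0 };
        sa.sin_family = AF_INET;
        sa.sin_port = htons(port);
        inet_pton(AF_INET, host, &sa.sin_addr);
        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) == 0) {
            const char *req =
                "GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n";
            if (write(fd, req, strlen(req)) > 0) {
                char buf[4096];
                while (read(fd, buf, sizeof(buf)) > 0)
                    ;  /* drain the response */
            }
        }
        close(fd);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[THREADS];
    for (int i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    puts("done");
    return 0;
}
-------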
[Regression Potential]
 * This change is already in >=Disco and has no further bugs reported
   against it (no follow-on change), which should make it rather safe.
   There has also been no other change to that file context in the 1.8
   stable branch since then.
   The change is on the locking of connections, so if regressions were to
   appear they would be in the handling of concurrent connections.
[Other Info]
 * Strictly speaking this is a race, so triggering it depends on load and
   machine CPU/IO speed.
---
Version 1.8.8-1ubuntu0.10 of haproxy in Ubuntu 18.04 (bionic) crashes with
-------
Thread 2.1 "haproxy" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xfffff77b1010 (LWP 17174)]
__pool_get_first (pool=0xaaaaaac
124 include/
(gdb) bt
#0 __pool_get_first (pool=0xaaaaaac
#1 pool_alloc_dirty (pool=0xaaaaaac
#2 pool_alloc (pool=0xaaaaaac
#3 conn_new () at include/
#4 cs_new (conn=0x0) at include/
#5 connect_conn_chk (t=0xaaaaaacb8820) at src/checks.c:1553
#6 process_chk_conn (t=0xaaaaaacb8820) at src/checks.c:2135
#7 process_chk (t=0xaaaaaacb8820) at src/checks.c:2281
#8 0x0000aaaaaabca0b4 in process_
#9 0x0000aaaaaab76f44 in run_poll_loop () at src/haproxy.c:2399
#10 run_thread_
#11 0x0000aaaaaaad79ec in main (argc=<optimized out>, argv=0xaaaaaac6
-------
when running on an ARM64 system. The haproxy.cfg looks like this:
-------
global
    log /dev/log local0
    log /dev/log local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    spread-checks 0
    tune.
    ssl-
defaults
    log global
    mode tcp
    option httplog
    option dontlognull
    retries 3
    timeout queue 20000
    timeout client 50000
    timeout connect 5000
    timeout server 50000
frontend anbox-stream-
    bind 0.0.0.0:80
    default_backend api_http
    mode http
    http-request redirect scheme https
backend api_http
    mode http
frontend anbox-stream-
    bind 0.0.0.0:443 ssl crt /var/lib/
    default_backend app-anbox-
    mode http
backend app-anbox-
    mode http
    balance leastconn
    server anbox-stream-
    server anbox-stream-
    server anbox-stream-
-------
The crash occurs after the first few HTTP requests go through, and happens again when systemd restarts the service.
The bug is already reported in Debian https:/
Using the 1.8.19-1+deb10u2 package from Debian fixes the crash.
Related branches
- Rafael David Tinoco (community): Approve
- Canonical Server packageset reviewers: Pending requested
- Canonical Server: Pending requested
- git-ubuntu developers: Pending requested
Diff: 73 lines (+51/-0), 3 files modified:
  debian/changelog (+6/-0)
  debian/patches/lp-1884149-BUG-MEDIUM-mux_pt-dereference-the-connection-with-ca.patch (+44/-0)
  debian/patches/series (+1/-0)
Changed in haproxy:
  status: Unknown → Fix Released
Changed in haproxy (Ubuntu Bionic):
  assignee: nobody → Christian Ehrhardt (paelzer)
Changed in haproxy (Debian):
  status: Unknown → Fix Released
description: updated
Based on the upstream report, this does not appear to be arm64-specific.