FFE: HAproxy dropping connections (RST) during config reload / support seamless reload
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
haproxy (Ubuntu) | Fix Released | Undecided | Dave Chiluk | | |
Bug Description
Reloading haproxy causes TCP resets for active connections. This can be a serious issue for clouds that rely on haproxy for load balancing, and as a result are restarting it frequently.
Full related blog post is here.
https:/
FFE Justification
- Description - The patchset fixes the issue by adding the -x option to haproxy. This option is used for passing the unix stats socket from the old haproxy to the new one. The old haproxy then passes connections to the new haproxy using this socket (simplified explanation). The changes are largely isolated to new functions that implement this functionality.
- Rationale - The change is largely isolated to the new option, but for those running clouds this could be potentially very important. Clouds that are "doing it right", and treating instances as cattle, are constantly tearing down and rebuilding instances. This has the side effect of constantly reloading haproxy. For example, at Indeed on a few of our clouds haproxy gets restarted roughly every second. My tests show that this causes a connection reset rate of about 18 resets for 50k connections. The haproxy teams are showing 11 TCP resets for 2k connections. Either way it's greater than 0 and it's dependent on how many connections you receive and how fast you are restarting haproxy. I've chosen not to enable this by default in the systemd unit files, as enabling that requires that the stats socket in the haproxy config match the one passed with the -x command. However, for those that are seeing this problem, only having to make the config and unit file changes should be a better user experience than hand building packages.
Configuration
1. Add the following lines to your haproxy.cfg
" # turn on stats unix socket
stats socket /var/lib/
stats bind-process 1
"
2. Add "HAPROXY_
3. $ sudo systemctl daemon-reload
4. $ systemctl reload haproxy.
It's important to use reload as it is accomplished using the haproxy-
Testing.
1. Configure as above.
2. Put haproxy in a reload loop
$ while true ; do sudo systemctl reload haproxy ; sleep 3 ; done
3. Run apache bench against it.
$ ab -r -c 20 -n 100000 http://
Results:
With these patches:
Complete requests: 100000
Failed requests: 0
Without these patches:
Complete requests: 100000
Failed requests: 81
(Connect: 0, Receive: 27, Length: 27, Exceptions: 27)
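The handover behind the -x option described above, as an added example (not code from the patchset or from haproxy): a Unix socket can carry an open listening descriptor as SCM_RIGHTS ancillary data, so the listener and its accept queue stay open while ownership changes and no client sees a reset. All function names are illustrative, and one process plays both the old and new roles for the sake of a self-contained run.

```python
import array
import socket

def send_listener(chan, listener):
    """Old process side: pass the listening fd as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [listener.fileno()])
    chan.sendmsg([b"L"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])

def recv_listener(chan):
    """New process side: receive the fd and wrap it as a socket object."""
    fds = array.array("i")
    _, ancdata, _, _ = chan.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[:fds.itemsize])
            return socket.socket(fileno=fds[0])
    raise RuntimeError("no file descriptor received")

def handover_demo():
    """Both roles run in one process here (assumption for the demo)."""
    old_listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    old_listener.bind(("127.0.0.1", 0))  # constructed endpoint, ephemeral port
    old_listener.listen(16)
    addr = old_listener.getsockname()
    # Stand-in for the unix stats socket shared by the old and new process.
    old_end, new_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # A client connects while only the old process holds the listener.
    early = socket.create_connection(addr)
    early.sendall(b"early")

    send_listener(old_end, old_listener)
    new_listener = recv_listener(new_end)
    old_listener.close()  # old process drops its copy; the queue survives

    # A client connects after the old process has closed its copy.
    late = socket.create_connection(addr)
    late.sendall(b"late")

    received = []
    for _ in range(2):
        conn, _peer = new_listener.accept()
        received.append(conn.recv(16))
        conn.close()
    for s in (early, late, new_listener, old_end, new_end):
        s.close()
    return received
```

Both the connection queued before the handover and the one made after the old copy is closed are served by the receiving side, which is the behaviour the "Failed requests: 0" result reflects.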
description: | updated |
Changed in haproxy (Ubuntu): | |
status: | Triaged → Fix Committed |
I have created a ppa with this current patch to assist with testing.
https://launchpad.net/~chiluk/+archive/ubuntu/lp1712925
This patchset is still in the early stages, so use/test this with extreme caution.