HAProxy healthchecks sometimes fail on a FIPS-enabled control plane
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo |
Confirmed
|
Medium
|
Damien Ciabrini |
Bug Description
On a FIPS-enabled HA control plane deployed on VMs, we are seeing a high number of healthchecks failure in HAProxy logs for all services, but most of them impact the mysql service.
[WARNING] (730596) : Health check for backup server mysql/controlle
[WARNING] (730596) : Health check for backup server mysql/controlle
This seems to be a systemic issue on the environment, which is consuming a lot of sys time (from 10 to 30 sys time in top). Under this situation, the galera service itself is working, but the healthcheck are incorrectly parsed by HAProxy, due to what seems to be a race condition in socket closure between HAProxy and the healthcheck script `clustercheck`
Related fix proposed to branch: stable/zed /review. opendev. org/c/openstack /puppet- tripleo/ +/884158
Review: https:/