Keepalived scripts are not getting executed

Bug #1806004 reported by Benoît Knecht
This bug affects 2 people
Affects              Status        Importance  Assigned to    Milestone
openstack-ansible    Fix Released  Undecided   Benoît Knecht
keepalived (Ubuntu)  Triaged       High        Unassigned

Bug Description

After deploying OpenStack Ansible 18.1.0 on Ubuntu 18.04, I noticed the following Keepalived logs:

root@controller-dc1r02n01:~# journalctl -eu keepalived.service
Nov 28 11:11:39 controller-dc1r02n01 systemd[1]: Starting Keepalive Daemon (LVS and VRRP)...
Nov 28 11:11:39 controller-dc1r02n01 Keepalived[24979]: Starting Keepalived v1.3.9 (10/21,2017)
Nov 28 11:11:39 controller-dc1r02n01 Keepalived[24979]: Opening file '/etc/keepalived/keepalived.conf'.
Nov 28 11:11:39 controller-dc1r02n01 systemd[1]: Started Keepalive Daemon (LVS and VRRP).
Nov 28 11:11:39 controller-dc1r02n01 Keepalived[24980]: Starting Healthcheck child process, pid=24981
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_healthcheckers[24981]: Opening file '/etc/keepalived/keepalived.conf'.
Nov 28 11:11:39 controller-dc1r02n01 Keepalived[24980]: Starting VRRP child process, pid=24982
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Registering Kernel netlink reflector
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Registering Kernel netlink command channel
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Registering gratuitous ARP shared channel
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Opening file '/etc/keepalived/keepalived.conf'.
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Failed to set default user for notify script /etc/keepalived/haproxy_notify.sh - ignoring
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Unable to set default user for vrrp script haproxy_check_script - removing
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Unable to set default user for vrrp script pingable_check_script - removing
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Truncating auth_pass to 8 characters
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: (internal): track script haproxy_check_script not found, ignoring...
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: (internal): track script pingable_check_script not found, ignoring...
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: Truncating auth_pass to 8 characters
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: (external): track script haproxy_check_script not found, ignoring...
Nov 28 11:11:39 controller-dc1r02n01 Keepalived_vrrp[24982]: (external): track script pingable_check_script not found, ignoring...

None of the check scripts are getting executed because the keepalived_script user doesn't exist on the system, and in any case, the haproxy_check_script (which is "/bin/kill -0 `cat /var/run/haproxy.pid`") needs to run as root.
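As a rough illustration of why the user matters for this check: `kill -0` sends no signal, it only asks the kernel whether the caller would be allowed to signal that PID. A minimal Python sketch of the same probe (the name `probe_pid` is made up for this example) separates a missing process from one the caller may not signal, which is the case an unprivileged script user would typically hit for a process owned by another user:

```python
import os


def probe_pid(pid: int) -> str:
    """Mimic `kill -0 <pid>`: check a process without actually signalling it."""
    try:
        os.kill(pid, 0)  # signal 0 performs only the existence/permission check
    except ProcessLookupError:
        return "missing"  # ESRCH: no such process
    except PermissionError:
        return "not-permitted"  # EPERM: process exists, caller may not signal it
    return "alive"


if __name__ == "__main__":
    # Probing our own PID always succeeds.
    print(probe_pid(os.getpid()))
```

With the shell command, both failure cases give a non-zero exit status, so the check would be expected to fail for a non-root user even while HAProxy is running.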

The keepalived.conf man page says that "If [script_user] is not specified, the user defaults to keepalived_script if that user exists, otherwise root", but it doesn't seem to fall back to root in this case. This may be because of enable_script_security, but that option is only supposed to prevent scripts from running as root if part of the path is writable by non-root, which isn't the case here.
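The documented default quoted above can be sketched as follows. This only illustrates the man page wording, not keepalived's actual C code; `default_script_user` and its `user_exists` parameter are names invented for this example:

```python
import pwd
from typing import Callable


def _user_exists(name: str) -> bool:
    """Return True if `name` is present in the system's password database."""
    try:
        pwd.getpwnam(name)
    except KeyError:
        return False
    return True


def default_script_user(user_exists: Callable[[str], bool] = _user_exists) -> str:
    """Pick the script user the way the man page describes the default."""
    # Documented behaviour: prefer keepalived_script, otherwise fall back to root.
    return "keepalived_script" if user_exists("keepalived_script") else "root"


if __name__ == "__main__":
    print(default_script_user())
```

On the host in this report the lookup fails, so the documented result would be root; the logs show the scripts being dropped instead.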

Anyway, setting

keepalived_global_defs:
  - enable_script_security
  - script_user root

in user_variables.yml fixes the issue:

root@controller-dc1r02n01:~# journalctl -eu keepalived.service
Nov 30 09:07:13 controller-dc1r02n01 systemd[1]: Starting Keepalive Daemon (LVS and VRRP)...
Nov 30 09:07:14 controller-dc1r02n01 Keepalived[17543]: Starting Keepalived v1.3.9 (10/21,2017)
Nov 30 09:07:14 controller-dc1r02n01 Keepalived[17543]: Opening file '/etc/keepalived/keepalived.conf'.
Nov 30 09:07:14 controller-dc1r02n01 Keepalived[17544]: Starting Healthcheck child process, pid=17546
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_healthcheckers[17546]: Opening file '/etc/keepalived/keepalived.conf'.
Nov 30 09:07:14 controller-dc1r02n01 systemd[1]: Started Keepalive Daemon (LVS and VRRP).
Nov 30 09:07:14 controller-dc1r02n01 Keepalived[17544]: Starting VRRP child process, pid=17549
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: Registering Kernel netlink reflector
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: Registering Kernel netlink command channel
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: Registering gratuitous ARP shared channel
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: Opening file '/etc/keepalived/keepalived.conf'.
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: Truncating auth_pass to 8 characters
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: Truncating auth_pass to 8 characters
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: Using LinkWatch kernel netlink reflector...
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: VRRP_Script(pingable_check_script) succeeded
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: VRRP_Script(haproxy_check_script) succeeded
Nov 30 09:07:14 controller-dc1r02n01 Keepalived_vrrp[17549]: VRRP_Instance(internal) Transition to MASTER STATE
Nov 30 09:07:15 controller-dc1r02n01 Keepalived_vrrp[17549]: VRRP_Instance(external) Transition to MASTER STATE
Nov 30 09:07:15 controller-dc1r02n01 Keepalived_vrrp[17549]: VRRP_Instance(internal) Entering MASTER STATE
Nov 30 09:07:15 controller-dc1r02n01 Keepalived_vrrp[17549]: VRRP_Group(haproxy) Syncing instances to MASTER state
Nov 30 09:07:15 controller-dc1r02n01 Keepalived_vrrp[17549]: Opening script file /etc/keepalived/haproxy_notify.sh
Nov 30 09:07:16 controller-dc1r02n01 Keepalived_vrrp[17549]: VRRP_Instance(external) Entering MASTER STATE

I'll submit a patch to set "script_user root" by default.
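For reference, assuming the role writes each `keepalived_global_defs` entry as one line inside the `global_defs` block of keepalived.conf (an assumption about the template, not something shown in this report), the override above would be expected to render roughly like the output of this sketch. `render_global_defs` is a hypothetical helper:

```python
from typing import Iterable


def render_global_defs(entries: Iterable[str]) -> str:
    """Render list entries as lines of a keepalived `global_defs` block."""
    body = "\n".join(f"    {entry}" for entry in entries)
    return f"global_defs {{\n{body}\n}}"


if __name__ == "__main__":
    print(render_global_defs(["enable_script_security", "script_user root"]))
```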

OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (master)

Fix proposed to branch: master
Review: https://review.openstack.org/621125

Changed in openstack-ansible:
assignee: nobody → Benoît Knecht (benoit-knecht)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (master)

Reviewed: https://review.openstack.org/621125
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=d6ca5af79ec10d1a6c2cd98b27ca7e755abfc9b5
Submitter: Zuul
Branch: master

commit d6ca5af79ec10d1a6c2cd98b27ca7e755abfc9b5
Author: Benoît Knecht <email address hidden>
Date: Fri Nov 30 10:36:05 2018 +0100

    Set Keepalived script_user to root

    Otherwise, Keepalived tries to execute its scripts as
    `keepalived_script`, which doesn't exist, so none of the scripts get
    executed at all. On top of that, some of the scripts require root
    privileges, so `script_user` needs to be set to `root`.

    Change-Id: Ia5cc0154cf520132d133679b64fc5f5c698dce85
    Closes-Bug: 1806004

Changed in openstack-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 19.0.0.0b1

This issue was fixed in the openstack/openstack-ansible 19.0.0.0b1 development milestone.

Bryce Harrington (bryce) wrote :

Thanks for the fix, Benoît.

The upstream fix is attached; it's a one-line YAML fix to openstack-ansible's keepalived configuration. The fix looks straightforward enough, though this bug will need a described test case that reproduces the fault, especially if this needs to be SRU'd to bionic or other stable releases.

Changed in keepalived (Ubuntu):
importance: Undecided → High
status: New → Triaged
Bryce Harrington (bryce) wrote :

Ah, it looks like Ubuntu does not package openstack-ansible, so the patch is not relevant here.

Regarding the keepalived configuration defaults, I see this is hardcoded at line 983 of lib/notify.c to "keepalived_script"; as you mentioned, it indeed doesn't appear to be doing a check that the user exists. Can you please report that portion of the problem to keepalived upstream? Please add a link to that bug report here so we can track its progress.

A second issue is that the package installation does not create a 'keepalived_script' user/group. For this, a keepalived.postinst and corresponding keepalived.postrm should be added by Debian to create and remove the user and group.

Bryce Harrington (bryce) wrote :

There are probably a few ways to add the user and group. Here's one typical approach.

Bryce Harrington (bryce) wrote :

...and corresponding postrm

Bryce Harrington (bryce) wrote :

Debian should be consulted about this issue, to see if they have particular druthers regarding the addition of the user, or if they'd prefer to see this issue fixed some other way (or left as-is for users to configure themselves). That way we can continue to avoid having an Ubuntu delta for this package.
