Comment 0 for bug 1868557

Edward Hope-Morley (hopem) wrote :

On a node with multiple networks configured, where vaultlocker is used to decrypt Ceph OSDs, vaultlocker appears to spin forever if it starts (specifically the vaultlocker-decrypt systemd units) before DNS is configured and the vault URL contains a hostname rather than an IP address. What we see is that no crypt- devices exist and per-OSD vaultlocker processes are running; strace shows them spinning in select(NULL, NULL, ...), which is socket.gethostname() at [1]. The only way to recover currently is to manually restart the vaultlocker process so that the current DNS settings are picked up. This behavior appears to have been introduced by the fix for bug 1838607 [2], after which vaultlocker no longer waits for all networking to be up and ready, and therefore does not wait for DNS to be set up.

We tried adding After=nss-lookup.target to the vaultlocker-decrypt unit configs and rebooted the node, and that resolved the problem.
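
For reference, the workaround can be sketched as a systemd drop-in rather than an edit to the packaged unit. This is only an illustration of what we tested: the template unit name vaultlocker-decrypt@.service and the drop-in file name are assumptions and should be checked against what is actually installed on the node.

```ini
# Hypothetical path (unit and file names are assumptions):
#   /etc/systemd/system/vaultlocker-decrypt@.service.d/after-dns.conf
#
# Order the decrypt units after the name-resolution synchronization
# point so they do not start before DNS has been configured.
[Unit]
After=nss-lookup.target
```

After adding the file, run systemctl daemon-reload (or reboot, as we did) so the new ordering takes effect. Note that After= only orders the units relative to nss-lookup.target; it does not pull that target in by itself.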

[1] https://github.com/openstack-charmers/vaultlocker/blob/master/vaultlocker/shell.py#L54
[2] https://github.com/openstack-charmers/vaultlocker/pull/7/files