If I deploy a Ceph cluster using `openstack overcloud ceph deploy` and then deploy an overcloud that uses it with CephClusterName set to "central", or anything other than its default ("ceph"), then the overcloud deployment fails with this:
000000000073 | FATAL | Assimilate configuration from tripleo_cephadm_assimilate_conf | oc0-controller-0 | error={"changed": false, "cmd": ["podman", "run", "--rm", "--net=host", "--ipc=host", "--volume", "/etc/ceph:/etc/ceph:z", "--volume", "/home/ceph-admin/assimilate_central.conf:/home/assimilate_central.conf:z", "--entrypoint", "ceph", "undercloud.ctlplane.mydomain.tld:8787/ceph/daemon:v6.0.7-stable-6.0-pacific-centos-stream8", "--fsid", "c7b1574d-40f6-5d6a-8a86-c387957696ed", "-c", "/etc/ceph/central.conf", "-k", "/etc/ceph/central.client.admin.keyring", "config", "assimilate-conf", "-i", "/home/assimilate_central.conf"], "delta": "0:00:00.448455", "end": "2022-03-26 18:14:40.294976", "msg": "non-zero return code", "rc": 1, "start": "2022-03-26 18:14:39.846521", "stderr": "Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)',)", "stderr_lines": ["Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)',)"], "stdout": "", "stdout_lines": []}
The above failure comes from the following command being run during overcloud deployment:
podman run --rm --net=host --ipc=host --volume /etc/ceph:/etc/ceph:z --volume /home/ceph-admin/assimilate_central.conf:/home/assimilate_central.conf:z --entrypoint ceph undercloud.ctlplane.mydomain.tld:8787/ceph/daemon:v6.0.7-stable-6.0-pacific-centos-stream8 --fsid c7b1574d-40f6-5d6a-8a86-c387957696ed -c /etc/ceph/central.conf -k /etc/ceph/central.client.admin.keyring config assimilate-conf -i /home/assimilate_central.conf
The command fails because neither /etc/ceph/central.client.admin.keyring nor /etc/ceph/central.conf exists. If I change the name in the above command to:
podman run --rm --net=host --ipc=host --volume /etc/ceph:/etc/ceph:z --volume /home/ceph-admin/assimilate_central.conf:/home/assimilate_central.conf:z --entrypoint ceph undercloud.ctlplane.mydomain.tld:8787/ceph/daemon:v6.0.7-stable-6.0-pacific-centos-stream8 --fsid c7b1574d-40f6-5d6a-8a86-c387957696ed -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring config assimilate-conf -i /home/assimilate_central.conf
Then it's fine. However, we need the ability to deploy differently named conf files and cephx keys when deploying DCN, as we'll have multiple conf files and cephx keys on DCN nodes. Even though the FSID in the path keeps overwrites from happening, using the default name everywhere will break the behavior of CephExternalMultiConfig.
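The naming convention behind the failure can be sketched as follows. This is only an illustration based on the paths in the command above, not code from tripleo-ansible; the helper names `ceph_client_files` and `missing_files` are hypothetical.

```python
from pathlib import Path


def ceph_client_files(cluster_name="ceph", conf_dir="/etc/ceph"):
    """Return the conf and admin keyring paths implied by a cluster name.

    Mirrors the -c and -k arguments in the failing command:
    <conf_dir>/<cluster_name>.conf and
    <conf_dir>/<cluster_name>.client.admin.keyring
    """
    base = Path(conf_dir)
    return {
        "conf": base / f"{cluster_name}.conf",
        "keyring": base / f"{cluster_name}.client.admin.keyring",
    }


def missing_files(cluster_name="ceph", conf_dir="/etc/ceph"):
    """List the expected files that do not exist on this host."""
    paths = ceph_client_files(cluster_name, conf_dir).values()
    return [str(p) for p in paths if not p.exists()]


if __name__ == "__main__":
    # On the controller from this report, "central" would list both files
    # as missing, while "ceph" would list none.
    for name in ("ceph", "central"):
        print(name, missing_files(name))
```

Running a check like this on the node before the assimilate step would show whether the files for the configured cluster name are present.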
To address this, we should give the deployed ceph user a --cluster-name option which overrides the tripleo_cephadm_cluster variable in tripleo-ansible so that the naming convention can follow the needs of the CephExternalMultiConfig parameter.
https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/defaults/main.yml#L17-L19
Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/835372