Manila ceph configuration won't work in HA mode
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
kolla-ansible | Fix Committed | Undecided | Unassigned |
Bug Description
**Bug Report**
What happened:
When more than one manila_share instance is set up with external Ceph, the instances interfere with each other because the current kolla-ansible configuration gives every container the same Ceph auth ID. Each time the manila_share container restarts on one controller, the Ceph sessions of the other controllers are evicted.
In the manila-share logs below, each manila-share instance appears to run code that evicts all other clients named manila. This happens at 14:49:16.939 (06-37), 14:49:17.031 (06-39), and 14:49:17.022 (06-41).
$ ansible -i /etc/kolla/
B-06-37-
2020-11-24 14:49:16.459 6 INFO oslo_service.
2020-11-24 14:49:16.475 6 INFO oslo_service.
2020-11-24 14:49:16.483 19 INFO manila.service [-] Starting manila-share node (version 10.0.2)
2020-11-24 14:49:16.918 19 INFO manila.
2020-11-24 14:49:16.939 19 INFO ceph_volume_client [req-71c01790-
2020-11-24 14:49:16.944 19 INFO ceph_volume_client [req-71c01790-
2020-11-24 14:49:16.949 19 INFO manila.
2020-11-24 14:49:16.962 19 INFO manila.
2020-11-24 14:49:16.977 19 INFO manila.
B-06-39-
2020-11-24 14:49:16.584 6 INFO oslo_service.
2020-11-24 14:49:16.599 6 INFO oslo_service.
2020-11-24 14:49:16.606 19 INFO manila.service [-] Starting manila-share node (version 10.0.2)
2020-11-24 14:49:17.012 19 INFO manila.
2020-11-24 14:49:17.031 19 INFO ceph_volume_client [req-10e0d7ef-
2020-11-24 14:49:17.590 19 INFO ceph_volume_client [req-10e0d7ef-
2020-11-24 14:49:17.595 19 INFO manila.
2020-11-24 14:49:17.615 19 INFO manila.
2020-11-24 14:49:17.633 19 INFO manila.
B-06-41-
2020-11-24 14:49:16.569 7 INFO oslo_service.
2020-11-24 14:49:16.585 7 INFO oslo_service.
2020-11-24 14:49:16.591 20 INFO manila.service [-] Starting manila-share node (version 10.0.2)
2020-11-24 14:49:17.000 20 INFO manila.
2020-11-24 14:49:17.022 20 INFO ceph_volume_client [req-d9be838e-
2020-11-24 14:49:17.589 20 INFO ceph_volume_client [req-d9be838e-
2020-11-24 14:49:17.594 20 INFO manila.
2020-11-24 14:49:17.613 20 INFO manila.
2020-11-24 14:49:17.633 20 INFO manila.
That maps to https:/
```
if auth_id != CEPH_DEFAULT_
# Evict any other manila sessions. Only do this if we're
# using a client ID that isn't the default admin ID, to avoid
# rudely disrupting anyone else.
else:
try:
```
that calls https:/
:param premount_evict: Optional auth_id to evict before mounting the filesystem: callers
https:/
```A CephFS driver instance, represented as a backend driver section in manila.conf, requires a Ceph auth ID unique to the backend Ceph Filesystem. Using a non-unique Ceph auth ID will result in the driver unintentionally evicting other CephFS clients using the same Ceph auth ID to connect to the backend.```
The kolla-ansible configuration for Manila uses the same auth ID for all manila-share instances rather than the unique IDs that appear to be required, which results in this pattern of evictions.
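To make the eviction pattern described above concrete, here is a toy model in Python. It is not the `ceph_volume_client` code: `ToyCephCluster`, `restart_all`, and the controller names are hypothetical, and the only behaviour modelled is "evict every existing session for this auth ID before mounting".

```python
from collections import defaultdict


class ToyCephCluster:
    """Toy stand-in for a Ceph session table (hypothetical, not real Ceph)."""

    def __init__(self):
        # Maps auth ID -> set of hosts currently holding a session.
        self.sessions = defaultdict(set)

    def mount(self, host, auth_id, premount_evict=None):
        # Evict every existing session for premount_evict, then mount.
        evicted = set()
        if premount_evict is not None:
            evicted = self.sessions.pop(premount_evict, set()) - {host}
        self.sessions[auth_id].add(host)
        return evicted


def restart_all(cluster, hosts, auth_id_for):
    """Start manila-share on each host in turn; record whom each start evicts."""
    evictions = {}
    for host in hosts:
        auth_id = auth_id_for(host)
        evictions[host] = cluster.mount(host, auth_id, premount_evict=auth_id)
    return evictions


hosts = ["ctl-1", "ctl-2", "ctl-3"]  # hypothetical controller names

# Every controller uses the same auth ID, as in the reported configuration.
shared_cluster = ToyCephCluster()
shared = restart_all(shared_cluster, hosts, lambda host: "manila")

# Every controller uses its own auth ID (hypothetical naming scheme).
unique_cluster = ToyCephCluster()
unique = restart_all(unique_cluster, hosts, lambda host: f"manila-{host}")

print("shared auth ID:", shared, dict(shared_cluster.sessions))
print("unique auth IDs:", unique, dict(unique_cluster.sessions))
```

In this model, a shared auth ID leaves only the last controller to start with a live session, while unique auth IDs leave all three sessions intact.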
Manually setting each manila-share instance to use a separate auth ID seems to let multiple manila-share containers run without interfering with each other. This meant updating config.json to stage a different key for each controller and then updating manila.conf to reference the matching auth ID.
(Note that we don't have `enable_
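The manual workaround above could be sketched roughly as follows. This is an illustration, not what kolla-ansible generates: the `manila-<host>` naming scheme, the controller names, and the `cephfsnative1` section name are assumptions, while `cephfs_auth_id` and `cephfs_conf_path` are the CephFS driver options in manila.conf and the keyring name follows Ceph's usual `ceph.client.<id>.keyring` convention.

```python
import configparser
import io

BACKEND_SECTION = "cephfsnative1"  # assumed backend section name


def auth_id_for(host):
    """Derive a per-host Ceph auth ID (hypothetical naming scheme)."""
    return f"manila-{host}"


def keyring_name(host):
    """Keyring file to stage for this host, per Ceph's usual naming."""
    return f"ceph.client.{auth_id_for(host)}.keyring"


def render_backend(host):
    """Render the backend section of manila.conf for one controller."""
    conf = configparser.ConfigParser(interpolation=None)
    conf[BACKEND_SECTION] = {
        "cephfs_conf_path": "/etc/ceph/ceph.conf",
        "cephfs_auth_id": auth_id_for(host),
    }
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()


hosts = ["ctl-1", "ctl-2", "ctl-3"]  # hypothetical controller names

for host in hosts:
    print(f"# {host}: stage {keyring_name(host)}")
    print(render_backend(host))
```

Each auth ID would also need a matching key created on the Ceph side with the capabilities Manila requires; that step is outside this sketch.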
What you expected to happen:
Multiple `manila_share` containers function without interfering with each other.
How to reproduce it (minimal and precise):
Deploy with Manila and external Ceph enabled.
```
enable_manila: "yes"
enable_
```
**Environment**:
* OS (e.g. from /etc/os-release):
* Kernel (e.g. `uname -a`): Linux B-06-39-
* Docker version if applicable (e.g. `docker version`): 19.03.13
* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release): stable/ussuri
* Docker image Install type (source/binary): source
* Docker image distribution:
* Are you using official images from Docker Hub or self built? self-built
* If self built - Kolla version and environment used to build: stable/ussuri
* Share your inventory file, globals.yml and other configuration files if relevant
I'm no Manila expert, but one thing we are missing in the configuration is the [coordination] section. This is used for synchronisation between services, and could be a factor here. It requires a functioning key/value store such as etcd or redis.
I would suggest raising this with the Manila team in #openstack-manila on IRC.
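For reference, the `[coordination]` section mentioned above might be rendered like this. It is a minimal sketch: `backend_url` is the option the section takes, but the etcd address below is a placeholder, and whether coordination has any bearing on the evictions is unconfirmed.

```python
import configparser
import io


def render_coordination(backend_url):
    """Render a [coordination] section for manila.conf."""
    conf = configparser.ConfigParser(interpolation=None)
    conf["coordination"] = {"backend_url": backend_url}
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()


# Placeholder etcd endpoint; substitute the deployment's own key/value store.
coordination_conf = render_coordination("etcd3+http://192.0.2.10:2379")
print(coordination_conf)
```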