iSCSI gateways become unavailable when a third unit of ceph-iscsi is deployed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Ceph iSCSI Charm | New | Undecided | Unassigned |
Bug Description
Steps to reproduce (a rough command sketch follows the observations below):
1. Deploy two ceph-iscsi gateway units
2. Create a target with the 'target-create' action
3. Discover and log in to the target on the initiator
4. Create a filesystem and mount the shared disk on the initiator machine
5. Deploy an additional (third) unit of ceph-iscsi
6. Wait until the new ceph-iscsi unit settles
7. Observe what happens:
A. All three units go into the blocked state with the message "3 is an invalid unit count".
B. The disk on the initiator becomes read-only:
root@node05:
-bash: test.txt: Read-only file system
C. Both iSCSI gateways become unavailable; the 'gwcli' utility reports "2 gateways are inaccessible - updates will be disabled".
D. multipath -ll on the initiator reports both paths as failed/faulty:
root@node05:
mpatha (36001405bb2233
size=1.0G features='0' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
|- 6:0:0:0 sdd 8:48 failed faulty running
`- 5:0:0:0 sde 8:64 failed faulty running
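For reference, the reproduction steps above map roughly onto the commands below. This is only a sketch: the target-create parameters, the addresses, and the multipath device name are placeholders rather than the exact values from my deployment.

# steps 1-2: deploy two gateways and create a target (action parameters elided)
juju deploy -n 2 ceph-iscsi
juju run-action --wait ceph-iscsi/0 target-create <iqn/pool/image/client parameters>

# step 3: discover and log in from the initiator
iscsiadm -m discovery -t sendtargets -p <gateway-ip>
iscsiadm -m node -T <target-iqn> --login

# step 4: create a filesystem on the multipath device (assumed here to be mpatha)
mkfs.ext4 /dev/mapper/mpatha
mount /dev/mapper/mpatha /mnt

# steps 5-6: add the third unit and wait for it to settle
juju add-unit ceph-iscsi
watch -n 5 juju status ceph-iscsi

# observation C can be checked on a gateway unit with:
gwcli ls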
Expected result:
The shared disk should remain available on the initiator, and the existing gateways should remain reachable.
This could actually be the result of the rbd-target-api service not being able to restart (https://bugs.launchpad.net/charm-ceph-iscsi/+bug/1902731).
With that patch applied manually (I edited the Python source in /usr/lib/python3/dist-packages/ceph_iscsi_config/gateway.py), I no longer see all of the gateways going down.
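For anyone wanting to confirm that failure mode on their own units, checking the rbd-target-api service on a gateway unit should be enough (the exact log content will of course vary):

juju ssh ceph-iscsi/0
systemctl status rbd-target-api        # shows whether the service failed to come back up
journalctl -u rbd-target-api -n 50     # recent log lines from the service
sudo systemctl restart rbd-target-api  # after applying the patch from bug 1902731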