Support service tokens to prevent failures of long-running (1-3+ hours) retype/migration jobs
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Cinder Charm | Confirmed | Undecided | Unassigned |
Bug Description
If you retype a Cinder volume, it is moved (live) from one storage provider to another. This process can take hours or even days, in which case the original Keystone token expires and the migration cannot be completed.
The migration gets stuck in the "migrating" status: it does not error out, revert or complete, and there is no automated or easy process to complete it. In the case of a Ceph migration this leaves the VM in a dangerous state: if you stop and start the VM, it reverts from the new storage to the old storage, rolling back all of the data by days, weeks or months. Both volumes still exist, but depending on the application, reconciling them can be very difficult.
From the upstream bug here:
https:/
This can sometimes be prevented using a service token:
https:/
This may still not entirely solve the issue: the default fernet rotation and token expiration is 3 * 1 hour = 3 hours, so we may need to make further improvements to the default Keystone configuration, as documented in "Troubleshooting #3" above.
The default keystone allow_expired_
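To make the "3 * 1 hour = 3 hours" arithmetic above concrete, here is a minimal Python sketch that takes the report's figures at face value and checks whether a job of a given duration would outlast that window. The names `TOKEN_EXPIRATION`, `MULTIPLIER`, `effective_window` and `outlives_window` are hypothetical and are not Keystone or Cinder options; the sketch does not talk to Keystone and only illustrates the comparison.

```python
from datetime import timedelta

# Figures taken from the bug description; the names are illustrative only
# and do not correspond to real Keystone or Cinder configuration options.
TOKEN_EXPIRATION = timedelta(hours=1)  # the "1 hour" in "3 * 1 hour"
MULTIPLIER = 3                         # the "3" in "3 * 1 hour"


def effective_window(token_expiration=TOKEN_EXPIRATION, multiplier=MULTIPLIER):
    """Return the window described in the report (multiplier * expiration)."""
    return multiplier * token_expiration


def outlives_window(job_duration, window=None):
    """Return True if a job of job_duration runs longer than the window."""
    if window is None:
        window = effective_window()
    return job_duration > window


if __name__ == "__main__":
    # Example durations for a retype/migration job (assumed values).
    for duration in (timedelta(hours=2), timedelta(hours=5), timedelta(days=2)):
        verdict = "exceeds" if outlives_window(duration) else "fits within"
        print(f"{duration} {verdict} the {effective_window()} window")
```

Under these assumptions, any migration that runs for more than three hours, including the multi-day cases described above, falls outside the window, which is why a service token alone may not be enough without adjusting the Keystone defaults.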
Changed in charm-cinder: | |
status: | New → Confirmed |
tags: | added: sts |