Cinder backup appears as down
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Cinder | New | High | Unassigned | |
Bug Description
When running concurrent backup operations, the backup service may appear to be down, and its connection to the RabbitMQ broker may be lost.
This is problematic because any monitoring service (Pacemaker, Kubernetes/
This action is usually to restart the service or stop it and run it somewhere else. In both cases this will stop all ongoing operations.
Increasing service_down_time is not a good workaround either, because the option also affects cinder-volume, and 60 seconds is not a short timeout to begin with.
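To make the trade-off concrete, here is a minimal sketch of a heartbeat-based liveness check of the kind described above. It is not Cinder's actual code: the function name `is_service_up`, the 10-second reporting period, and the 75-second stall are assumptions chosen for illustration. Only the 60-second threshold comes from this report.

```python
from datetime import datetime, timedelta, timezone

SERVICE_DOWN_TIME = 60  # seconds; the threshold discussed in this report
REPORT_INTERVAL = 10    # seconds; assumed heartbeat period, illustration only


def is_service_up(last_heartbeat, now, service_down_time=SERVICE_DOWN_TIME):
    """Hypothetical check: a service is up if its last heartbeat is recent."""
    if last_heartbeat is None:
        # A service that never reported is treated as down.
        return False
    return now - last_heartbeat <= timedelta(seconds=service_down_time)


now = datetime(2023, 7, 11, 11, 25, 0, tzinfo=timezone.utc)
on_time = now - timedelta(seconds=REPORT_INTERVAL)  # healthy service
stalled = now - timedelta(seconds=75)               # heartbeat delayed by busy backups

print(is_service_up(on_time, now))                         # True
print(is_service_up(stalled, now))                         # False
print(is_service_up(stalled, now, service_down_time=120))  # True
```

Raising the threshold makes the stalled service look healthy again, but if the same threshold is shared by other services, as the report says it is with cinder-volume, a service that is really dead also takes longer to be detected.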
Example of the RabbitMQ connection issue:
2023-07-11 11:02:30.117 136067 INFO oslo.messaging.
If we increase service_down_time, we instead see complaints from the backup service about not being able to report to the DB in time.
2023-07-11 11:25:29.215 378376 WARNING oslo.service.
Changed in cinder:
importance: Undecided → High