swift-recon-object-cron gets stuck
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Confirmed | Medium | Unassigned |
Bug Description
[Errno 17] File exists: '/var/lock/
If the lock dir for swift-recon-cron doesn't get cleaned up because of an unclean shutdown, the job will never start again until an operator logs in and removes the directory with rmdir.
Probably this happens when someone decides they have to send sigterm to everyone using /etc/swift/
In this case I think swift-recon-cron would have an opportunity to handle SystemExit - but it might be useful to add a configurable timeout on the lockdir as well. This job is supposed to run pretty frequently to give up-to-date numbers. You don't want to overwhelm the disks - but I don't think it's going to cause systemic resource consumption issues if you were to consistently blow away >1hr old lockdirs. Maybe a default timeout in the range of 6-24hrs would be helpful and safe in nearly all imaginable configurations?
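A minimal sketch of the two ideas above (clean up on SystemExit, and reclaim a lock dir older than a timeout), assuming a mkdir-based lock. The names `lock_dir`, `LockHeld`, and `stale_after` are hypothetical and are not taken from the swift-recon-cron script; the 6 hour default only reflects the range suggested in this report.

```python
import os
import time
from contextlib import contextmanager


class LockHeld(Exception):
    """Raised when a lock dir exists and is not old enough to reclaim."""


@contextmanager
def lock_dir(path, stale_after=6 * 3600):
    """Hold `path` as a mkdir-based lock for the duration of the block.

    `stale_after` is a hypothetical timeout in seconds: a leftover lock
    dir older than this is assumed to come from an unclean shutdown.
    """
    try:
        os.mkdir(path)
    except FileExistsError:
        # Same condition as "[Errno 17] File exists" in the report.
        age = time.time() - os.path.getmtime(path)
        if age < stale_after:
            raise LockHeld(path)
        # Stale lock: remove it and take the lock ourselves.
        os.rmdir(path)
        os.mkdir(path)
    try:
        yield
    finally:
        # Runs on normal exit and when SystemExit propagates.
        os.rmdir(path)
```

The check-then-reclaim step is not atomic, so two jobs that both see a stale lock could race; one of them would then fail on the second `os.mkdir`. Also note that a plain SIGTERM only reaches the `finally` clause if the process installs a handler that raises SystemExit (for example via `signal.signal`), which this sketch leaves out.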
N.B. all of the code for this bin script is in the bin dir, so moving it to the cli module will be a requirement for unit testing.
Changed in swift:
status: New → Confirmed
importance: Undecided → Medium