Conductor shutdown always triggers deregistration
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Ironic | Fix Released | Medium | Mark Goddard | |
Bug Description
When a conductor process is shut down, it deregisters itself from the conductor database. In a multi-conductor configuration, this causes a hash ring rebalance, and other conductor processes take over ownership of any nodes previously assigned to the lost conductor. This takeover can require a fair amount of overhead; with the PXE driver, for example, the PXE state must be configured on the new conductor. Worse yet, if the conductor restarts, another ring rebalance will occur, reverting to the initial state via another takeover.
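To make the cost of the double rebalance concrete, here is a minimal consistent-hashing sketch. It is not Ironic's hash ring implementation: the `HashRing` class, the `replicas` parameter and the conductor and node names are all hypothetical, chosen only to show that removing a conductor moves its nodes elsewhere and that re-adding it moves them back.

```python
import bisect
import hashlib


def _hash(key):
    # Map a string to a large integer position on the ring.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent hash ring (illustrative only, not Ironic's code)."""

    def __init__(self, conductors, replicas=16):
        # Each conductor gets several points on the ring to even out the spread.
        self._ring = sorted(
            (_hash(f"{name}-{i}"), name)
            for name in conductors
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._ring]

    def owner(self, node_uuid):
        # The owner is the first conductor point at or after the node's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect(self._keys, _hash(node_uuid)) % len(self._ring)
        return self._ring[idx][1]


def mapping(conductors, nodes):
    ring = HashRing(conductors)
    return {node: ring.owner(node) for node in nodes}


nodes = [f"node-{i}" for i in range(20)]
before = mapping(["c1", "c2", "c3"], nodes)    # all conductors registered
during = mapping(["c1", "c3"], nodes)          # c2 shut down and deregistered
after = mapping(["c1", "c2", "c3"], nodes)     # c2 restarted and re-registered

# Nodes that change owner when c2 leaves, and again when it returns.
moved_away = [n for n in nodes if before[n] != during[n]]
moved_back = [n for n in nodes if during[n] != after[n]]
```

In this sketch `moved_away` and `moved_back` contain the same nodes: every node that c2 owned is taken over once when c2 deregisters and once more when it returns.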
If the shutdown period is known in advance to be short, e.g. for an upgrade, it would be advantageous for the conductor to avoid a ring rebalance. This could be done by signalling to the conductor via some mechanism that it should not deregister itself from the conductor database, but should instead allow the registration to time out. If the conductor is restarted before the registration times out, no ring rebalance will occur.
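The timeout behaviour described above can be sketched as follows. This is a toy in-memory model, not the Ironic database API: `ConductorRegistry`, `heartbeat_timeout` and the host names are assumptions made for illustration, and timestamps are passed in explicitly so the example is deterministic.

```python
class ConductorRegistry:
    """Toy stand-in for the conductor table (hypothetical, not Ironic's API)."""

    def __init__(self, heartbeat_timeout):
        self.heartbeat_timeout = heartbeat_timeout
        self._last_seen = {}

    def heartbeat(self, hostname, now):
        # Registering and refreshing a registration are the same operation here.
        self._last_seen[hostname] = now

    def deregister(self, hostname):
        # Explicit removal: the conductor leaves the ring immediately.
        self._last_seen.pop(hostname, None)

    def active(self, now):
        # A conductor is in the ring while its last heartbeat is recent enough.
        return sorted(
            host
            for host, seen in self._last_seen.items()
            if now - seen <= self.heartbeat_timeout
        )


registry = ConductorRegistry(heartbeat_timeout=60)
registry.heartbeat("c1", now=0)
registry.heartbeat("c2", now=0)

# c1 stops at t=10 WITHOUT deregistering; c2 keeps sending heartbeats.
registry.heartbeat("c2", now=30)
during_restart = registry.active(now=30)   # c1 is still a ring member

# c1 comes back at t=40, inside the timeout, so membership never changed.
registry.heartbeat("c1", now=40)
after_restart = registry.active(now=40)

# If c1 had stayed down, its registration would eventually expire.
registry.heartbeat("c2", now=120)
if_c1_never_returned = [h for h in registry.active(now=120) if h != "c1"]
```

The trade-off in this model is that a conductor which never comes back remains a ring member until its registration expires, so its nodes are not taken over during that window.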
The proposed trigger is to send SIGHUP to the conductor process.
Changed in ironic: | |
assignee: | nobody → Mark Goddard (mgoddard) |
status: | New → In Progress |
Changed in ironic: | |
importance: | Undecided → Medium |
Changed in ironic: | |
milestone: | none → kilo-3 |
status: | Fix Committed → Fix Released |
Changed in ironic: | |
milestone: | kilo-3 → 2015.1.0 |
Devananda rightly pointed out that SIGHUP is not the right tool for the job.
The way I see it there are two main options:
1. A trigger that causes the process to shut down without deregistering itself.
2. A trigger that causes the process to avoid deregistering itself when it is shut down.
I favour the second approach, as it avoids giving a new purpose to an existing signal.
The mechanism for the trigger could be:
- A signal e.g. SIGUSR1/2.
- The existence of a file, possibly with some particular contents or name to ensure it is intended for that process.
- An API call.
The simplest option is the first, and I think it has some merit. Its main drawback is the lack of available signals: SIGUSR1 and SIGUSR2 might be needed for other purposes in future.
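As a rough sketch of the second approach combined with the signal mechanism, assuming SIGUSR1 is the chosen signal: the `Conductor` class, its attributes and `install_handler` below are hypothetical and do not reflect the actual Ironic conductor code. The handler only sets a flag; the normal shutdown path then consults it.

```python
import signal


class Conductor:
    """Minimal stand-in for a conductor service (hypothetical)."""

    def __init__(self):
        self.registered = True
        self.deregister_on_shutdown = True

    def handle_sigusr1(self, signum, frame):
        # Only record the request; the process keeps running until it is
        # stopped through the usual shutdown path.
        self.deregister_on_shutdown = False

    def stop(self):
        if self.deregister_on_shutdown:
            # Default behaviour: leave the ring immediately.
            self.registered = False
        # Otherwise keep the registration and let it time out if the
        # conductor does not return in time.


def install_handler(conductor):
    # signal.SIGUSR1 is only available on Unix-like platforms, and the
    # handler must be installed from the main thread.
    signal.signal(signal.SIGUSR1, conductor.handle_sigusr1)
```

With the handler installed, an operator would send SIGUSR1 to the conductor process before stopping it for an upgrade (for example `kill -USR1 <pid>`), and `stop()` would then skip deregistration.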