A recent change https://review.openstack.org/#/c/52189/9 introduced the automatic disable / re-enable of nova-compute when connection to libvirt is lost and recovered.
While the idea is a good one, the implementation means that any existing disabled status is lost. It's very common on a large system for specific hosts to have been disabled by an administrator for a number of reasons, e.g. new servers still being commissioned, planned maintenance, reserved capacity, etc. As implemented, this change will remove that disabled status, returning nodes to the state where instances are scheduled to them even when the admin has explicitly tried to prevent this.
Suggest that this change is backed out and replaced by an additional status value on each service so that there is separation between manual service enable/disable and automatic enable/disable based on detected errors.
Looking at the change, it currently just passes "" as the disable reason string when the connection to libvirtd is lost.
It seems that we could just enhance the code to pass a pre-determined string "Libvirtd offline", and then when the connection to libvirtd is re-established, we can check for that disable reason to see if we should automatically re-enable it or not.
If we wanted to avoid the string reason comparison, then adding a boolean flag to track automatic vs manual disablement could be an option.
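To make the reason-string idea concrete, here is a rough sketch. It is not the nova code: `ServiceRecord`, `AUTO_DISABLE_REASON` and the two handler functions are hypothetical names made up for illustration, and only the string "Libvirtd offline" comes from the suggestion above. It also assumes we skip the automatic disable when the service is already disabled, so the admin's reason is never overwritten.

```python
from dataclasses import dataclass
from typing import Optional

# Marker string suggested above; the constant name is an assumption.
AUTO_DISABLE_REASON = "Libvirtd offline"


@dataclass
class ServiceRecord:
    """Hypothetical stand-in for a nova-compute service record."""
    disabled: bool = False
    disabled_reason: Optional[str] = None


def on_libvirt_connection_lost(service: ServiceRecord) -> None:
    # Leave an existing (manual) disable untouched so its reason survives.
    if not service.disabled:
        service.disabled = True
        service.disabled_reason = AUTO_DISABLE_REASON


def on_libvirt_connection_restored(service: ServiceRecord) -> None:
    # Re-enable only if we were the ones who disabled the service.
    if service.disabled and service.disabled_reason == AUTO_DISABLE_REASON:
        service.disabled = False
        service.disabled_reason = None
```

With this, a host disabled for "planned maintenance" stays disabled across a libvirtd outage, while a host that was enabled beforehand comes back automatically. One gap remains: an admin who disables the host during the outage, without changing the reason, would still see it re-enabled.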
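A rough sketch of the flag-based alternative, again with made-up names (`ServiceState`, `admin_disabled`, `auto_disabled`, `set_libvirt_connected`) rather than anything that exists in nova. The assumption here is that manual and automatic disablement are stored separately and the scheduler only considers a host when neither is set.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Hypothetical service state with separate manual and automatic flags."""
    admin_disabled: bool = False  # set only by an administrator
    auto_disabled: bool = False   # set only by the libvirt connection handler

    @property
    def schedulable(self) -> bool:
        # A host receives instances only when neither flag is set.
        return not (self.admin_disabled or self.auto_disabled)


def set_libvirt_connected(state: ServiceState, connected: bool) -> None:
    # The automatic path never touches admin_disabled.
    state.auto_disabled = not connected
```

Because the two flags are independent, an admin disable made while libvirtd is down also survives the reconnect, which the reason-string check alone does not guarantee.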
I think this could be done as a follow-up patch; I don't see a compelling reason to revert the existing patch.