Comment 7 for bug 1581977

melanie witt (melwitt) wrote:

Hi Christian, thanks for reporting the security concern related to the BuildFailureWeigher. It is actually a known issue described in the following bug:

and if you check out the discussion there, the conclusion so far [1] has been that while it is possible to de-prioritize a compute host by providing certain invalid inputs, it will not take any environment out of service: the build failure weigher only manipulates the scheduling weights of compute hosts (decreasing their ranking) and will not disable them or remove them from scheduling. That is, they remain available for scheduling, just in a de-prioritized state.

Now, this can still be undesirable in a deployment because it will affect how instances are spread amongst compute hosts.

Copied from a RHBZ where I have explained this before [2]:

"... This is why the BuildFailureWeigher can be problematic, because it does not differentiate between user-caused build failures vs compute node-related build failures. Any situation where a request goes to a compute node and fails to build the instance (even a reschedule) will cause a failed_build to be tracked by the BuildFailureWeigher. The failed_build counter is reset (cleared out) for a compute node when any successful build occurs on that compute node. So, it does do some self-healing, but will still result in inconsistent instance placement if any build failures occur. If the customer environment requires a consistent placement of instances on compute nodes, it is best to disable the BuildFailureWeigher by setting [filter_scheduler]build_failure_weight_multiplier = 0."
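
To make the behavior described above concrete, here is a simplified, hypothetical model of it in Python. This is not the actual nova code; the `Host`, `record_build`, and `weigh` names are invented for illustration. It shows the two key properties: any successful build resets the `failed_builds` counter, and the weigher only lowers a host's ranking, it never removes the host from the candidate list.

```python
class Host:
    """Toy stand-in for a compute host as seen by the scheduler."""

    def __init__(self, name):
        self.name = name
        self.failed_builds = 0

    def record_build(self, success):
        if success:
            # Any successful build clears the counter (self-healing).
            self.failed_builds = 0
        else:
            # Any failed build (even a reschedule) increments it.
            self.failed_builds += 1


def weigh(hosts, build_failure_weight_multiplier=1000000.0):
    """Return hosts ordered best-first by a build-failure penalty.

    Each host gets weight -failed_builds * multiplier, so failing hosts
    sink in the ranking but stay in the list. With multiplier = 0 the
    weigher has no effect at all.
    """
    return sorted(
        hosts,
        key=lambda h: -h.failed_builds * build_failure_weight_multiplier,
        reverse=True,
    )
```

Note that even a host with many failed builds is still returned by `weigh()`; it is just ranked last, which is the "de-prioritized, not disabled" distinction discussed above.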

For background, the build failure behavior was introduced to address an operator pain point: if a compute host experienced a hardware failure, for example, and was consistently selected as the first host for scheduling, the cloud could effectively become non-operational. No user could boot an instance because every request landed on the compute host with the failed hardware, and manual intervention from an admin was needed to take the broken host out of rotation.

So, initially a mechanism was added to completely disable a compute service if it experienced a certain number of failed builds in a row without any successful builds, but this turned out to be a denial-of-service vector itself [3] and was changed into the BuildFailureWeigher as a result.

Finally, there was an attempt to "whitelist" certain types of failures to pick and choose which events result in an increment of the failed_build counter [4], but it stalled out and was abandoned because of the complexity and maintainability concerns around having a whitelist. Instead, it is recommended to set [filter_scheduler]build_failure_weight_multiplier = 0 if the BuildFailureWeigher is causing more problems than it is helping in a particular deployment.
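
For reference, the recommended way to disable the weigher is a one-line change in the scheduler's configuration file (typically nova.conf; the section and option names are exactly as given above):

```ini
[filter_scheduler]
# Setting the multiplier to 0 zeroes out the BuildFailureWeigher's
# contribution, so failed_build counts no longer affect host ranking.
build_failure_weight_multiplier = 0
```

This only neutralizes the weigher's effect on scheduling; the failed_build counts are still tracked, and restoring a non-zero multiplier re-enables the behavior.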