Oversubscription broken for instances with NUMA topologies
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Stephen Finucane |
Rocky | Fix Committed | Medium | Stephen Finucane |
Bug Description
As described in [1], the fix to [2] appears to have inadvertently broken oversubscription of memory for instances with a NUMA topology but no hugepages.
Steps to reproduce:
1. Create a flavor that will consume more than 50% of the available memory on your host(s) and specify an explicit NUMA topology. For example, on my all-in-one deployment, where the host has 32GB RAM, we will request a 20GB instance:
$ openstack flavor create --vcpu 2 --disk 0 --ram 20480 test.numa
$ openstack flavor set test.numa --property hw:numa_nodes=2
2. Boot an instance using this flavor:
$ openstack server create --flavor test.numa --image cirros-
3. Boot another instance using this flavor:
$ openstack server create --flavor test.numa --image cirros-
# Expected result:
The second instance should boot.
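The arithmetic behind this expectation can be sketched as follows. This is illustrative only and is not nova's scheduler code. The function names are hypothetical, and the sketch assumes values the report does not state: a ram_allocation_ratio of 1.5 and a host whose 32GB is split evenly across two NUMA nodes. Under those assumptions, a per-node check that honours the ratio admits the second instance, while a check that ignores the ratio rejects it, which is consistent with the failure reported here.

```python
# Illustrative arithmetic only; this is NOT nova's actual scheduling code.
MIB_PER_GIB = 1024


def fits_with_ratio(total_mb, used_mb, requested_mb, ram_allocation_ratio):
    """Return True if the request fits once the limit is scaled by the ratio."""
    limit_mb = total_mb * ram_allocation_ratio
    return used_mb + requested_mb <= limit_mb


def fits_without_ratio(total_mb, used_mb, requested_mb):
    """Return True only if the request fits in physical memory (no oversubscription)."""
    return used_mb + requested_mb <= total_mb


# Assumptions (not stated in the report): two host NUMA nodes of 16 GiB each
# and ram_allocation_ratio = 1.5.
host_node_mb = 16 * MIB_PER_GIB
ratio = 1.5

# From the report: a 20480 MB flavor with hw:numa_nodes=2, so each guest NUMA
# node needs half of the flavor's RAM.
guest_node_mb = 20480 // 2

# The first instance already occupies its share of this host NUMA node.
used_mb = guest_node_mb

second_fits_oversubscribed = fits_with_ratio(host_node_mb, used_mb, guest_node_mb, ratio)
second_fits_strict = fits_without_ratio(host_node_mb, used_mb, guest_node_mb)

print(second_fits_oversubscribed, second_fits_strict)
```

With these numbers, each host NUMA node would need to hold 20480 MB of guest memory after the second boot, which exceeds the 16384 MB physically present but stays under the 24576 MB limit implied by a ratio of 1.5.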
# Actual result:
The second instance fails to boot. We see the following error message in the logs.
nova-
nova-
If we revert the patch that addressed the bug [3], we get the correct behaviour back and the instance boots. However, doing so obviously loses whatever benefits that change gave us.
[1] http://
[2] https:/
[3] https:/
Triaged as Medium: while this will affect all deployments with ram_allocation_ratio > 1.0
that use NUMA-affined guests without hugepages, the proportion of clouds affected is
expected to be low.
For those that are affected, there is no workaround beyond disabling all NUMA-related features if they want to achieve
memory oversubscription.