VM creation failure due to Nova hugepage assumptions
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | In Progress | Medium | Sahid Orentino |
Queens | In Progress | Medium | Sahid Orentino |
Bug Description
Description:
In Liberty and Mitaka, Nova assumes that it has exclusive access to the huge pages on the compute node. It keeps track of the total pages per NUMA node on the compute node, and of the number of pages used (by Nova VMs) on each NUMA node. This is done for each of the three supported huge page sizes.
However, if other third party processes consume huge pages, there will be a discrepancy between the pages actually available and what Nova thinks is available. As a result, it is possible (depending on the number of pages and the VM size) for Nova to think it has enough pages when it does not. The create will then fail, for example with QEMU reporting insufficient memory available.
Steps to reproduce:
1. Compute with 32768 2MB pages available, giving 16384 per NUMA node with two nodes.
2. Third party process that consumes 256 pages per NUMA node.
3. Create 15 small flavor (2GB = 1024 pages) VMs.
4. Create another small flavor VM.
Expected Result:
That the 16th VM would be created, without an error, and using huge pages on the second NUMA node (and allow more VMs as well).
Actual Result:
After step 3, Nova thinks there are 1024 pages available on NUMA node 0, but the compute host shows only 768 pages available. The scheduler thinks there is space for one more VM, so the host passes the filter. The creation commences, as Nova thinks there is enough space on NUMA node 0. QEMU then fails, indicating that there is not enough memory.
In addition, there are 16128 pages available on NUMA node 1, but Nova will not attempt using them, as it thinks there is still memory available on NUMA node 0.
In my case, I had multiple compute hosts and ended up with a "No hosts available" error, as it fails on each host when trying NUMA node 0. If, at step 4, one creates a medium flavor VM, it will succeed, as Nova will not see enough pages on NUMA node 0, and will try NUMA node 1, which has ample space.
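The page counts above can be checked with a small calculation. This is only a sketch of the arithmetic in this report, not Nova's actual accounting code; the constant and function names are made up for illustration, and the medium flavor size (4GB = 2048 pages) is an assumption, since the report does not state it.

```python
# Arithmetic from the reproduction steps; names are illustrative only.
PAGES_PER_NODE = 16384       # 32768 2MB pages split across two NUMA nodes
THIRD_PARTY_PAGES = 256      # pages per node consumed outside Nova
SMALL_FLAVOR_PAGES = 1024    # 2GB / 2MB
MEDIUM_FLAVOR_PAGES = 2048   # assumption: medium flavor is 4GB


def nova_view_free(vm_count, pages_per_vm=SMALL_FLAVOR_PAGES):
    """Free pages on a node as Nova computes it (total minus Nova usage)."""
    return PAGES_PER_NODE - vm_count * pages_per_vm


def host_free(vm_count, pages_per_vm=SMALL_FLAVOR_PAGES):
    """Free pages on a node as the host sees it (third party usage included)."""
    return PAGES_PER_NODE - THIRD_PARTY_PAGES - vm_count * pages_per_vm


nova_free_node0 = nova_view_free(15)   # what Nova believes after step 3
host_free_node0 = host_free(15)        # what the host really has after step 3
host_free_node1 = host_free(0)         # node 1 is untouched by Nova

# Nova accepts a 16th small VM on node 0 even though the host cannot back it.
small_fits_per_nova = nova_free_node0 >= SMALL_FLAVOR_PAGES
small_fits_on_host = host_free_node0 >= SMALL_FLAVOR_PAGES
# A medium VM does not fit in Nova's view of node 0, so node 1 is tried.
medium_fits_per_nova = nova_free_node0 >= MEDIUM_FLAVOR_PAGES

print(nova_free_node0, host_free_node0, host_free_node1)
```

The mismatch only bites while Nova's view of node 0 is at least one VM's worth of pages and the host's view is not, which is why the failure depends on the VM size.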
Commentary: Nova checks total huge pages, but not available huge pages.
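For comparison, the pages actually free on each NUMA node can be read from the kernel's sysfs files. The following is a minimal sketch, assuming a Linux host that exposes per-node huge page counters under /sys/devices/system/node; the function name and its parameters are hypothetical and this is not how Nova gathers the data.

```python
import os
import re


def free_hugepages_per_node(page_size_kb=2048,
                            sysfs_root="/sys/devices/system/node"):
    """Return {node_id: free_pages} for one huge page size, read from sysfs.

    Hypothetical helper for illustration; sysfs_root is a parameter so the
    function can be pointed at a different directory tree.
    """
    free = {}
    for entry in sorted(os.listdir(sysfs_root)):
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue  # skip non-node entries such as "online" or "possible"
        path = os.path.join(sysfs_root, entry, "hugepages",
                            f"hugepages-{page_size_kb}kB", "free_hugepages")
        try:
            with open(path) as counter:
                free[int(match.group(1))] = int(counter.read().strip())
        except FileNotFoundError:
            continue  # this node does not offer the requested page size
    return free
```

On the host from this report, after step 3, one would expect such a check to show 768 free 2MB pages on node 0 and 16128 on node 1.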
Note: A feature was added to master (for Newton) under bug 1543149 that provides a config-based mechanism to reserve huge pages for third party applications. However, the Nova team indicated that this change cannot be backported to Liberty.
Environment:
Liberty release (12.0.3), with LB, neutron networking, libvirt 1.2.17, API QEMU 1.2.17, QEMU 2.3.0.
Config:
nova flavor-key m1.small set hw:numa_nodes=1
nova flavor-key m1.small set hw:mem_
network, subnet, and standard VM create commands.
tags: added: hugepages numa
Changed in nova:
assignee: nobody → sahid (sahid-ferdjaoui)
status: Expired → In Progress
Changed in nova:
importance: Undecided → Medium
I was also wondering whether the solution targeted at Newton should reduce the total number of pages passed in when creating a NUMAPagesTopology object in the libvirt driver, rather than alter the object's signature by adding a reserved parameter. With the former, the versioned object would not need an API change, which may make it a more backward compatible solution.
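To illustrate the suggestion, here is a rough sketch of subtracting the reserved pages before the object is built. The PagesTopology class below is a hypothetical stand-in, not the real NUMAPagesTopology versioned object, and build_topology is a made-up helper; the point is only that the object keeps its existing fields.

```python
from dataclasses import dataclass


@dataclass
class PagesTopology:
    """Hypothetical stand-in for a per-node page record (no reserved field)."""
    size_kb: int
    total: int
    used: int = 0

    @property
    def free(self):
        return self.total - self.used


def build_topology(size_kb, host_total, reserved):
    """Subtract reserved pages up front so the object signature is unchanged."""
    usable = max(host_total - reserved, 0)
    return PagesTopology(size_kb=size_kb, total=usable)


# Node 0 from this report: 16384 pages, 256 held by a third party process.
node0 = build_topology(size_kb=2048, host_total=16384, reserved=256)
node0.used = 15 * 1024   # fifteen small flavor VMs
print(node0.free)        # pages left for a 16th VM on this node
```

With the total reduced up front, node 0 would show fewer than 1024 free pages after step 3, so a 16th small VM would be steered to node 1 instead of failing in QEMU.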