Large network throttle values cause instance launch failure

Bug #1796410 reported by Eric Miller
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Triaged
Importance: Low
Assigned to: Unassigned
Milestone: (not set)

Bug Description

Steps to reproduce
==================

Set network throttle properties of a flavor to something large, such as 40Gbps (specified in KiB per second):
quota:vif_inbound_average='4882813', quota:vif_outbound_average='4882813'
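
For reference, the value 4882813 can be reproduced with a short conversion. This is only a sketch: the function name is illustrative, and it assumes decimal gigabits (10^9 bits), 1 KiB = 1024 bytes, and rounding up to a whole KiB.

```python
import math


def gbps_to_kib_per_s(gbps: float) -> int:
    """Convert gigabits per second to KiB per second, rounded up.

    Assumes 1 Gbps = 10**9 bits/s and 1 KiB = 1024 bytes.
    """
    bits_per_s = gbps * 1_000_000_000
    bytes_per_s = bits_per_s / 8
    return math.ceil(bytes_per_s / 1024)


if __name__ == "__main__":
    # 40 Gbps is the value used in this report; 15 Gbps is the largest
    # value reported to work.
    for rate in (40, 15):
        print(f"{rate} Gbps = {gbps_to_kib_per_s(rate)} KiB/s")
```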

Launch an instance.

Easy to reproduce. Lowering the quota values to something smaller works fine (15Gbps or less is what I have tested so far), so I'm guessing we're running into either a 32-bit number issue (the throttle is too large to store in a 32-bit number) or a signed number issue (setting the throttle too high wraps around and generates a negative number).

Might be a limitation in libvirt.
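
The 32-bit guess can be checked with simple arithmetic. The sketch below assumes, without confirmation from this report, that some layer stores the rate as bytes per second in a 32-bit field; the names are illustrative. Under that assumption, the 40Gbps value overflows both signed and unsigned 32-bit integers, while the 15Gbps equivalent fits in both, which matches the observed behaviour.

```python
KIB = 1024
INT32_MAX = 2**31 - 1   # largest signed 32-bit value
UINT32_MAX = 2**32 - 1  # largest unsigned 32-bit value


def check_rate(kib_per_s: int) -> dict:
    """Report whether a KiB/s rate, expressed in bytes/s, fits in 32 bits.

    Assumption: the rate is held as bytes per second in a 32-bit field.
    """
    bytes_per_s = kib_per_s * KIB
    return {
        "bytes_per_s": bytes_per_s,
        "fits_int32": bytes_per_s <= INT32_MAX,
        "fits_uint32": bytes_per_s <= UINT32_MAX,
    }


if __name__ == "__main__":
    # 4882813 KiB/s is the failing 40Gbps value from this report;
    # 1831055 KiB/s is roughly 15Gbps, which was reported to work.
    for value in (4882813, 1831055):
        print(value, check_rate(value))
```

If the assumption holds, the largest rate that fits in an unsigned 32-bit byte counter is about 34.4Gbps (2^32 bytes per second).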

Expected result
===============

Successful instance launch.

Actual result
=============

Server status is ERROR. Fault information indicates:
build of instance XXXXX aborted. Volume attachment YYYYY could not be found.

where XXXXX and YYYYY are UUIDs.

Environment
===========

stable/rocky deployed with Kolla-Ansible 7.0.0.0rc2 with latest Kolla and Kolla-Ansible built from source.

All machines are running CentOS with the latest elrepo kernel (as of today): 4.18.12-1.el7.elrepo.x86_64

Hypervisor is KVM.

Storage is Ceph on SSDs.

Network is Neutron DVR (with Open vSwitch).

Regarding the version of nova: I'm not sure how to get this from the containers, since rpm lists nothing from inside any of the Kolla Nova containers.

Changed in nova:
importance: Undecided → Low
status: New → Triaged
tags: added: libvirt network neutron
sean mooney (sean-k-mooney) wrote :

I am rather conflicted on how to triage this bug.

While still possible, the use of the vif quotas, e.g.
quota:vif_inbound_average='4882813', quota:vif_outbound_average='4882813'

is strongly discouraged in favour of neutron QoS policies.

They are not supported by all hypervisors, and the range of values may be hypervisor-specific.
Beyond the interop concerns, these options also have no effect for ovs-dpdk, vpp, or other vhost-user based VIFs, and likely do not work for hardware-offloaded ovs or direct-mode SR-IOV.

With that in mind, however, we have not deprecated the bandwidth I/O quotas; as such, they should still be supported to at least some degree.

I think we should really consider deprecating the option in Stein and removing it in the T release.
For now, however, we should also update the documentation to reflect the valid range and direct people to the neutron QoS policies instead.

tags: added: docs
Eric Miller (erickmiller) wrote :

I had looked at Neutron QoS policies, but it looked like these were network policies, not per-flavor policies. We need the ability for flavors to define the policy since our users will create their own projects, create their own networks, and launch their own VMs, and should not have the ability to affect the QoS policy on their own infrastructure.

If you know of a way to do the above with Neutron QoS policies, I will follow your lead.

Thanks!

Eric

tags: added: doc
removed: docs