Loadbalancer quotas on the service_domain/services project cause denial of service after 10 loadbalancers are created

Bug #1850985 reported by Drew Freiberger
This bug affects 2 people
Affects: OpenStack Octavia Charm
Status: Fix Released
Importance: High
Assigned to: David Ames

Bug Description

When deploying Octavia on a bionic-stein cloud, we are finding that every loadbalancer after the 10th goes into an immediate error state, with logs showing that neutron reports the security_groups quota limit reached for the services project that octavia runs under.

It is my belief that the quotas assigned to tenant projects (loadbalancers, pools, listeners, ports, etc.) should be the limiting factor for how many loadbalancers/pools can be managed. However, there is a limitation in the back end for the service_domain/services project, which should by default allow secgroups of 1000 or -1, secgroup-rules of 1000 or -1, and ports of 1000 or -1 so that octavia doesn't fail in creating loadbalancers due to services project quotas.
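To illustrate how a single services-project quota can cap the whole cloud, here is a minimal Python sketch. The per-loadbalancer costs (one security group and two ports per loadbalancer) and the function name are assumptions for illustration only, not values taken from Octavia or the charm; -1 is treated as unlimited, as in the quotas discussed above.

```python
from typing import Dict, Optional


def max_loadbalancers(quotas: Dict[str, int],
                      per_lb_cost: Dict[str, int]) -> Optional[int]:
    """Return how many loadbalancers fit under the given quotas.

    quotas      -- services-project quota per resource (-1 means unlimited)
    per_lb_cost -- resources consumed by one loadbalancer (assumed values)
    Returns None when no quota limits the count.
    """
    limits = []
    for resource, cost in per_lb_cost.items():
        quota = quotas.get(resource, -1)
        if quota < 0 or cost == 0:
            continue  # unlimited quota, or resource not consumed at all
        limits.append(quota // cost)
    return min(limits) if limits else None


if __name__ == "__main__":
    # Hypothetical per-loadbalancer consumption, for illustration only.
    cost = {"secgroups": 1, "ports": 2}
    print(max_loadbalancers({"secgroups": 10, "ports": 50}, cost))
    print(max_loadbalancers({"secgroups": -1, "ports": -1}, cost))
```

With a security group quota of 10 and one security group per loadbalancer, the 11th loadbalancer cannot be created regardless of the tenant's own loadbalancer quota, which matches the symptom described in this report.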

I believe the octavia charm should manage setting reasonable values for all associated quota elements in the network and compute stacks for managing amphorae VMs, VIPs, and ports.

Changed in charm-octavia:
importance: Undecided → Wishlist
status: New → Triaged
James Troup (elmo) wrote :

Respectfully, I think this is a higher than wishlist bug; our default configuration is quite simply broken. If you use it, there's a landmine waiting for you the first time someone uses Octavia that is not trivial to debug if you've not run into it before.

Alex Kavanagh (ajkavanagh) wrote :

I was unsure of whether to put this as wishlist or something else. I think it needs some discussion, as I wonder if there is DOS type issue around 'unlimited' quotas, or if there is a way of tying the quota to a tenant project. I've pushed it back to undecided, as I agree that 'wishlist' would just hide it. I'm also dropping triaged to get it to pop back into new.

Changed in charm-octavia:
importance: Wishlist → Undecided
status: Triaged → New
James Page (james-page) wrote :

Reading the discussion, I think this does need some focus and that the octavia configure-resources action is a good place to implement any quota requirements.

We just need to figure out what are sane defaults.

Changed in charm-octavia:
status: New → Triaged
importance: Undecided → High
milestone: none → 20.01
James Page (james-page)
Changed in charm-octavia:
milestone: 20.01 → 20.05
Drew Freiberger (afreiberger) wrote :

FYI, tripleo solves this with
https://opendev.org/openstack/tripleo-common/commit/287f110d187b7fe508858e157d9c9ddb6b398f1e

      openstack quota set --cores -1 --ram -1 --ports -1 --instances -1 --secgroups -1 {{ auth_project_name }}

Chris Sanders (chris.sanders) wrote :

We've now seen cloud outages from this setting on several occasions. I'm marking this field high, as I think this demonstrates that the default is causing user errors and needs adjustment.

Drew Freiberger (afreiberger) wrote :

Some things to consider when deciding defaults on this should be:

1. Use of octavia as ingress for k8s-on-openstack services could result in hundreds of loadbalancers being deployed.
2. Use of unrestricted quotas will pass the buck on to placement-api hypervisor overcommit management as well as network IP availability, which seems a more logical place to handle resource limitation for what amounts to openstack networking infrastructure.
3. The flavor used for amphora along with the number of amphora expected should be considered in any non-unlimited quotas.

If we don't want to choose unlimited quotas, perhaps the octavia charm could have an "expected-max-amphora" setting (similar to ceph-mon's expected-osd-count) and the "configure-resources" action could check the current amphora flavor and expected-max-amphora count and make calculations from there to set reasonable quota defaults.
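The calculation suggested above could be sketched roughly as follows in Python. This is only an illustration: "expected-max-amphora" is the option proposed in this comment, not an existing charm setting, and the ports and security groups per amphora, the headroom factor, and all names in the code are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AmphoraFlavor:
    """Resources consumed by one amphora VM (illustrative values only)."""
    vcpus: int
    ram_mb: int


def suggest_quotas(expected_max_amphora: int,
                   flavor: AmphoraFlavor,
                   ports_per_amphora: int = 2,       # assumption
                   secgroups_per_amphora: int = 1,   # assumption
                   headroom: float = 1.25) -> Dict[str, int]:
    """Derive services-project quotas from the expected amphora count."""
    if expected_max_amphora < 0:
        raise ValueError("expected_max_amphora must be non-negative")
    # Round up so the headroom never shrinks the requested count.
    count = math.ceil(expected_max_amphora * headroom)
    return {
        "instances": count,
        "cores": count * flavor.vcpus,
        "ram": count * flavor.ram_mb,
        "ports": count * ports_per_amphora,
        "secgroups": count * secgroups_per_amphora,
    }


if __name__ == "__main__":
    # Hypothetical flavor: 1 vCPU and 1024 MB of RAM per amphora.
    print(suggest_quotas(200, AmphoraFlavor(vcpus=1, ram_mb=1024)))
```

The keys mirror the option names of the openstack quota set command quoted earlier in this bug, so the result could be fed to that command by whatever action applies the quotas.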

James Page (james-page) wrote :

I think unlimited quotas for the octavia/service project is OK given that octavia has a quota system for the loadbalancers which is enforced for each project consuming them. That gives operators the required control points to limit the amount of resource a single project can consume on the cloud, without limiting the underlying service account in terms of ports/cpu/ram.

David Ames (thedac)
Changed in charm-octavia:
milestone: 20.05 → 20.08
Tori Hegarty (ptoridactyl) wrote :

We have seen another recurrence of this outage. Would it be possible to have someone assigned to this bug?

David Ames (thedac) wrote :

Reading through the bug it seems James has signed off on unlimited quotas for the octavia/service project in comment #7 and suggested the place to do this is in configure-resources action in comment #3.

Drew pointed out what we need in comment #4:

      openstack quota set --cores -1 --ram -1 --ports -1 --instances -1 --secgroups -1 {{ auth_project_name }}

So now we just need to do it.
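As a rough sketch of what "doing it" could look like, the Python snippet below only builds the argument list for the command quoted above; it does not run it. The function name and the "services" project name in the usage lines are assumptions for illustration, and this is not the code that was merged into the charm.

```python
import shlex
from typing import List

UNLIMITED = -1
# Resource options taken from the command quoted in this bug.
RESOURCES = ("cores", "ram", "ports", "instances", "secgroups")


def build_unlimited_quota_command(project: str) -> List[str]:
    """Build the `openstack quota set` argv that lifts quotas for a project."""
    if not project:
        raise ValueError("project name must not be empty")
    argv = ["openstack", "quota", "set"]
    for name in RESOURCES:
        argv += ["--" + name, str(UNLIMITED)]
    argv.append(project)
    return argv


if __name__ == "__main__":
    # "services" is a hypothetical project name; substitute the real one.
    print(shlex.join(build_unlimited_quota_command("services")))
```

Returning a list rather than a shell string keeps the project name from being re-parsed by a shell if the command is later passed to something like subprocess.run.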

Changed in charm-octavia:
assignee: nobody → David Ames (thedac)
Changed in charm-octavia:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-octavia (master)

Reviewed: https://review.opendev.org/736868
Committed: https://git.openstack.org/cgit/openstack/charm-octavia/commit/?id=32ef4539eb88020cc2c17473273ae8e8d01f9ad6
Submitter: Zuul
Branch: master

commit 32ef4539eb88020cc2c17473273ae8e8d01f9ad6
Author: David Ames <email address hidden>
Date: Thu Jun 18 16:27:28 2020 -0700

    Set services quotas to unlimited

    Automatically set services project quotas to unlimited to enable production
    readiness for Octavia loadbalancers.

    Closes-Bug: #1850985
    Change-Id: I14aa6456159a6e7c205ebd0eff62437db9afa13b

Changed in charm-octavia:
status: In Progress → Fix Committed
Changed in charm-octavia:
status: Fix Committed → Fix Released