OpenStack Compute (nova)

Nova creates duplicate Neutron ports on instance reschedule

Bug #1609526 reported by Major Hayden on 2016-08-03

This bug report is a duplicate of: Bug #1639230: reschedule fails with ip already allocated error.

This bug affects 20 people

Affects                     Status          Importance   Assigned to   Milestone
OpenStack Compute (nova)    Fix Committed   Medium       Liyingjun

Bug Description

Consider this environment:

* Running stable/mitaka (latest available)
* Four hypervisors
* Two glance nodes (A and B)
* The glance nodes are storing images locally but the image files aren't in sync between both hosts

When I request a new instance, the following happens:

* Instance is scheduled to hypervisor A
* Hypervisor A checks to see if the image is available for use -- SUCCESS
* Hypervisor A calls neutron for a network port -- SUCCESS
* Hypervisor A tries to download image from glance server A -- FAILURE (glance server A doesn't have the image cached on its filesystem)
* Instance is rescheduled to hypervisor B
* Hypervisor B checks to see if the image is available for use -- SUCCESS
* Hypervisor B calls neutron for a network port -- SUCCESS
* Hypervisor B downloads an image from glance server B -- SUCCESS (glance server B has the image on its filesystem)

The instance will come up on hypervisor B with two ports attached to the instance. The second one (requested by hypervisor B) will be up and fully functional. The first port (requested by hypervisor A) will be marked as 'down' and won't be usable.

It seems like nova-compute should call neutron to say "I don't need that network port any longer since I can't get what I need to build the rest of the instance" and clean up that port. Without the cleanup, an instance can end up with a lot of ports attached and potentially waste a lot of IPv4 address space.
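To make the suggested cleanup concrete, here is a minimal sketch of the sequence above. It is a toy model only: `FakeNeutron`, `build_on_host`, `schedule`, and `ports_for` are hypothetical names invented for illustration and are not nova or Neutron code. It assumes a build fails after the port has been allocated, which matches the image download failure described in this report.

```python
import itertools


class FakeNeutron:
    """In-memory stand-in for a port service (hypothetical, not the Neutron API)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.ports = {}  # port_id -> instance_id

    def create_port(self, instance_id):
        port_id = f"port-{next(self._ids)}"
        self.ports[port_id] = instance_id
        return port_id

    def delete_port(self, port_id):
        self.ports.pop(port_id, None)


class BuildFailed(Exception):
    """Raised when a host cannot finish building the instance."""


def build_on_host(neutron, instance_id, image_available, cleanup):
    # The port is allocated before the image download is attempted.
    port_id = neutron.create_port(instance_id)
    try:
        if not image_available:
            raise BuildFailed("image download failed")
    except BuildFailed:
        # The cleanup proposed in this report: release the port before
        # the instance is handed back for rescheduling.
        if cleanup:
            neutron.delete_port(port_id)
        raise
    return port_id


def schedule(neutron, instance_id, hosts, cleanup):
    """Try each (name, image_available) host in order, rescheduling on failure."""
    for name, image_available in hosts:
        try:
            build_on_host(neutron, instance_id, image_available, cleanup)
            return name
        except BuildFailed:
            continue
    raise BuildFailed("no valid host")


def ports_for(neutron, instance_id):
    """Return the sorted port ids currently recorded for an instance."""
    return sorted(p for p, i in neutron.ports.items() if i == instance_id)
```

Calling `schedule(FakeNeutron(), "vm-1", [("A", False), ("B", True)], cleanup=False)` lands the instance on B with two ports recorded for it, which is the behavior described above. With `cleanup=True` only the port requested on B remains.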

I wrote more details on this issue here: https://major.io/2016/08/03/openstack-instances-come-online-with-multiple-network-ports-attached/

Tags:

Major Hayden (rackerhacker) on 2016-08-03

summary:

- nova doesn't clean up network ports when an image fails to download from
+ nova should clean up network ports when an image fails to download from
glance

Rui Chen (kiwik-chenrui) wrote on 2016-09-01: Re: nova should clean up network ports when an image fails to download from glance

Today I hit the same issue in my devstack.

stack@szxbzci0004 ~/nova (master *) $ git log -1
commit e9d503a1202fadd5163e343424cf15285f5dc016
Merge: 5426d95 a6ad102
Author: Jenkins <email address hidden>
Date: Thu Sep 1 03:15:49 2016 +0000

Merge "Update placement config reno"

I have two compute nodes, but one of them (A) has an RBD configuration issue. When libvirt tries to launch the instance there, a LibvirtError is raised and the instance is rescheduled to another compute node (B). The Linux bridge isn't cleaned up on compute node A. The instance launches successfully on compute node B, but a port is allocated again, so the instance runs with two ports.

See my operation details:
http://paste.openstack.org/show/565674/

Changed in nova:
status:	New → Confirmed

Zhenyu Zheng (zhengzhenyu) on 2016-09-02

Changed in nova:
assignee:	nobody → Zhenyu Zheng (zhengzhenyu)

cloudbuilders (operations-8) wrote on 2016-09-06:

We've come across this problem as well.
We have 4 Glance nodes, with the images mounted on an NFS volume. One of the Glance instances went down, and it failed to mount the NFS volume when it rebooted. We started having VMs with more than one port assigned (showing more than one IP per VM in Horizon).

It seems to us that Nova should tell Neutron either to delete the unused port or to update it instead of creating a new one.
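The "update it instead of creating a new one" option could look roughly like the sketch below. This is illustrative only: `ensure_port` is a hypothetical helper, not a nova or Neutron function, and ports are modeled as plain dicts. The keys `device_id` and `binding:host_id` follow the Neutron port attributes of the same names; everything else is an assumption.

```python
def ensure_port(ports, instance_id, host, new_port_id):
    """Reuse the instance's existing port if there is one, otherwise create it.

    ports is a list of dicts with 'id', 'device_id' and 'binding:host_id' keys.
    """
    for port in ports:
        if port["device_id"] == instance_id:
            # A port from an earlier build attempt exists: rebind it to the
            # new host instead of allocating a second one.
            port["binding:host_id"] = host
            return port
    port = {"id": new_port_id, "device_id": instance_id, "binding:host_id": host}
    ports.append(port)
    return port
```

With this approach a reschedule from host A to host B would leave a single port whose binding points at B, so no extra IP address is consumed.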

Maciej Szankin (mszankin) wrote on 2016-11-18:

Zhenyu Zheng, how is the work going? It has been some time since your last activity. If you are actively working on this item, can you confirm? Otherwise, please unassign yourself.

Roman Podoliaka (rpodolyaka) on 2016-12-15

summary:

- nova should clean up network ports when an image fails to download from
- glance
+ Nova creates duplicate Neutron ports on instance reschedule

Changed in nova:
importance:	Undecided → Medium

Eugene Nikanorov (enikanorov) on 2016-12-19

Changed in nova:
assignee:	Zhenyu Zheng (zhengzhenyu) → nobody

Piyush Srivastava (piyush0101) wrote on 2017-05-18:

We have run into this issue on Mitaka as well. It's not happening for the same reason, i.e. a glance image failing to download.

For us, one of the hypervisors did not have proper virt enabled, which caused the instance launch to fail on that hypervisor and the instance to be rescheduled on a different one. However, the port that was created while the instance was attempting to launch on the first hypervisor was still there and had not been cleaned up.

The result was two ports attached to the instance, with only one of them in use.
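For anyone auditing an affected deployment, a rough way to flag candidates is to group ports by instance and look for instances that have more than one port with at least one of them down. The sketch below works on a plain list of port dicts. `find_stale_ports` is a hypothetical helper, and the `id`, `device_id` and `status` keys are assumed to mirror the fields of a Neutron port listing. It is a heuristic only: an instance can legitimately have several ports, and a down port is not necessarily a leftover.

```python
from collections import defaultdict


def find_stale_ports(ports):
    """Map instance id -> ids of DOWN ports, for instances with more than one port."""
    by_device = defaultdict(list)
    for port in ports:
        if port["device_id"]:  # skip ports not attached to anything
            by_device[port["device_id"]].append(port)

    stale = {}
    for device_id, device_ports in by_device.items():
        down = [p["id"] for p in device_ports if p["status"] == "DOWN"]
        if len(device_ports) > 1 and down:
            stale[device_id] = down
    return stale
```

The result should be reviewed by hand before deleting anything, since the function cannot tell a leftover port from one that is intentionally down.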

Liyingjun (liyingjun) on 2017-05-24

Changed in nova:
assignee:	nobody → Liyingjun (liyingjun)

OpenStack Infra (hudson-openstack) wrote on 2017-05-24: Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/467509

Changed in nova:
status:	Confirmed → In Progress

Jirayut Nimsaeng (winggundamth) wrote on 2017-06-26:

I'm sorry. It seems my touchpad has a problem and is clicking on its own.

information type:	Public → Public Security
information type:	Public Security → Private Security
information type:	Private Security → Public

Sean Dague (sdague) wrote on 2017-06-27:

Automatically discovered version mitaka in description. If this is incorrect, please update the description to include 'nova version: ...'

tags:

added: openstack-version.mitaka

melanie witt (melwitt) wrote on 2017-07-13:

This might be a duplicate of https://bugs.launchpad.net/nova/+bug/1639230 that was fixed by https://review.openstack.org/393805

OpenStack Infra (hudson-openstack) wrote on 2017-07-14: Change abandoned on nova (master)

Change abandoned by Li Yingjun (<email address hidden>) on branch: master
Review: https://review.openstack.org/467509
Reason: confirmed, already fixed in https://review.openstack.org/#/c/393805/

Liyingjun (liyingjun) on 2017-07-14

Changed in nova:
status:	In Progress → Fix Committed
