l2pop RPC code throwing an exception in fdb_chg_ip_tun()

Bug #1367881 reported by Brian Haley
This bug affects 3 people
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Eugene Nikanorov

Bug Description

I'm seeing an error in the l2pop code where it's failing to add a flow for the ARP entry responder.

This sometimes leads to DHCP failures for VMs, although a soft reboot typically fixes that problem.

Here is the trace:

2014-09-10 15:10:36.954 9351 ERROR neutron.agent.linux.ovs_lib [req-de0c2985-1fac-46a8-a42b-f0bad5a43805 None] OVS flows could not be applied on bridge br-tun
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib Traceback (most recent call last):
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 407, in _fdb_chg_ip
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib self.local_ip, self.local_vlan_map)
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/common/log.py", line 36, in wrapper
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib return method(*args, **kwargs)
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/l2population_rpc.py", line 250, in fdb_chg_ip_tun
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib for mac, ip in after:
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib TypeError: 'NoneType' object is not iterable
2014-09-10 15:10:36.954 9351 TRACE neutron.agent.linux.ovs_lib
2014-09-10 15:10:36.955 9351 ERROR oslo.messaging.rpc.dispatcher [req-de0c2985-1fac-46a8-a42b-f0bad5a43805 ] Exception during message handling: 'NoneType' object is not iterable
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/common/log.py", line 36, in wrapper
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher return method(*args, **kwargs)
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/l2population_rpc.py", line 55, in update_fdb_entries
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher self.fdb_update(context, fdb_entries)
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/common/log.py", line 36, in wrapper
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher return method(*args, **kwargs)
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/l2population_rpc.py", line 212, in fdb_update
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher getattr(self, method)(context, values)
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 407, in _fdb_chg_ip
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher self.local_ip, self.local_vlan_map)
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/common/log.py", line 36, in wrapper
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher return method(*args, **kwargs)
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/l2population_rpc.py", line 250, in fdb_chg_ip_tun
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher for mac, ip in after:
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher TypeError: 'NoneType' object is not iterable
2014-09-10 15:10:36.955 9351 TRACE oslo.messaging.rpc.dispatcher
2014-09-10 15:10:36.957 9351 ERROR oslo.messaging._drivers.common [req-de0c2985-1fac-46a8-a42b-f0bad5a43805 ] Returning exception 'NoneType' object is not iterable to caller

I don't know this code well enough to suggest a fix, whether that means checking the values returned from agent_ports.items() more carefully or tracking down a bug elsewhere, so any help would be appreciated.
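
For reference, the TypeError at the bottom of the trace can be reproduced outside neutron in a few lines. This is a minimal sketch, not the neutron code: the payload layout, the sample MAC/IP pair, and the function name apply_change_unguarded are assumptions made for illustration. Only the `for mac, ip in after:` loop comes from the trace.

```python
# Hypothetical notification for one remote agent: it carries a "before"
# list but no "after" key at all (assumed shape, for illustration only).
change = {"before": [("fa:16:3e:00:00:01", "10.0.0.5")]}


def apply_change_unguarded(change):
    """Iterate the 'after' entries without checking that they exist."""
    added = []
    after = change.get("after")  # dict.get returns None for a missing key
    for mac, ip in after:        # iterating None raises TypeError
        added.append((mac, ip))
    return added


try:
    apply_change_unguarded(change)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The point of the sketch is only that dict.get() without a default yields None, and a for loop over None fails in the same way as the trace shows.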

Tags: l2-pop
Changed in neutron:
importance: Undecided → High
tags: added: l2-pop
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/120976

Changed in neutron:
assignee: nobody → Eugene Nikanorov (enikanorov)
status: New → In Progress
Miguel Angel Ajo (mangelajo) wrote : Ipset

   Hi Brian!,

   Thank you for your comments and reviews on the ipset patch. If we merge the feature
(under FFE) it should be approved by today.

   If you see anything that looks like a nit, please propose it
under 120806 (the follow-up refactor) instead of 111877.

   If we're in time this morning, we're thinking about squashing it back:
111877 (feature) <- 120087 (functional test) <- 120806 (refactor; doesn't change the logic,
just moves functionality around).

I'll be available on IRC around 11 CEST, maybe earlier.

Best regards,
Miguel Ángel.

Kyle Mestery (mestery)
Changed in neutron:
milestone: none → kilo-1
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/120976
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c5ae5ad2637a561ccbc7484045e99d8140822b8d
Submitter: Jenkins
Branch: master

commit c5ae5ad2637a561ccbc7484045e99d8140822b8d
Author: Eugene Nikanorov <email address hidden>
Date: Fri Sep 12 09:05:39 2014 +0400

    Properly handle empty before/after notifications in l2pop code

    Change-Id: I8644bb7cc2afb3b181397a478f96927990c0a4ca
    Closes-Bug: #1367881
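
One way to handle a notification that lacks a "before" or "after" list is to substitute an empty list before iterating. The sketch below illustrates that idea only. It is not the merged patch, and the helper name split_change, the dict layout, and the sample MAC/IP pair are hypothetical.

```python
def split_change(change):
    """Return (to_add, to_remove), treating a missing or None list as empty."""
    # `or []` covers both an absent key and an explicit None value.
    to_add = list(change.get("after") or [])
    to_remove = list(change.get("before") or [])
    return to_add, to_remove


# Hypothetical notification with only a "before" list.
only_before = {"before": [("fa:16:3e:00:00:01", "10.0.0.5")]}
to_add, to_remove = split_change(only_before)

for mac, ip in to_add:      # empty list: the loop body never runs
    print("add", mac, ip)
for mac, ip in to_remove:
    print("remove", mac, ip)
```

Normalizing to an empty list keeps the two loops unchanged and avoids a separate None check at each call site.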

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/lbaasv2)

Fix proposed to branch: feature/lbaasv2
Review: https://review.openstack.org/130864

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/lbaasv2)

Reviewed: https://review.openstack.org/130864
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c089154a94e5872efc95eab33d3d0c9de8619fe4
Submitter: Jenkins
Branch: feature/lbaasv2

commit 62588957fbeccfb4f80eaa72bef2b86b6f08dcf8
Author: Kevin Benton <email address hidden>
Date: Wed Oct 22 13:04:03 2014 -0700

    Big Switch: Switch to TLSv1 in server manager

    Switch to TLSv1 for the connections to the backend
    controllers. The default SSLv3 is no longer considered
    secure.

    TLSv1 was chosen over .1 or .2 because the .1 and .2 weren't
    added until python 2.7.9 so TLSv1 is the only compatible option
    for py26.

    Closes-Bug: #1384487
    Change-Id: I68bd72fc4d90a102003d9ce48c47a4a6a3dd6e03

commit 17204e8f02fdad046dabdb8b31397289d72c877b
Author: OpenStack Proposal Bot <email address hidden>
Date: Wed Oct 22 06:20:15 2014 +0000

    Imported Translations from Transifex

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: I58db0476c810aa901463b07c42182eef0adb5114

commit d712663b99520e6d26269b0ca193527603178742
Author: Carl Baldwin <email address hidden>
Date: Mon Oct 20 21:48:42 2014 +0000

    Move disabling of metadata and ipv6_ra to _destroy_router_namespace

    I noticed that disable_ipv6_ra is called from the wrong place and that
    in some cases it was called with a bogus router_id because the code
    made an incorrect assumption about the context. In other case, it was
    never called because _destroy_router_namespace was being called
    directly. This patch moves the disabling of metadata and ipv6_ra in
    to _destroy_router_namespace to ensure they get called correctly and
    avoid duplication.

    Change-Id: Ia76a5ff4200df072b60481f2ee49286b78ece6c4
    Closes-Bug: #1383495

commit f82a5117f6f484a649eadff4b0e6be9a5a4d18bb
Author: OpenStack Proposal Bot <email address hidden>
Date: Tue Oct 21 12:11:19 2014 +0000

    Updated from global requirements

    Change-Id: Idcbd730f5c781d21ea75e7bfb15959c8f517980f

commit be6bd82d43fbcb8d1512d8eb5b7a106332364c31
Author: Angus Lees <email address hidden>
Date: Mon Aug 25 12:14:29 2014 +1000

    Remove duplicate import of constants module

    .. and enable corresponding pylint check now the only offending instance
    is fixed.

    Change-Id: I35a12ace46c872446b8c87d0aacce45e94d71bae

commit 9902400039018d77aa3034147cfb24ca4b2353f6
Author: rajeev <email address hidden>
Date: Mon Oct 13 16:25:36 2014 -0400

    Fix race condition on processing DVR floating IPs

    Fip namespace and agent gateway port can be shared by multiple dvr routers.
    This change uses a set as the control variable for these shared resources
    and ensures that Test and Set operation on the control variable are
    performed atomically so that race conditions do not occur among
    multiple threads processing floating IPs.
    Limitation: The scope of this change is limited to addressing the race
    condition described in the bug report. It may not address other issues
    such as pre-existing issue wit...

Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/166931

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/166931
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=cd4d2f03e83c326f849a0b18a666a406087c94e7
Submitter: Jenkins
Branch: master

commit cd4d2f03e83c326f849a0b18a666a406087c94e7
Author: Eugene Nikanorov <email address hidden>
Date: Mon Mar 23 07:51:06 2015 +0400

    Fix handling of before/after notifications in linuxbridge agent

    Avoid problem similar to described in bug #1367881

    Change-Id: I76059469c20be9161743ba730e46da1789ded4a8
    Closes-Bug: #1407887
    Related-Bug: #1367881

Thierry Carrez (ttx)
Changed in neutron:
milestone: kilo-1 → 2015.1.0