Comment 4 for bug 1667756

Aaron Smith (aaron-smith) wrote : Re: [Bug 1667756] Re: Backup HA router sending traffic, traffic from switch interrupted

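Hi Ann,

We are seeing this with the Neutron 9.1.0 packages from OSP10. Package list
from one of the controllers: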

[root@overcloud-controller-2 ~]# rpm -qa | grep neut
openstack-neutron-lbaas-9.1.0-1.el7ost.noarch
openstack-neutron-ml2-9.1.0-8.el7ost.noarch
python-neutronclient-6.0.0-2.el7ost.noarch
python-neutron-lib-0.4.0-1.el7ost.noarch
openstack-neutron-bigswitch-lldp-9.40.0-1.1.el7ost.noarch
openstack-neutron-metering-agent-9.1.0-8.el7ost.noarch
puppet-neutron-9.4.2-1.el7ost.noarch
python-neutron-9.1.0-8.el7ost.noarch
python-neutron-tests-9.1.0-8.el7ost.noarch
python-neutron-lbaas-9.1.0-1.el7ost.noarch
openstack-neutron-openvswitch-9.1.0-8.el7ost.noarch
openstack-neutron-bigswitch-agent-9.40.0-1.1.el7ost.noarch
openstack-neutron-sriov-nic-agent-9.1.0-8.el7ost.noarch
openstack-neutron-common-9.1.0-8.el7ost.noarch
openstack-neutron-9.1.0-8.el7ost.noarch

Aaron

On Wed, Mar 1, 2017 at 5:25 PM, Ann Taraday <email address hidden>
wrote:

> On what version of Neutron has this behavior been seen?
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1667756
>
> Title:
> Backup HA router sending traffic, traffic from switch interrupted
>
> Status in neutron:
> New
>
> Bug description:
> As outlined in https://review.openstack.org/#/c/142843/, backup HA
> routers should not send any traffic. Any traffic will cause the
> connected switch to learn a new port for the associated src mac
> address since the mac address will be in use on the primary HA router.
>
> We are observing backup routers sending IPv6 RA and RS messages
> probably in response to incoming IPv6 RA messages. The subnets
> associated with the HA routers are not intended for IPv6 traffic.
>
> A typical traffic sequence is:
>
> Packet from external switch...
> 08:81:f4:a6:dc:01 > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length
> 110: (hlim 255, next-header ICMPv6 (58) payload length: 56)
> fe80:52:0:136c::fe > ff02::1: [icmp6 sum ok] ICMP6, router advertisement,
> length 56
>
> Immediately followed by a packet from the backup HA router...
> fa:16:3e:a7:ae:63 > 33:33:ff:a7:ae:63, ethertype IPv6 (0x86dd), length
> 86: (hlim 1, next-header Options (0) payload length: 32) :: >
> ff02::1:ffa7:ae63: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6,
> multicast listener reportmax resp delay: 0 addr: ff02::1:ffa7:ae63
>
> Another packet...
> fa:16:3e:a7:ae:63 > 33:33:ff:a7:ae:63, ethertype IPv6 (0x86dd), length
> 78: (hlim 255, next-header ICMPv6 (58) payload length: 24) :: >
> ff02::1:ffa7:ae63: [icmp6 sum ok] ICMP6, neighbor solicitation, length 24,
> who has 2620:52:0:136c:f816:3eff:fea7:ae63
>
> Another packet...
> fa:16:3e:a7:ae:63 > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length
> 86: (hlim 255, next-header ICMPv6 (58) payload length: 32)
>
> At this point, the switch has updated its mac table and traffic to the
> fa:16:3e:a7:ae:63 address has been redirected to the backup host.
> SSH/ping traffic resumes at a later time when the primary router node
> sends traffic with the fa:16:3e:a7:ae:63 source address.
>
> This problem is reproducible in our environment as follows:
>
> 1. Deploy OSP10
> 2. Create external network
> 3. Create external subnet (IPv4)
> 4. Create an internal network and VM
> 5. Attach a floating IP
> 6. SSH into the VM through the FIP, or ping the FIP
> 7. The SSH session freezes, or the ping fails, intermittently
>
>
> Additional info:
> Setting accept_ra=0 on the backup host routers stops the problem from
> happening. Unfortunately, we lose the setting on a reboot, even though
> the current sysctl files have accept_ra=0.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/neutron/+bug/1667756/+subscriptions
>
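
To check the accept_ra state mentioned in the additional info quoted above,
something like the following could be used. It is only a rough sketch: the
function names are made up for illustration, and it assumes the standard
Linux layout under /proc/sys/net/ipv6/conf. It reports interfaces whose
accept_ra value is not 0; it does not change anything.

```python
from pathlib import Path

# Standard Linux location of the per-interface IPv6 settings.
DEFAULT_CONF_ROOT = "/proc/sys/net/ipv6/conf"


def read_accept_ra(conf_root=DEFAULT_CONF_ROOT):
    """Return {interface name: accept_ra value} for each entry under conf_root."""
    values = {}
    for entry in sorted(Path(conf_root).iterdir()):
        knob = entry / "accept_ra"
        # Skip entries that have no accept_ra file.
        if knob.is_file():
            values[entry.name] = int(knob.read_text().strip())
    return values


def interfaces_with_nonzero_accept_ra(values):
    """Return the sorted names whose accept_ra is not 0 (hypothetical helper)."""
    return sorted(name for name, value in values.items() if value != 0)
```

Calling interfaces_with_nonzero_accept_ra(read_accept_ra()) lists the
interfaces to look at. These settings are kept per network namespace, so to
inspect a router the script would have to run inside that router's namespace
(for example through "ip netns exec"). A value of 1 only accepts RAs while
forwarding is disabled, so a nonzero value is a hint, not proof.
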

--
Aaron Smith | Senior Principal Software Engineer
NFV Partner Engineering
Red Hat
<email address hidden>
