neutron

Activity log for bug #1738768

Date	Who	What changed	Old value	New value	Message
2017-12-18 11:28:52	Daniel Alvarez	bug			added bug
2017-12-18 11:33:00	Daniel Alvarez	description	I have deployed a 3 controllers - 3 computes HA environment with ML2/OVS and observed dataplane downtime when restarting/stopping neutron-l3 container on controllers. This is what I did: 1. Created a network, subnet, router, a VM and attached a FIP to the VIM 2. Left a ping running on the undercloud to the FIP 3. Stopped l3 container in controller-0. Result: Observed some packet loss while the router was failed over to controller-1 4. Stopped l3 container in controller-1 Result: Observed some packet loss while the router was failed over to controller-2 5. Stopped l3 container in controller-2 Result: No traffic to/from the FIP at all. (overcloud) [stack@undercloud ~]$ ping 10.0.0.131 PING 10.0.0.131 (10.0.0.131) 56(84) bytes of data. 64 bytes from 10.0.0.131: icmp_seq=1 ttl=63 time=1.83 ms 64 bytes from 10.0.0.131: icmp_seq=2 ttl=63 time=1.56 ms <---- Last l3 container was stopped here (step 5) in the above description ----> From 10.0.0.1 icmp_seq=10 Destination Host Unreachable From 10.0.0.1 icmp_seq=11 Destination Host Unreachable When containers are stopped, I guess that the qrouter namespace is not accessible by the kernel: [heat-admin@overcloud-controller-2 ~]$ sudo ip netns e qrouter-5244e91c-f533-4128-9289-f37c9656792c ip a RTNETLINK answers: Invalid argument RTNETLINK answers: Invalid argument setting the network namespace "qrouter-5244e91c-f533-4128-9289-f37c9656792c" failed: Invalid argument This means that not only we're getting controlplane downtime but also dataplane which could be seen as a regression when compared to non-containerized environments. The same would happen with DHCP and I expect instances not being able to fetch IP addresses from dnsmasq when dhcp containers are stopped.	I have deployed a 3 controllers - 3 computes HA environment with ML2/OVS and observed dataplane downtime when restarting/stopping neutron-l3 container on controllers. This is what I did: 1. Created a network, subnet, router, a VM and attached a FIP to the VM 2. Left a ping running on the undercloud to the FIP 3. Stopped l3 container in controller-0. Result: Observed some packet loss while the router was failed over to controller-1 4. Stopped l3 container in controller-1 Result: Observed some packet loss while the router was failed over to controller-2 5. Stopped l3 container in controller-2 Result: No traffic to/from the FIP at all. (overcloud) [stack@undercloud ~]$ ping 10.0.0.131 PING 10.0.0.131 (10.0.0.131) 56(84) bytes of data. 64 bytes from 10.0.0.131: icmp_seq=1 ttl=63 time=1.83 ms 64 bytes from 10.0.0.131: icmp_seq=2 ttl=63 time=1.56 ms <---- Last l3 container was stopped here (step 5 above)----> From 10.0.0.1 icmp_seq=10 Destination Host Unreachable From 10.0.0.1 icmp_seq=11 Destination Host Unreachable When containers are stopped, I guess that the qrouter namespace is not accessible by the kernel: [heat-admin@overcloud-controller-2 ~]$ sudo ip netns e qrouter-5244e91c-f533-4128-9289-f37c9656792c ip a RTNETLINK answers: Invalid argument RTNETLINK answers: Invalid argument setting the network namespace "qrouter-5244e91c-f533-4128-9289-f37c9656792c" failed: Invalid argument This means that not only we're getting controlplane downtime but also dataplane which could be seen as a regression when compared to non-containerized environments. The same would happen with DHCP and I expect instances not being able to fetch IP addresses from dnsmasq when dhcp containers are stopped.
2017-12-18 18:53:26	Brian Haley	bug			added subscriber Brian Haley
2017-12-19 01:15:11	Lujin Luo	neutron: status	New	Incomplete
2017-12-20 00:32:58	Lujin Luo	tags		l3
2017-12-20 09:55:20	Toni Freger	bug			added subscriber Toni Freger
2017-12-21 15:36:23	Assaf Muller	neutron: status	Incomplete	Confirmed
2017-12-21 15:36:34	Assaf Muller	neutron: importance	Undecided	Critical
2017-12-21 15:37:26	Assaf Muller	bug task added		tripleo
2017-12-21 15:37:36	Assaf Muller	tripleo: status	New	Confirmed
2017-12-22 00:14:22	Emilien Macchi	tripleo: status	Confirmed	Triaged
2017-12-22 00:14:38	Emilien Macchi	tripleo: importance	Undecided	High
2017-12-22 00:14:43	Emilien Macchi	tripleo: milestone		queens-3
2018-01-26 00:53:39	Emilien Macchi	tripleo: milestone	queens-3	queens-rc1
2018-02-09 15:01:07	Brent Eagles	tripleo: assignee		Brent Eagles (beagles)
2018-02-09 18:03:03	OpenStack Infra	tripleo: status	Triaged	In Progress
2018-03-02 20:24:19	Alex Schultz	tripleo: milestone	queens-rc1	rocky-1
2018-03-14 16:18:48	OpenStack Infra	tripleo: assignee	Brent Eagles (beagles)	Jiří Stránský (jistr)
2018-03-14 16:20:13	Jiří Stránský	tripleo: assignee	Jiří Stránský (jistr)	Brent Eagles (beagles)
2018-03-16 11:37:59	OpenStack Infra	tripleo: assignee	Brent Eagles (beagles)	Jiří Stránský (jistr)
2018-03-26 15:09:11	OpenStack Infra	tags	l3	in-stable-queens l3
2018-04-20 17:41:13	Alex Schultz	tripleo: milestone	rocky-1	rocky-2
2018-06-05 19:05:54	Emilien Macchi	tripleo: milestone	rocky-2	rocky-3
2018-07-26 13:43:30	Emilien Macchi	tripleo: milestone	rocky-3	rocky-rc1
2018-08-14 15:03:31	Alex Schultz	tripleo: milestone	rocky-rc1	stein-1
2018-10-30 16:27:00	Juan Antonio Osorio Robles	tripleo: milestone	stein-1	stein-2
2019-01-13 22:46:12	Emilien Macchi	tripleo: milestone	stein-2	stein-3
2019-03-14 02:32:08	Alex Schultz	tripleo: milestone	stein-3	stein-rc1
2019-04-12 09:14:15	Jiří Stránský	tripleo: status	In Progress	Fix Released
2019-04-12 09:16:51	Bernard Cafarelli	neutron: status	Confirmed	Invalid
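
The bug description identifies the dataplane outage by reading ping output: replies stop and "Destination Host Unreachable" lines begin once the last l3 container is stopped. A minimal sketch of how that reading could be automated is given below. It is not part of the bug report or of any neutron or tripleo tooling; the function names `summarize_ping` and `outage_gap` are hypothetical, and the sample lines are copied from the description above.

```python
import re

# Patterns for the two kinds of ping lines quoted in the bug description.
REPLY = re.compile(r"bytes from \S+ icmp_seq=(\d+)")
UNREACHABLE = re.compile(r"icmp_seq=(\d+) Destination Host Unreachable")

# Sample lines taken from the description of bug #1738768.
SAMPLE = [
    "PING 10.0.0.131 (10.0.0.131) 56(84) bytes of data.",
    "64 bytes from 10.0.0.131: icmp_seq=1 ttl=63 time=1.83 ms",
    "64 bytes from 10.0.0.131: icmp_seq=2 ttl=63 time=1.56 ms",
    "<---- Last l3 container was stopped here (step 5 above)---->",
    "From 10.0.0.1 icmp_seq=10 Destination Host Unreachable",
    "From 10.0.0.1 icmp_seq=11 Destination Host Unreachable",
]


def summarize_ping(lines):
    """Return (replied, unreachable): lists of icmp_seq numbers by outcome."""
    replied, unreachable = [], []
    for line in lines:
        match = UNREACHABLE.search(line)
        if match:
            unreachable.append(int(match.group(1)))
            continue
        match = REPLY.search(line)
        if match:
            replied.append(int(match.group(1)))
    return replied, unreachable


def outage_gap(lines):
    """Return (last replied seq before the outage, first unreachable seq).

    Returns None when no unreachable line is present. The first element is
    None when no reply precedes the first unreachable line.
    """
    replied, unreachable = summarize_ping(lines)
    if not unreachable:
        return None
    first_bad = min(unreachable)
    earlier = [seq for seq in replied if seq < first_bad]
    last_good = max(earlier) if earlier else None
    return last_good, first_bad


if __name__ == "__main__":
    print(summarize_ping(SAMPLE))
    print(outage_gap(SAMPLE))
```

The sketch only classifies lines in the two formats quoted in the description; other ping implementations may print different messages, so the patterns would need adjusting for them.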