Activity log for bug #1899369

Date Who What changed Old value New value Message
2020-10-11 19:06:39 Frode Nordahl bug added bug
2020-10-11 19:07:56 Frode Nordahl description
Old value:
A change [0] prior to the release of OVN v20.03.0 introduced a change of behavior where the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this normally is a unix socket the default was not to have a inactivity probe at all. On a busy system a inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch. The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783
New value:
A change [0] prior to the release of OVN v20.03.0 introduced a change of behavior where the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this normally is a unix socket the default was not to have a inactivity probe at all. On a busy system a inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch. The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem is that a OVN controller is either unable to do initial programming of a switch for a host with many ports and flows or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783
2020-10-12 07:18:12 Frode Nordahl ovn (Ubuntu): status New Triaged
2020-10-12 07:18:15 Frode Nordahl ovn (Ubuntu): importance Undecided High
2020-11-06 08:34:49 Launchpad Janitor merge proposal linked https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393432
2020-11-06 08:45:09 Launchpad Janitor merge proposal linked https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393433
2020-11-06 08:54:36 Launchpad Janitor merge proposal linked https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393435
2020-11-06 11:17:15 Launchpad Janitor merge proposal linked https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393441
2020-11-06 11:23:01 Launchpad Janitor merge proposal linked https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393442
2020-11-06 11:30:42 Launchpad Janitor merge proposal linked https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393443
2020-11-09 09:27:20 James Page description
Old value:
A change [0] prior to the release of OVN v20.03.0 introduced a change of behavior where the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this normally is a unix socket the default was not to have a inactivity probe at all. On a busy system a inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch. The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem is that a OVN controller is either unable to do initial programming of a switch for a host with many ports and flows or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783
New value:
[Impact]
[Test Case]
[Regression Potential]
[Original Bug Report]
A change [0] prior to the release of OVN v20.03.0 introduced a change of behavior where the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this normally is a unix socket the default was not to have a inactivity probe at all. On a busy system a inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch. The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem is that a OVN controller is either unable to do initial programming of a switch for a host with many ports and flows or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783
2020-11-09 09:27:27 James Page nominated for series Ubuntu Groovy
2020-11-09 09:27:27 James Page bug task added ovn (Ubuntu Groovy)
2020-11-09 09:27:27 James Page nominated for series Ubuntu Hirsute
2020-11-09 09:27:27 James Page bug task added ovn (Ubuntu Hirsute)
2020-11-09 09:27:27 James Page nominated for series Ubuntu Focal
2020-11-09 09:27:27 James Page bug task added ovn (Ubuntu Focal)
2020-11-09 09:27:39 James Page ovn (Ubuntu Hirsute): status Triaged Fix Released
2020-11-09 09:27:41 James Page ovn (Ubuntu Groovy): status New Triaged
2020-11-09 09:27:43 James Page ovn (Ubuntu Focal): status New Triaged
2020-11-09 09:27:45 James Page ovn (Ubuntu Focal): importance Undecided High
2020-11-09 09:27:46 James Page ovn (Ubuntu Groovy): importance Undecided High
2020-11-10 06:46:48 Frode Nordahl merge proposal unlinked https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393441
2020-11-10 07:10:45 Frode Nordahl description
Old value:
[Impact]
[Test Case]
[Regression Potential]
[Original Bug Report]
A change [0] prior to the release of OVN v20.03.0 introduced a change of behavior where the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this normally is a unix socket the default was not to have a inactivity probe at all. On a busy system a inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch. The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem is that a OVN controller is either unable to do initial programming of a switch for a host with many ports and flows or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783
New value:
[Impact]
Service/host restart or upgrade of the ovn-host package may render a host participating in a OVN network unusable as the ovn-controller process fails to complete programming of the local Open vSwitch switch flows.
[Test Case]
The issue was discovered when migrating a 3-node OpenStack cloud with 1000 instances deployed in our test lab. A test case could be to repeat that setup.
[Regression Potential]
None, the change of behavior was introduced upstream in [0] and later reversed in [1]. Keeping an idle probe for a unix socket type connection is clearly unnecessary.
[Original Bug Report]
A change [0] prior to the release of OVN v20.03.0 introduced a change of behavior where the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this normally is a unix socket the default was not to have a inactivity probe at all. On a busy system a inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch. The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem is that a OVN controller is either unable to do initial programming of a switch for a host with many ports and flows or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783
2020-11-10 18:02:35 Brian Murray ovn (Ubuntu Groovy): status Triaged Fix Committed
2020-11-10 18:02:37 Brian Murray bug added subscriber Ubuntu Stable Release Updates Team
2020-11-10 18:02:40 Brian Murray bug added subscriber SRU Verification
2020-11-10 18:02:44 Brian Murray tags verification-needed verification-needed-groovy
2020-11-10 18:06:45 Brian Murray ovn (Ubuntu Focal): status Triaged Fix Committed
2020-11-10 18:06:52 Brian Murray tags verification-needed verification-needed-groovy verification-needed verification-needed-focal verification-needed-groovy
2020-11-18 06:55:19 Frode Nordahl tags verification-needed verification-needed-focal verification-needed-groovy verification-done-focal verification-needed verification-needed-groovy
2020-11-19 10:07:38 Łukasz Zemczak removed subscriber Ubuntu Stable Release Updates Team
2020-11-19 10:17:43 Launchpad Janitor ovn (Ubuntu Focal): status Fix Committed Fix Released
2020-11-26 06:56:10 Frode Nordahl tags verification-done-focal verification-needed verification-needed-groovy verification-done verification-done-focal verification-done-groovy
2020-11-30 09:29:27 Launchpad Janitor ovn (Ubuntu Groovy): status Fix Committed Fix Released
2020-12-02 09:33:54 James Page bug task added cloud-archive
2020-12-02 09:34:06 James Page nominated for series cloud-archive/ussuri
2020-12-02 09:34:06 James Page bug task added cloud-archive/ussuri
2020-12-02 09:34:06 James Page nominated for series cloud-archive/victoria
2020-12-02 09:34:06 James Page bug task added cloud-archive/victoria
2020-12-02 09:34:34 James Page cloud-archive/victoria: status New Invalid
2020-12-02 09:36:42 James Page cloud-archive: status Invalid Fix Committed
2020-12-02 09:36:47 James Page cloud-archive/ussuri: status New Fix Committed
2020-12-11 11:53:03 Launchpad Janitor merge proposal linked https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/395221
2021-01-07 13:44:25 Corey Bryant cloud-archive/ussuri: status Fix Committed Fix Released
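
Illustrative sketch (not part of the activity log): the bug description names both a log signature and a manual workaround. The Python below is one possible way to spot that signature in ovn-controller log lines and to build the command for the workaround. The function names, the regular expression, and the 30-second value in the usage note are assumptions made for illustration. Only the log line format and the `external-ids:ovn-openflow-probe-interval` key come from the description; the `ovs-vsctl set Open_vSwitch . key=value` form is the usual way to set such a key and is assumed here rather than taken from the bug. The command is returned as an argument list and not executed.

```python
import re
from typing import Iterable, List, Tuple

# Matches the rconn error quoted in the bug description, e.g.
# 2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/...: no response to
# inactivity probe after 5 seconds, disconnecting
PROBE_TIMEOUT = re.compile(
    r"^(?P<timestamp>\S+)\|(?P<seq>\d+)\|rconn\|ERR\|(?P<target>\S+): "
    r"no response to inactivity probe after (?P<seconds>\d+) seconds, "
    r"disconnecting$"
)


def find_probe_timeouts(lines: Iterable[str]) -> List[Tuple[str, str, int]]:
    """Return (timestamp, connection target, probe seconds) per matching line."""
    hits = []
    for line in lines:
        match = PROBE_TIMEOUT.match(line.strip())
        if match:
            hits.append(
                (match["timestamp"], match["target"], int(match["seconds"]))
            )
    return hits


def workaround_command(probe_interval_seconds: int) -> List[str]:
    """Build (but do not run) the command that sets the probe interval key.

    Assumption: the key is set on the single Open_vSwitch record with
    `ovs-vsctl set`; the interval value is chosen by the operator.
    """
    key = "external-ids:ovn-openflow-probe-interval"
    return [
        "ovs-vsctl",
        "set",
        "Open_vSwitch",
        ".",
        f"{key}={probe_interval_seconds}",
    ]
```

For example, `find_probe_timeouts(open(path))` could be run over a saved log file (the path is up to the reader), and `workaround_command(30)` yields an argument list suitable for `subprocess.run` on a host where raising the interval is preferred over waiting for the packaged fix.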