2020-10-11 19:06:39 |
Frode Nordahl |
bug |
|
|
added bug |
2020-10-11 19:07:56 |
Frode Nordahl |
description |
A change [0] made prior to the release of OVN v20.03.0 altered behavior so that the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this connection is normally a unix socket, the previous default was to have no inactivity probe at all.
On a busy system, an inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch.
The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783 |
A change [0] made prior to the release of OVN v20.03.0 altered behavior so that the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this connection is normally a unix socket, the previous default was to have no inactivity probe at all.
On a busy system, an inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch.
The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem are that an OVN controller is either unable to do the initial programming of a switch for a host with many ports and flows, or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783 |
|
2020-10-12 07:18:12 |
Frode Nordahl |
ovn (Ubuntu): status |
New |
Triaged |
|
2020-10-12 07:18:15 |
Frode Nordahl |
ovn (Ubuntu): importance |
Undecided |
High |
|
2020-11-06 08:34:49 |
Launchpad Janitor |
merge proposal linked |
|
https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393432 |
|
2020-11-06 08:45:09 |
Launchpad Janitor |
merge proposal linked |
|
https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393433 |
|
2020-11-06 08:54:36 |
Launchpad Janitor |
merge proposal linked |
|
https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393435 |
|
2020-11-06 11:17:15 |
Launchpad Janitor |
merge proposal linked |
|
https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393441 |
|
2020-11-06 11:23:01 |
Launchpad Janitor |
merge proposal linked |
|
https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393442 |
|
2020-11-06 11:30:42 |
Launchpad Janitor |
merge proposal linked |
|
https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393443 |
|
2020-11-09 09:27:20 |
James Page |
description |
A change [0] made prior to the release of OVN v20.03.0 altered behavior so that the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this connection is normally a unix socket, the previous default was to have no inactivity probe at all.
On a busy system, an inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch.
The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem are that an OVN controller is either unable to do the initial programming of a switch for a host with many ports and flows, or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783 |
[Impact]
[Test Case]
[Regression Potential]
[Original Bug Report]
A change [0] made prior to the release of OVN v20.03.0 altered behavior so that the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this connection is normally a unix socket, the previous default was to have no inactivity probe at all.
On a busy system, an inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch.
The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem are that an OVN controller is either unable to do the initial programming of a switch for a host with many ports and flows, or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783 |
|
2020-11-09 09:27:27 |
James Page |
nominated for series |
|
Ubuntu Groovy |
|
2020-11-09 09:27:27 |
James Page |
bug task added |
|
ovn (Ubuntu Groovy) |
|
2020-11-09 09:27:27 |
James Page |
nominated for series |
|
Ubuntu Hirsute |
|
2020-11-09 09:27:27 |
James Page |
bug task added |
|
ovn (Ubuntu Hirsute) |
|
2020-11-09 09:27:27 |
James Page |
nominated for series |
|
Ubuntu Focal |
|
2020-11-09 09:27:27 |
James Page |
bug task added |
|
ovn (Ubuntu Focal) |
|
2020-11-09 09:27:39 |
James Page |
ovn (Ubuntu Hirsute): status |
Triaged |
Fix Released |
|
2020-11-09 09:27:41 |
James Page |
ovn (Ubuntu Groovy): status |
New |
Triaged |
|
2020-11-09 09:27:43 |
James Page |
ovn (Ubuntu Focal): status |
New |
Triaged |
|
2020-11-09 09:27:45 |
James Page |
ovn (Ubuntu Focal): importance |
Undecided |
High |
|
2020-11-09 09:27:46 |
James Page |
ovn (Ubuntu Groovy): importance |
Undecided |
High |
|
2020-11-10 06:46:48 |
Frode Nordahl |
merge proposal unlinked |
https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/393441 |
|
|
2020-11-10 07:10:45 |
Frode Nordahl |
description |
[Impact]
[Test Case]
[Regression Potential]
[Original Bug Report]
A change [0] made prior to the release of OVN v20.03.0 altered behavior so that the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this connection is normally a unix socket, the previous default was to have no inactivity probe at all.
On a busy system, an inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch.
The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem are that an OVN controller is either unable to do the initial programming of a switch for a host with many ports and flows, or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783 |
[Impact]
A service or host restart, or an upgrade of the ovn-host package, may render a host participating in an OVN network unusable because the ovn-controller process fails to complete programming of the local Open vSwitch switch flows.
[Test Case]
The issue was discovered when migrating a 3-node OpenStack cloud with 1000 instances deployed in our test lab. A test case could be to repeat that setup.
[Regression Potential]
None, the change of behavior was introduced upstream in [0] and later reversed in [1]. Keeping an idle probe for a unix socket type connection is clearly unnecessary.
[Original Bug Report]
A change [0] made prior to the release of OVN v20.03.0 altered behavior so that the inactivity probe for the ofctrl connection defaults to 5 seconds. Since this connection is normally a unix socket, the previous default was to have no inactivity probe at all.
On a busy system, an inactivity probe of 5 seconds is not enough for the OVN Controller to complete programming of the switch.
The change of behavior was corrected in [1] and I think it would be beneficial if Ubuntu backported this fix to the OVN package rather than having charms and/or end users work around the issue by manually configuring the timeout through the `external-ids:ovn-openflow-probe-interval` key in the Open_vSwitch table.
Symptoms of this problem are that an OVN controller is either unable to do the initial programming of a switch for a host with many ports and flows, or that updates are lost on a functional system. The following will be printed in the log:
2020-10-11T18:56:09.355Z|30186|rconn|ERR|unix:/var/run/openvswitch/br-int.mgmt: no response to inactivity probe after 5 seconds, disconnecting
0: https://github.com/ovn-org/ovn/commit/c99069c8934c9ea55d310a8b6d48fb66aa477589
1: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783 |
|
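To check whether a host is hitting the symptom quoted in the description, one option is to scan the ovn-controller log for the rconn error. The following is a minimal sketch, not part of the original report: the field layout (timestamp|sequence|module|level|message) is inferred from the single sample line in the description, and the names `PROBE_ERR` and `find_probe_timeouts` are hypothetical.

```python
import re

# Pattern modelled on the one sample line in the bug description.
# Assumption: fields are separated by "|" and the message wording is stable.
PROBE_ERR = re.compile(
    r"^(?P<ts>[^|]+)\|(?P<seq>\d+)\|rconn\|ERR\|(?P<target>\S+): "
    r"no response to inactivity probe after (?P<seconds>\d+(?:\.\d+)?) seconds, "
    r"disconnecting$"
)


def find_probe_timeouts(lines):
    """Return (timestamp, target, seconds) for each inactivity-probe disconnect."""
    hits = []
    for line in lines:
        match = PROBE_ERR.match(line.strip())
        if match:
            hits.append(
                (match.group("ts"), match.group("target"), float(match.group("seconds")))
            )
    return hits
```

Passing the lines of a log file to `find_probe_timeouts` yields one tuple per disconnect; repeated hits against the `br-int.mgmt` unix socket would be consistent with the behavior described in this bug.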
2020-11-10 18:02:35 |
Brian Murray |
ovn (Ubuntu Groovy): status |
Triaged |
Fix Committed |
|
2020-11-10 18:02:37 |
Brian Murray |
bug |
|
|
added subscriber Ubuntu Stable Release Updates Team |
2020-11-10 18:02:40 |
Brian Murray |
bug |
|
|
added subscriber SRU Verification |
2020-11-10 18:02:44 |
Brian Murray |
tags |
|
verification-needed verification-needed-groovy |
|
2020-11-10 18:06:45 |
Brian Murray |
ovn (Ubuntu Focal): status |
Triaged |
Fix Committed |
|
2020-11-10 18:06:52 |
Brian Murray |
tags |
verification-needed verification-needed-groovy |
verification-needed verification-needed-focal verification-needed-groovy |
|
2020-11-18 06:55:19 |
Frode Nordahl |
tags |
verification-needed verification-needed-focal verification-needed-groovy |
verification-done-focal verification-needed verification-needed-groovy |
|
2020-11-19 10:07:38 |
Łukasz Zemczak |
removed subscriber Ubuntu Stable Release Updates Team |
|
|
|
2020-11-19 10:17:43 |
Launchpad Janitor |
ovn (Ubuntu Focal): status |
Fix Committed |
Fix Released |
|
2020-11-26 06:56:10 |
Frode Nordahl |
tags |
verification-done-focal verification-needed verification-needed-groovy |
verification-done verification-done-focal verification-done-groovy |
|
2020-11-30 09:29:27 |
Launchpad Janitor |
ovn (Ubuntu Groovy): status |
Fix Committed |
Fix Released |
|
2020-12-02 09:33:54 |
James Page |
bug task added |
|
cloud-archive |
|
2020-12-02 09:34:06 |
James Page |
nominated for series |
|
cloud-archive/ussuri |
|
2020-12-02 09:34:06 |
James Page |
bug task added |
|
cloud-archive/ussuri |
|
2020-12-02 09:34:06 |
James Page |
nominated for series |
|
cloud-archive/victoria |
|
2020-12-02 09:34:06 |
James Page |
bug task added |
|
cloud-archive/victoria |
|
2020-12-02 09:34:34 |
James Page |
cloud-archive/victoria: status |
New |
Invalid |
|
2020-12-02 09:36:42 |
James Page |
cloud-archive: status |
Invalid |
Fix Committed |
|
2020-12-02 09:36:47 |
James Page |
cloud-archive/ussuri: status |
New |
Fix Committed |
|
2020-12-11 11:53:03 |
Launchpad Janitor |
merge proposal linked |
|
https://code.launchpad.net/~fnordahl/ubuntu/+source/ovn/+git/ovn/+merge/395221 |
|
2021-01-07 13:44:25 |
Corey Bryant |
cloud-archive/ussuri: status |
Fix Committed |
Fix Released |
|