Two of the three ovn-controller agents on octavia units are registered with host=$fqdn, and the down controller is registered with a shortname.
`hostname -f` shows the full fqdn on the down unit
/etc/openvswitch/system-id.conf lists the short hostname only
`ovs-vsctl list open_vswitch` lists both the hostname and the system-id as shortname
seeing a lot of errors in /var/log/ovn/ovn-controller.log along the lines of:
2020-09-22T14:22:39.500Z|04678|binding|INFO|Changing chassis for lport 529233fc-f9c4-40b1-8c6a-f2e906a2498d from juju-a9d6f4-21-lxd-9.maas to juju-a9d6f4-21-lxd-9.
2020-09-22T06:25:01.829Z|857112|main|INFO|OVNSB commit failed, force recompute next time.
restart of ovn-controller shows the following in the log:
2020-09-22T14:22:30.498Z|00001|vlog|INFO|opened log file /var/log/ovn/ovn-controller.log
2020-09-22T14:22:30.500Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2020-09-22T14:22:30.500Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2020-09-22T14:22:30.502Z|00004|main|INFO|OVS IDL reconnected, force recompute.
2020-09-22T14:22:30.504Z|00005|reconnect|INFO|ssl:10.35.61.157:6642: connecting...
2020-09-22T14:22:30.504Z|00006|main|INFO|OVNSB IDL reconnected, force recompute.
2020-09-22T14:22:30.508Z|00007|reconnect|INFO|ssl:10.35.61.157:6642: connected
2020-09-22T14:22:30.514Z|00008|ofctrl|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting to switch
2020-09-22T14:22:30.514Z|00009|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
2020-09-22T14:22:30.514Z|00010|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
2020-09-22T14:22:30.515Z|00011|ovsdb_idl|WARN|transaction error: {"details":"RBAC rules for client \"juju-a9d6f4-21-lxd-9\" role \"ovn-controller\" prohibit modification of table \"Chassis\".","error":"permission error"}
2020-09-22T14:22:30.515Z|00012|main|INFO|OVNSB commit failed, force recompute next time.
2020-09-22T14:22:30.515Z|00001|pinctrl(ovn_pinctrl0)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting to switch
2020-09-22T14:22:30.515Z|00002|rconn(ovn_pinctrl0)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
2020-09-22T14:22:30.516Z|00013|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"10.35.82.18\") for index on columns \"type\" and \"ip\". First row, with UUID 86556077-6325-4cb6-9bbd-c5979ae15d2c, was inserted by this transaction. Second row, with UUID 3345a08e-534b-4ccf-a7b6-2d6d00706422, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}
2020-09-22T14:22:30.516Z|00014|main|INFO|OVNSB commit failed, force recompute next time.
2020-09-22T14:22:30.516Z|00015|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"10.35.82.18\") for index on columns \"type\" and \"ip\". First row, with UUID 3345a08e-534b-4ccf-a7b6-2d6d00706422, existed in the database before this transaction and was not modified by the transaction. Second row, with UUID 916635aa-e98c-4f23-8ac8-1e3f381151c6, was inserted by this transaction.","error":"constraint violation"}
2020-09-22T14:22:30.516Z|00016|main|INFO|OVNSB commit failed, force recompute next time.
2020-09-22T14:22:30.516Z|00017|binding|INFO|Changing chassis for lport 529233fc-f9c4-40b1-8c6a-f2e906a2498d from juju-a9d6f4-21-lxd-9.maas to juju-a9d6f4-21-lxd-9.
2020-09-22T14:22:30.516Z|00018|binding|INFO|529233fc-f9c4-40b1-8c6a-f2e906a2498d: Claiming fa:16:3e:e4:70:66 fc00:2d33:a2bc:84d4:f816:3eff:fee4:7066
2020-09-22T14:22:30.517Z|00019|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"10.35.82.18\") for index on columns \"type\" and \"ip\". First row, with UUID 3345a08e-534b-4ccf-a7b6-2d6d00706422, existed in the database before this transaction and was not modified by the transaction. Second row, with UUID 6219b9c9-fc57-4caa-8f75-46ead7584901, was inserted by this transaction.","error":"constraint violation"}
2020-09-22T14:22:30.517Z|00020|main|INFO|OVNSB commit failed, force recompute next time.
2020-09-22T14:22:30.518Z|00021|binding|INFO|Changing chassis for lport 529233fc-f9c4-40b1-8c6a-f2e906a2498d from juju-a9d6f4-21-lxd-9.maas to juju-a9d6f4-21-lxd-9.
2020-09-22T14:22:30.518Z|00022|binding|INFO|529233fc-f9c4-40b1-8c6a-f2e906a2498d: Claiming fa:16:3e:e4:70:66 fc00:2d33:a2bc:84d4:f816:3eff:fee4:7066
2020-09-22T14:22:30.521Z|00023|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"10.35.82.18\") for index on columns \"type\" and \"ip\". First row, with UUID 3345a08e-534b-4ccf-a7b6-2d6d00706422, existed in the database before this transaction and was not modified by the transaction. Second row, with UUID 5f2ca07b-859f-4013-9e49-5fd00a1909e9, was inserted by this transaction.","error":"constraint violation"}
2020-09-22T14:22:30.521Z|00024|main|INFO|OVNSB commit failed, force recompute next time.
2020-09-22T14:22:30.521Z|00003|rconn(ovn_pinctrl0)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
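To make the pattern in the log above easier to see, here is a minimal sketch that tallies the transaction errors by type and collects the "Changing chassis" flips. It assumes the `timestamp|seq|module|LEVEL|message` layout seen in these lines; the function name `summarize` and the regexes are illustrative only, not part of OVN or the charm.

```python
import json
import re
from collections import Counter

# Assumed layout, taken from the lines above: timestamp|seq|module|LEVEL|message
LINE_RE = re.compile(r'^([^|]+)\|(\d+)\|([^|]+)\|([A-Z]+)\|(.*)$')
CHASSIS_RE = re.compile(
    r'Changing chassis for lport (\S+) from (\S+) to (\S+?)\.?$')
TXN_PREFIX = 'transaction error: '


def summarize(lines):
    """Return (errors, chassis_changes) counters for ovn-controller log lines.

    errors is keyed by the "error" field of each transaction error;
    chassis_changes is keyed by (lport, old_chassis, new_chassis).
    """
    errors = Counter()
    chassis_changes = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not in the expected layout, skip it
        message = match.group(5)
        if message.startswith(TXN_PREFIX):
            try:
                detail = json.loads(message[len(TXN_PREFIX):])
                errors[detail.get('error', 'unknown')] += 1
            except (ValueError, AttributeError):
                errors['unparseable'] += 1
            continue
        change = CHASSIS_RE.search(message)
        if change:
            chassis_changes[change.groups()] += 1
    return errors, chassis_changes


# Two lines copied from the restart log above.
SAMPLE = [
    r'2020-09-22T14:22:30.515Z|00011|ovsdb_idl|WARN|transaction error: '
    r'{"details":"RBAC rules for client \"juju-a9d6f4-21-lxd-9\" role '
    r'\"ovn-controller\" prohibit modification of table \"Chassis\".",'
    r'"error":"permission error"}',
    '2020-09-22T14:22:30.516Z|00017|binding|INFO|Changing chassis for lport '
    '529233fc-f9c4-40b1-8c6a-f2e906a2498d from juju-a9d6f4-21-lxd-9.maas '
    'to juju-a9d6f4-21-lxd-9.',
]

errors, chassis_changes = summarize(SAMPLE)
for kind, count in errors.items():
    print(kind, count)
for (lport, old, new), count in chassis_changes.items():
    print(lport, old, '->', new, count)
```

Feeding it the lines of /var/log/ovn/ovn-controller.log (for example via `open(path)`) should show whether the same lport keeps flipping between the fqdn chassis and the shortname chassis.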
Relation info being provided from octavia-ovn-chassis to octavia on that unit shows chassis-name as the short hostname, but on other octavia units, the chassis-name provided from ovn-chassis to octavia is the fqdn.
It appears from a brief read-through of the ovn-chassis charm that the hostname is queried from the ovsdb and then system-id is set from that hostname. Is it possible that there's a race between the system being able to resolve its fqdn from DNS during deployment and the hostname ovs records when it initializes the database on install?
Some potentially relevant code snippets:
    # The local ``ovn-controller`` process will retrieve information about
    # how to connect to OVN from the local Open vSwitch database.
    self.run('ovs-vsctl',
             'set', 'open', '.',
             'external-ids:ovn-encap-type=geneve', '--',
             'set', 'open', '.',
             'external-ids:ovn-encap-ip={}'
             .format(self.get_data_ip()), '--',
             'set', 'open', '.',
             'external-ids:system-id={}'
             .format(self.get_ovs_hostname()))

*snip*

    def get_ovs_hostname():
        for row in ch_ovsdb.SimpleOVSDB('ovs-vsctl').open_vswitch:
            return row['external_ids']['hostname']
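As a way to check the suspected mismatch on a unit, the following is a rough diagnostic sketch, not the charm's code and not a proposed fix. It compares the fqdn with the values stored in /etc/openvswitch/system-id.conf and in the `external_ids` of the local Open_vSwitch table. The helper names are made up for illustration, and `socket.getfqdn()` is only an approximation of `hostname -f`.

```python
import socket
import subprocess

SYSTEM_ID_CONF = '/etc/openvswitch/system-id.conf'


def ovs_external_id(key):
    """Read one external_ids key from the local Open_vSwitch table."""
    out = subprocess.check_output(
        ['ovs-vsctl', 'get', 'Open_vSwitch', '.', 'external_ids:' + key],
        universal_newlines=True)
    # ovs-vsctl may print string values wrapped in double quotes.
    return out.strip().strip('"')


def collect_observed():
    """Gather the identity values that are expected to agree on a unit."""
    with open(SYSTEM_ID_CONF) as conf:
        system_id_conf = conf.read().strip()
    return {
        'system-id.conf': system_id_conf,
        'external_ids:hostname': ovs_external_id('hostname'),
        'external_ids:system-id': ovs_external_id('system-id'),
    }


def find_mismatches(fqdn, observed):
    """Return the entries of observed whose value differs from fqdn."""
    return {source: value for source, value in observed.items()
            if value != fqdn}


def is_short_form(value, fqdn):
    """True if value is the first label of fqdn rather than the full name."""
    return value != fqdn and value == fqdn.split('.')[0]


def report():
    """Print every source that disagrees with the fqdn of this machine."""
    fqdn = socket.getfqdn()  # approximation of `hostname -f`
    for source, value in find_mismatches(fqdn, collect_observed()).items():
        kind = 'shortname' if is_short_form(value, fqdn) else 'different'
        print('{}: {} ({}), fqdn is {}'.format(source, value, kind, fqdn))
```

Calling `report()` on each octavia unit (with permission to run `ovs-vsctl`) would be expected to print nothing on the healthy units and to flag the shortname entries on the affected one.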
On juju 2.7.8 with the latest charms (20.08), I have a dead ovn-controller agent on one of the octavia units.
$ openstack network agent list|grep lxd
| juju-a9d6f4-21-lxd-9.maas  | OVN Controller agent | juju-a9d6f4-21-lxd-9       | | XXX | UP | ovn-controller |
| juju-a9d6f4-25-lxd-10.maas | OVN Controller agent | juju-a9d6f4-25-lxd-10.maas | | :-) | UP | ovn-controller |
| juju-a9d6f4-23-lxd-10.maas | OVN Controller agent | juju-a9d6f4-23-lxd-10.maas | | :-) | UP | ovn-controller |
Relation data from the down unit (octavia/0) and from a working unit (octavia/1):

$ sudo juju-run octavia/0 -r 139 --remote-unit octavia-ovn-chassis/1 'relation-get'
chassis-name: '"juju-a9d6f4-21-lxd-9"'
egress-subnets: 10.35.61.179/32
ingress-address: 10.35.61.179
ovn-configured: "true"
private-address: 10.35.61.179

$ sudo juju-run octavia/1 -r 139 --remote-unit octavia-ovn-chassis/2 'relation-get'
chassis-name: '"juju-a9d6f4-23-lxd-10.maas"'
egress-subnets: 10.35.61.191/32
ingress-address: 10.35.61.191
ovn-configured: "true"
private-address: 10.35.61.191