ovn-nbctl times out after 10s with 4+ machines
This bug report was marked for expiration 376 days ago.
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
microovn | Incomplete | Undecided | Unassigned |
Bug Description
When setting up MicroOVN with an LXD cluster (both installed from the `stable` snap channel), LXD attempts to create the `ovn` network by calling
```
ovn-nbctl --timeout=10 --db ssl:10.
```
with `ovn.env` from MicroOVN:
```
OVN_INITIAL_
OVN_INITIAL_
OVN_NB_
OVN_SB_
OVN_LOCAL_
```
When there are 4 or more cluster members, this command occasionally times out after 10s, which implies that the network is unreachable:
```
Error: Failed to run: ovn-nbctl --timeout=10 --db ssl:10.
```
This might be because the `CONNECT` strings in `ovn.env` list only 3 systems and the command is occasionally run on the system that is excluded, but I'm not sure.
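One way to test this hypothesis is to check, on each cluster member, whether its own address appears in the connect string. The following is a minimal Python sketch, not part of MicroOVN or LXD. It assumes the connect string is a comma-separated list of `proto:host:port` remotes, and every address and port in it is a hypothetical placeholder rather than a value from this report.

```python
# Sketch: parse an OVN-style connect string and check whether a host is listed.
# All addresses and ports below are hypothetical placeholders.

def parse_connect_string(connect):
    """Split 'ssl:10.0.0.1:6641,ssl:10.0.0.2:6641' into (proto, host, port) tuples."""
    endpoints = []
    for remote in connect.split(","):
        remote = remote.strip()
        if not remote:
            continue
        # The protocol is everything before the first colon,
        # the port is everything after the last colon.
        proto, rest = remote.split(":", 1)
        host, port = rest.rsplit(":", 1)
        # IPv6 hosts are written in brackets, e.g. ssl:[fd00::1]:6641.
        endpoints.append((proto, host.strip("[]"), int(port)))
    return endpoints


def is_listed(local_ip, connect):
    """Return True if local_ip is one of the hosts in the connect string."""
    return any(host == local_ip for _, host, _ in parse_connect_string(connect))


# Hypothetical connect string with three central nodes.
nb_connect = "ssl:10.0.0.1:6641,ssl:10.0.0.2:6641,ssl:10.0.0.3:6641"

# A hypothetical fourth member is not among the listed hosts.
print(is_listed("10.0.0.4", nb_connect))
```

If the timeouts only occur on members for which `is_listed` returns `False`, that would support the guess above. If they also occur on listed members, the cause is probably elsewhere.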
tags: added: lxd
tags: added: microcloud
The OVN central services are intentionally run on only 3 of the nodes because the clustered DBs use the Raft algorithm for consensus. Consequently, the NB/SB connect string will always contain only 3 IP addresses.
Any participating node without OVN central services will connect to all of the addresses in the connect string and, depending on client settings, settle on the first one it hits or hunt for the leader.
Is there something in the deployment/environment preventing the client from connecting to the nodes with OVN DB servers?
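A quick way to answer that from an affected node is to probe each DB endpoint over plain TCP. The sketch below is a rough Python illustration, not an OVN tool. The function names `probe` and `probe_all` are made up for this example, and the host/port pairs must be taken from the connect string in your own `ovn.env`. A successful TCP connect only shows that the port is reachable. It does not validate the TLS certificates or the Raft state of the server.

```python
import socket


def probe(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False


def probe_all(endpoints, timeout=3.0):
    """Probe every (host, port) pair and return a dict of results."""
    return {(host, port): probe(host, port, timeout) for host, port in endpoints}
```

For example, `probe_all([("10.0.0.1", 6641), ("10.0.0.2", 6641), ("10.0.0.3", 6641)])` (hypothetical addresses and port) returns one boolean per endpoint. Any `False` entry on the node where `ovn-nbctl` times out would point at a firewall or routing problem rather than at OVN itself.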