Galera xinetd only_from listens to load balancers' "ansible_host" rather than container management IP
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| openstack-ansible | Confirmed | Medium | Unassigned | |
Bug Description
OSA 16.0.9 configures the Galera role (in group_vars/) with a whitelist of source addresses that xinetd's only_from directive will accept connections from.
By default, that whitelist includes the "ansible_host" address of the members of the galera_all and haproxy_all groups. Unfortunately, the "ansible_host" address is not necessarily the address from which the load balancer hosts will actually communicate with xinetd, e.g. when the load balancers' container management network interface is not their default interface.
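For illustration, the rendered restriction on the Galera nodes ends up looking roughly like the snippet below. The service name, port and server path are my assumptions about how the monitoring check is wired up; the addresses are taken from my configuration further down.

# Illustrative sketch only, not copied from a deployed node.
service mysqlchk
{
        disable         = no
        socket_type     = stream
        port            = 9200
        protocol        = tcp
        wait            = no
        user            = nobody
        server          = /usr/bin/clustercheck
        log_on_failure  += USERID
        # only_from is built from the "ansible_host" of every galera_all and
        # haproxy_all member, i.e. the hosts' SSH addresses on 10.120.240.0/24:
        only_from       = 10.120.240.5 10.120.240.6 10.120.240.11 10.120.240.12 127.0.0.1
        per_source      = UNLIMITED
}

My HAProxy hosts, however, reach the Galera containers over br-mgmt, i.e. from 10.120.241.0/24, so their health checks are refused.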
It seems to me that, instead of the "ansible_host" address, OSA should use each host's address on its container management network interface (br-mgmt, by default).
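Something along these lines is what I have in mind. The variable name and the Jinja expression are my guesses at how the whitelist is assembled today; "container_address" is the management-network address the OSA inventory already records for each host, which I would use in place of "ansible_host":

# Sketch only: the variable name and loop are assumptions on my part.
galera_monitoring_allowed_source: >-
  {% for host in groups['galera_all'] + groups['haproxy_all'] %}{{ hostvars[host]['container_address'] }} {% endfor %}127.0.0.1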
The current workaround is to include the entirety of the container management network (as the exact host numbering is unknown until all of the containers have been created and started) in the galera_
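Concretely, that means an override like the following in /etc/openstack_deploy/user_variables.yml (the variable name is again my reconstruction of the truncated name above; xinetd's only_from accepts CIDR notation, so the whole management network can be listed):

# Workaround sketch: allow the entire container management network plus localhost.
galera_monitoring_allowed_source: "10.120.241.0/24 127.0.0.1"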
For illustration, below please find a slightly redacted copy of my openstack_
---
cidr_networks:
  container: 10.120.241.0/24
  tunnel: 10.120.242.0/24
  storage: 10.120.120.0/23
used_ips:
  - "10.120.
  - "10.120.
  - "10.120.
  - "10.120.
  - "10.120.
  - "10.120.
global_overrides:
  internal_
  external_
  management_
  tunnel_bridge: "br-vxlan"
  provider_
    - network:
        ip_from_q: "container"
        type: "raw"
          - all_containers
          - hosts
    - network:
        ip_from_q: "tunnel"
        type: "vxlan"
        range: "1:1000"
        net_name: "vxlan"
          - neutron_
    - network:
        type: "flat"
        net_name: "flat"
          - neutron_
    - network:
        type: "vlan"
        range: "1:1"
        net_name: "vlan"
          - neutron_
    - network:
        ip_from_q: "storage"
        type: "raw"
          - glance_api
          - cinder_api
          - cinder_volume
          - nova_compute
          - swift_proxy
shared-infra_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
repo-infra_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
haproxy_hosts:
  balancer0:
    ip: 10.120.240.11
  balancer1:
    ip: 10.120.240.12
log_hosts:
  logging0:
    ip: 10.120.240.9
  logging1:
    ip: 10.120.240.10
identity_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
storage-
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
image_hosts:
  storage0:
    ip: 10.120.240.7
  storage1:
    ip: 10.120.240.8
compute-
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
orchestration_
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
dashboard_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
network_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
compute_hosts:
  aio:
    ip: 10.120.240.4
storage_hosts:
  storage0:
    ip: 10.120.240.7
    container_vars:
      cinder_
        netapp:
          shares:
            - ip: netapp-
  storage1:
    ip: 10.120.240.8
    container_vars:
      cinder_
        netapp:
          shares:
            - ip: openstack-
swift-proxy_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
magnum-infra_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
orchestration_
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6
swift_hosts:
  storage0:
    ip: 10.120.240.7
    container_vars:
      swift_vars:
        zone: 1
  storage1:
    ip: 10.120.240.8
    container_vars:
      swift_vars:
        zone: 2
You are defining your br-mgmt network as the SSH network and define 10.120.241.0/24 as the management network, yet you're using 10.120.240.11, 10.120.240.12 and 10.120.240.5, 10.120.240.6, which are outside that range.
Even if we moved the validation towards using cidr_networks instead of those IPs, we would still not have enough information to cover your (generic) case:
I guess if I dug a little deeper, I'd find out that your haproxy nodes have at least two interfaces, one of them being br-mgmt on 10.120.241.0/24, and can therefore communicate with the galera nodes because they are on the same network. But none of this is stated explicitly in your configuration file, which is the only source we can base ourselves on.
Summary (for me): if we use cidr_networks, we open the door too wide; if we make an assumption, we can still go wrong. I don't think there is a perfect way here. Any suggestion is welcome.