Galera xinetd only_from listens to load balancers' "ansible_host" rather than container management IP

Bug #1755609 reported by David J. Haines
This bug affects 2 people
Affects: openstack-ansible
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

OSA 16.0.9 configures the Galera role (via group_vars/galera_all.yml) with a whitelist for the monitoring service on port 9200; that whitelist ends up in an xinetd service definition on the Galera containers at /etc/xinetd.d/mysqlchk.

By default, that whitelist includes the "ansible_host" address of every member of the galera_all and haproxy_all groups. Unfortunately, "ansible_host" is not necessarily the address from which the load balancers will actually connect to xinetd, for example when the load balancers' container management network interface is not their default interface.
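
For a concrete picture, the file generated on a Galera container looks roughly like the sketch below; only the only_from line matters for this bug. The exact field set may differ between releases, and the container addresses (10.120.241.117, 10.120.241.145) are purely illustrative. With the configuration further down, only_from carries the load balancers' "ansible_host" addresses (10.120.240.11/12) rather than the br-mgmt addresses on 10.120.241.0/24 that their health checks actually originate from.

# Rough sketch of the rendered /etc/xinetd.d/mysqlchk (approximate fields,
# illustrative addresses).
service mysqlchk
{
        disable         = no
        socket_type     = stream
        port            = 9200
        wait            = no
        user            = nobody
        server          = /usr/local/bin/clustercheck
        per_source      = UNLIMITED
        only_from       = 10.120.241.117 10.120.241.145 10.120.240.11 10.120.240.12 127.0.0.1
}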

It seems to me that instead of the "ansible_host" address, OSA should be using each host's address on its container management network interface (br-mgmt, by default).
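
A rough paraphrase may make the suggestion concrete. This is a sketch, not the literal group_vars content, and "container_address" is a placeholder for whatever per-host variable carries the br-mgmt address in a given release:

# Paraphrased sketch only -- not the upstream group_vars/galera_all.yml.
# Today the whitelist is built from each member's "ansible_host"; the
# suggestion is to prefer the container management (br-mgmt) address,
# falling back to "ansible_host" where it is not defined
# ("container_address" is a hypothetical variable name here).
galera_monitoring_allowed_source: "{% for node in groups['galera_all'] + groups['haproxy_all'] %}{{ hostvars[node]['container_address'] | default(hostvars[node]['ansible_host']) }} {% endfor %}127.0.0.1"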

The current workaround is to whitelist the entire container management network (since the exact host numbering is unknown until all the containers have been created and started) via the galera_monitoring_allowed_source variable in an /etc/openstack_deploy/user_*.yml file, as sketched below. Nevertheless, since OSA has all the information it needs to calculate the correct addresses to whitelist, I believe this is a bug.
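
A minimal sketch of that workaround, using the br-mgmt CIDR from the configuration below; whether the value should be a single space-separated string or a list may vary by release, so treat the exact format as an assumption:

# /etc/openstack_deploy/user_variables.yml (or any user_*.yml)
# Whitelists the whole br-mgmt network plus localhost instead of
# individual host addresses.
galera_monitoring_allowed_source: "10.120.241.0/24 127.0.0.1"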

For illustration, below is a slightly redacted copy of my openstack_user_config.yml:

---
cidr_networks:
  container: 10.120.241.0/24
  tunnel: 10.120.242.0/24
  storage: 10.120.120.0/23

used_ips:
  - "10.120.243.1,10.120.243.16"
  - "10.120.120.1,10.120.121.12"
  - "10.120.6.1,10.120.6.140"
  - "10.120.240.1,10.120.240.16"
  - "10.120.241.1,10.120.241.16"
  - "10.120.242.1,10.120.242.16"

global_overrides:
  internal_lb_vip_address: openstack-internal.lab.example.com
  external_lb_vip_address: openstack.lab.example.com
  management_bridge: "br-mgmt"
  tunnel_bridge: "br-vxlan"
  provider_networks:
    - network:
        container_bridge: "br-mgmt"
        container_type: "veth"
        container_interface: "eth1"
        ip_from_q: "container"
        type: "raw"
        group_binds:
          - all_containers
          - hosts
        is_container_address: true
        is_ssh_address: true
    - network:
        container_bridge: "br-vxlan"
        container_type: "veth"
        container_interface: "eth10"
        ip_from_q: "tunnel"
        type: "vxlan"
        range: "1:1000"
        net_name: "vxlan"
        group_binds:
          - neutron_linuxbridge_agent
    - network:
        container_bridge: "br-vlan"
        container_type: "veth"
        container_interface: "eth12"
        host_bind_override: "eth12"
        type: "flat"
        net_name: "flat"
        group_binds:
          - neutron_linuxbridge_agent
    - network:
        container_bridge: "br-vlan"
        container_type: "veth"
        container_interface: "eth11"
        type: "vlan"
        range: "1:1"
        net_name: "vlan"
        group_binds:
          - neutron_linuxbridge_agent
    - network:
        container_bridge: "br-storage"
        container_type: "veth"
        container_interface: "eth2"
        ip_from_q: "storage"
        type: "raw"
        group_binds:
          - glance_api
          - cinder_api
          - cinder_volume
          - nova_compute
          - swift_proxy

shared-infra_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

repo-infra_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

haproxy_hosts:
  balancer0:
    ip: 10.120.240.11
  balancer1:
    ip: 10.120.240.12

log_hosts:
  logging0:
    ip: 10.120.240.9
  logging1:
    ip: 10.120.240.10

identity_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

storage-infra_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

image_hosts:
  storage0:
    ip: 10.120.240.7
  storage1:
    ip: 10.120.240.8

compute-infra_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

orchestration_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

dashboard_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

network_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

compute_hosts:
  aio:
    ip: 10.120.240.4

storage_hosts:
  storage0:
    ip: 10.120.240.7
    container_vars:
      cinder_backends:
        limit_container_types: cinder_volume
        netapp:
          netapp_storage_family: ontap_cluster
          netapp_storage_protocol: nfs
          nfs_shares_config: /etc/cinder/nfs_shares
          netapp_server_hostname: netapp-mgmt.example.com
          netapp_server_port: 443
          netapp_transport_type: https
          max_over_subscription_ratio: 2.0
          reserved_percentage: 5
          nfs_mount_options: lookupcache=pos
          netapp_vserver: openstack
          netapp_login: cinder_api
          netapp_password: p4ssw0rd
          volume_driver: cinder.volume.drivers.netapp.common.NetAppDriver
          volume_backend_name: netapp
          shares:
            - ip: netapp-nfs.example.com
              share: /cinderVol0
  storage1:
    ip: 10.120.240.8
    container_vars:
      cinder_backends:
        limit_container_types: cinder_volume
        netapp:
          netapp_storage_family: ontap_cluster
          netapp_storage_protocol: nfs
          nfs_shares_config: /etc/cinder/nfs_shares
          netapp_server_hostname: netapp-mgmt.example.com
          netapp_server_port: 443
          netapp_transport_type: https
          max_over_subscription_ratio: 2.0
          reserved_percentage: 5
          nfs_mount_options: lookupcache=pos
          netapp_vserver: openstack
          netapp_login: cinder_api
          netapp_password: p4ssw0rd
          volume_driver: cinder.volume.drivers.netapp.common.NetAppDriver
          volume_backend_name: netapp
          shares:
            - ip: openstack-nfs.example.com
              share: /cinderVol0

swift-proxy_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

magnum-infra_hosts:
  infra0:
    ip: 10.120.240.5
  infra1:
    ip: 10.120.240.6

swift_hosts:
  storage0:
    ip: 10.120.240.7
    container_vars:
      swift_vars:
        zone: 1
  storage1:
    ip: 10.120.240.8
    container_vars:
      swift_vars:
        zone: 2

Jean-Philippe Evrard (jean-philippe-evrard) wrote:

You are defining your br-mgmt network as the SSH network and 10.120.241.0/24 as the management network, yet you're using 10.120.240.11, 10.120.240.12, 10.120.240.5, and 10.120.240.6, which are outside that range.

Even if we moved the validation towards using cidr_networks instead of those IPs, we would still not have enough information to cover your (generic) case:
I guess that if I dug a little deeper, I'd find that your haproxy nodes have at least two interfaces, one of them being br-mgmt on 10.120.241.0/24, and can therefore communicate with the galera nodes because they are on the same network. But nothing in your configuration file says so explicitly, and that file is the only source we can base ourselves on.

Summary (for me): if we use cidr_networks, we open the door too wide; if we rely on an assumption, we can still get it wrong. I don't think there is a perfect answer here. Any suggestion is welcome.

Jean-Philippe Evrard (jean-philippe-evrard) wrote:

Please see today's bug triage meeting conversation on ideas for fixing this.

Changed in openstack-ansible:
status: New → Confirmed
importance: Undecided → Medium