Autopilot: Nagios uses the wrong subnet IP to reach one host
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Expired | Medium | Unassigned |
Landscape Server | Invalid | Undecided | Unassigned |
MAAS | Invalid | Undecided | Unassigned |
nagios (Juju Charms Collection) | New | Undecided | Unassigned |
Bug Description
After deploying OpenStack with Landscape Autopilot, one of the hosts fails every check in Nagios with a "CHECK_NRPE: Error - Could not complete SSL handshake" error. This happens in every Autopilot deployment, and always for the same node.
The issue is caused by Nagios having the wrong IP address for that host in /etc/nagios3/
The host entry in /etc/nagios3/
define host {
use generic-host
statusmap_image base/ubuntu.gd2
icon_image_alt Ubuntu Linux
vrml_image ubuntu.png
host_name <REMOVED - hostname of the host that fails>
icon_image base/ubuntu.png
address <REMOVED - private IP of the host>
}
Here is another entry from a similar host that works fine in Nagios:
define host {
use generic-host
statusmap_image base/ubuntu.gd2
icon_image_alt Ubuntu Linux
vrml_image ubuntu.png
host_name <REMOVED - hostname of a host that works>
icon_image base/ubuntu.png
address <REMOVED - FQDN of the host in the form hostname.maas>
}
hostname.maas points to the public IP of the host instead of the private IP.
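To spot entries like the failing one, a small script can list the host definitions whose address is a literal IP rather than a name. This is only a rough sketch, assuming the host entries look like the two shown above; the function names and the sample hostnames and IP used with it are hypothetical and are not taken from the real deployment.

```python
import ipaddress
import re

# Matches one "define host { ... }" block and captures its body.
HOST_BLOCK = re.compile(r"define\s+host\s*\{(.*?)\}", re.DOTALL)


def parse_hosts(config_text):
    """Return one dict of directives per 'define host' block."""
    hosts = []
    for body in HOST_BLOCK.findall(config_text):
        directives = {}
        for line in body.splitlines():
            # Each directive is "key value"; the value may contain spaces.
            parts = line.strip().split(None, 1)
            if len(parts) == 2:
                directives[parts[0]] = parts[1].strip()
        hosts.append(directives)
    return hosts


def is_ip_literal(value):
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def hosts_with_ip_address(config_text):
    """Names of hosts whose 'address' is a raw IP instead of a hostname."""
    return [
        host.get("host_name")
        for host in parse_hosts(config_text)
        if is_ip_literal(host.get("address", ""))
    ]
```

Feeding it the contents of the Nagios host configuration would return the host_name of every entry that uses a raw IP, which in the case described here should be the one failing node.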
The workaround is to change the address line of the bad host entry to <hostname>.maas, which resolves to the public IP, and then restart Nagios.
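The manual edit in the workaround could be scripted roughly as follows. This is a sketch under assumptions, not a tested fix: set_host_address is a hypothetical helper, it only transforms the configuration text it is given, and reading or writing the actual file and restarting Nagios are left to the operator.

```python
import re


def set_host_address(config_text, host_name, new_address):
    """Return config_text with the 'address' of the named host replaced.

    Only the 'define host' block whose host_name matches exactly is changed;
    all other blocks are returned untouched.
    """
    host_line = re.compile(
        r"^[ \t]*host_name[ \t]+" + re.escape(host_name) + r"[ \t]*$",
        re.MULTILINE,
    )
    address_line = re.compile(r"^([ \t]*address[ \t]+)\S+[ \t]*$", re.MULTILINE)

    def fix_block(match):
        block = match.group(0)
        if not host_line.search(block):
            return block
        # A function replacement avoids backslash handling in new_address.
        return address_line.sub(lambda m: m.group(1) + new_address, block)

    return re.sub(r"define\s+host\s*\{.*?\}", fix_block, config_text, flags=re.DOTALL)
```

For example, set_host_address(text, "node1", "node1.maas") would rewrite only the entry for a host named node1 (a placeholder name) so that its address becomes the .maas FQDN.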
We suspect that MAAS or Juju is not doing the right thing somewhere which leads to an incorrect entry in /etc/nagios3/
information type: | Proprietary → Private |
Changed in landscape: | |
status: | New → Invalid |
tags: | added: kanban-cross-team landscape |
tags: | removed: kanban-cross-team |
information type: | Private → Public |
Changed in juju-core: | |
status: | New → Triaged |
importance: | Undecided → Medium |
milestone: | none → 2.1.0 |
affects: | juju-core → juju |
Changed in juju: | |
milestone: | 2.1.0 → none |
milestone: | none → 2.1.0 |
Changed in juju: | |
milestone: | 2.1-rc2 → none |
MAAS doesn't really differentiate between public and private networks. We always create a DNS record as <hostname>.<domain> for the PXE interface. Starting with 2.0+, new records will be created as <ethX>.<hostname>.<domain>, so it is Juju that should decide which interface it wants to connect to, and grab the correct hostname/domain/IP for it.
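To make the naming scheme in this comment concrete, here is a tiny illustration of the record names it describes. It is based only on the wording above and does not call or reproduce any MAAS API; maas_record_names, the "pxe" key, and the sample values are hypothetical.

```python
def maas_record_names(hostname, domain, interfaces):
    """Build the DNS names described in the comment.

    The PXE interface gets <hostname>.<domain>; each listed interface gets
    <ethX>.<hostname>.<domain>, as the comment says for MAAS 2.0+.
    """
    names = {"pxe": f"{hostname}.{domain}"}
    for iface in interfaces:
        names[iface] = f"{iface}.{hostname}.{domain}"
    return names
```

With a host named node1 in the maas domain and interfaces eth0 and eth1, this yields node1.maas for the PXE interface plus eth0.node1.maas and eth1.node1.maas, and a consumer such as Juju would pick the name for the interface it actually needs.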