need extended relation checking for nrpe charm

Bug #1805701 reported by Drew Freiberger
Affects     Status         Importance   Assigned to   Milestone
Juju Lint   Fix Released   Medium       Unassigned

Bug Description

Currently, the canonical-openstack-rules.yaml rules in juju-lint check that nrpe exists on each physical and container machine, but they do not check that nrpe is related to all services on those machines. Where we have smooshed both nova-compute and ceph-osd onto the same metal, we need to ensure that nrpe is related to both ceph-osd and nova-compute, as relating to only one or the other leaves us with missing monitors. The nrpe relation checking needs an expanded methodology that finds any charm with an "nrpe-external-master" relation interface and ensures nrpe is related to that interface.

As an example, here are the two applications and their subordinates on machine 2. You can see that ceph-osd has no nrpe subordinate, which means that the ceph-osd service checks will not be dropped into the nagios environment.

ceph-osd/2 active idle 2 10.55.0.198 Unit is ready (2 OSD)
  clamav/7 active idle 10.55.0.198

nova-compute-kvm/0 active idle 2 10.55.0.198 Unit is ready
  ceilometer-agent/4 active idle 10.55.0.198 Unit is ready
  filebeat/8 active idle 10.55.0.198 Filebeat ready.
  landscape-client/18 active idle 10.55.0.198 System successfully registered
  lldpd/14 active idle 10.55.0.198 LLDP daemon running
  neutron-openvswitch/5 active idle 10.55.0.198 Unit is ready
  nrpe-host/5 active idle 10.55.0.198 icmp,5666/tcp ready
  ntp/1 active idle 10.55.0.198 123/udp Ready
  telegraf/22 active idle 10.55.0.198 9103/tcp Monitoring nova-compute-kvm/0
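
The check being asked for can be sketched in a few lines of Python. This is only an illustration, not juju-lint code: the `applications` dictionary is a simplified, hypothetical model of a deployment (the keys `charm`, `endpoints`, and `relations` are assumptions made for this sketch), and the endpoint name `nrpe-external-master` is taken from the comments on this bug.

```python
NRPE_ENDPOINT = "nrpe-external-master"


def missing_nrpe_relations(applications):
    """Return applications that expose NRPE_ENDPOINT but have no nrpe app related on it."""
    # Every application deployed from the nrpe charm, whatever it is named.
    nrpe_apps = {name for name, app in applications.items() if app["charm"] == "nrpe"}
    missing = []
    for name, app in sorted(applications.items()):
        if name in nrpe_apps or NRPE_ENDPOINT not in app["endpoints"]:
            continue
        related = set(app["relations"].get(NRPE_ENDPOINT, []))
        if not related & nrpe_apps:
            missing.append(name)
    return missing


# Hypothetical model of machine 2 from the status output above.
applications = {
    "ceph-osd": {
        "charm": "ceph-osd",
        "endpoints": ["nrpe-external-master"],
        "relations": {},
    },
    "nova-compute-kvm": {
        "charm": "nova-compute",
        "endpoints": ["nrpe-external-master"],
        "relations": {"nrpe-external-master": ["nrpe-host"]},
    },
    "nrpe-host": {
        "charm": "nrpe",
        "endpoints": ["nrpe-external-master"],
        "relations": {"nrpe-external-master": ["nova-compute-kvm"]},
    },
}

result = missing_nrpe_relations(applications)
print(result)
```

With this sample data the function reports only ceph-osd, which matches the gap visible in the status output.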

A given charm can be deployed under many application names (hacluster, for example), and nrpe needs to be related to both the principal service and the hacluster subordinate to configure check_crm_status. We will therefore need a way to list the charms, not the application names, that should be related to nrpe.

As an example:

required-relations:
- - nrpe
  - hacluster

This should check for all (cs|local):(.*/)*hacluster charm applications being related to any nrpe-charm application, whether nrpe-container, nrpe-lxd, nrpe-host, or nrpe-physical.
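
A rough Python sketch of that charm-name matching follows. It is not the juju-lint implementation: the regular expression is an adaptation of the pattern above that also strips a trailing revision number, and the `related-to` key and the sample application names are hypothetical.

```python
import re

# Adapted from the (cs|local):(.*/)*hacluster pattern: skip the store prefix and
# any owner/series path segments, then drop an optional "-<revision>" suffix.
CHARM_URL = re.compile(r"^(?:cs|local):(?:[^/]+/)*(?P<name>[^/]+?)(?:-\d+)?$")


def charm_name(url):
    """Return the bare charm name for a charm URL, or the input if it does not match."""
    match = CHARM_URL.match(url)
    return match.group("name") if match else url


def unmet_required_relations(applications, required):
    """For each [charm_a, charm_b] pair, list charm_b applications not related to any charm_a application."""
    by_charm = {}
    for app, info in applications.items():
        by_charm.setdefault(charm_name(info["charm"]), set()).add(app)
    unmet = []
    for charm_a, charm_b in required:
        a_apps = by_charm.get(charm_a, set())
        for app in sorted(by_charm.get(charm_b, set())):
            if not a_apps & set(applications[app]["related-to"]):
                unmet.append((app, charm_a))
    return unmet


# Hypothetical deployment: two hacluster applications, only one related to nrpe.
applications = {
    "hacluster-keystone": {"charm": "cs:hacluster-72", "related-to": ["keystone", "nrpe-container"]},
    "hacluster-nova": {"charm": "cs:~openstack-charmers/hacluster-50", "related-to": ["nova-cloud-controller"]},
    "nrpe-container": {"charm": "cs:nrpe-60", "related-to": ["hacluster-keystone"]},
}
required = [["nrpe", "hacluster"]]  # same shape as the required-relations example

unmet = unmet_required_relations(applications, required)
print(unmet)
```

Keying the rule on the charm name means the rule does not need to know whether the nrpe application is called nrpe-container, nrpe-lxd, nrpe-host, or nrpe-physical.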

Drew Freiberger (afreiberger) wrote :

It was discovered recently that ntp was not related to nrpe in any of the FCB skus. This check will also be useful to verify that nrpe and ntp have a required relation fulfilled.

James Hebden (ec0) wrote :

Part of this is now implemented, in that you can write rules which simply target the charm name rather than the unit name. So, if rules were added for these relations, they could simply ensure that all deployed "NTP" units (regardless of name) are related to an NRPE unit. Or, more specifically, the hacluster/NRPE example provided should be possible now.

Currently the linting doesn't actually check that relations between services exist at all, so this would need to be added in order to make the rest of the assertions being suggested here.

Changed in juju-lint:
importance: Undecided → Medium
Drew Freiberger (afreiberger) wrote :

During implementation, it may be nice to have both "required relations" (nrpe:nrpe-external-master to various charms) and "forbidden relations" (like filebeat<->elasticsearch).
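
One way to picture the two rule types is the small Python sketch below. It is only an illustration under assumed inputs: relations are modelled as plain pairs of names, and the function name and message wording are made up for this example rather than taken from juju-lint.

```python
def check_relations(relations, required=(), forbidden=()):
    """Return error strings for missing required and present forbidden relations."""
    # A relation is treated as unordered, so (a, b) and (b, a) are the same.
    existing = {frozenset(pair) for pair in relations}
    errors = []
    for a, b in required:
        if frozenset((a, b)) not in existing:
            errors.append(f"missing required relation: {a} <-> {b}")
    for a, b in forbidden:
        if frozenset((a, b)) in existing:
            errors.append(f"forbidden relation present: {a} <-> {b}")
    return errors


# Hypothetical model: nrpe is related to ntp but not to ceph-osd.
relations = [("ntp", "nrpe"), ("filebeat", "elasticsearch")]
errors = check_relations(
    relations,
    required=[("nrpe", "ntp"), ("nrpe", "ceph-osd")],
    forbidden=[("filebeat", "elasticsearch")],
)
for line in errors:
    print(line)
```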

Eric Chen (eric-chen)
tags: added: bseng-199
Gabriel Cocenza (gabrielcocenza) wrote :

Juju-lint[0] is now able to check that every application (principal or subordinate) that provides the `nrpe-external-master` endpoint is related to nrpe. Moreover, we can now also identify machines that don't have nrpe because there is no principal charm on the machine.

In the case specified, juju-lint will warn about the missing relations among ceph-osd, nova-compute and nrpe. The same is true for hacluster, because this subordinate provides the `nrpe-external-master` endpoint.

[0] https://code.launchpad.net/~gabrielcocenza/juju-lint/+git/juju-lint/+merge/427991

Changed in juju-lint:
milestone: none → 1.0.4
status: New → Fix Committed
Changed in juju-lint:
status: Fix Committed → Fix Released