Activity log for bug #2017617

Date | Who | What changed | Old value | New value | Message
2023-04-25 03:24:40 | Barry Price | bug | | | added bug
2023-04-25 03:25:39 | Barry Price | description | This is semi-related to LP:2012693. If we deploy four landscape-server units, let's call them landscape-server/0, /1, /2 and /3, then when Nagios check definitions are created, the package-search and package-upload checks are only created on the current leader. Let's say that's /0 in our example. Assuming a separate Nagios in a separate environment (so no CMRs), those checks are typically collected shortly after deploy time, and are not regularly refreshed. So if/when leadership changes from /0 to say /2 at some future point in time, the defined package-search/upload checks on /0 return "UNKNOWN" status in Nagios, while there are no defined checks for these services on /2 which results in outages not alerting. | This is semi-related to LP:2012693. If we deploy four landscape-server units, let's call them landscape-server/0, /1, /2 and /3, then when Nagios check definitions are created, the package-search and package-upload checks are only created on the current leader. Let's say that's /0 in our example. Assuming a separate Nagios in a separate environment (so no CMRs), those checks are typically collected shortly after deploy time, and are not regularly refreshed. So if/when leadership changes from /0 to say /2 at some future point in time, the defined package-search/upload checks on /0 return "UNKNOWN" status in Nagios, while there are no defined checks for these services on /2 which results in outages not alerting. Since UNKNOWN doesn't result in an alert being generated (at least in the config that I'm looking at), it seems like it would be safer to generate those checks on all units including non-leaders, or at least provide an option by which to do so. |
2023-04-25 03:25:47 | Barry Price | bug | | | added subscriber The Canonical Sysadmins
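
The behaviour described in the report, and the option it asks for, can be illustrated with a minimal sketch. This is not the charm's code: `LEADER_SERVICE_CHECKS`, `checks_for_unit`, `define_checks` and the `checks_on_all_units` flag are all hypothetical names chosen for illustration, and the model only covers which units get check definitions, not how Nagios collects them.

```python
# Hypothetical model of which units define the package-search and
# package-upload checks. None of these names come from the charm itself.
LEADER_SERVICE_CHECKS = ("package-search", "package-upload")


def checks_for_unit(is_leader, checks_on_all_units=False):
    """Return the check names one unit would define.

    With checks_on_all_units=False this mirrors the reported behaviour:
    only the current leader defines the checks.
    """
    if is_leader or checks_on_all_units:
        return list(LEADER_SERVICE_CHECKS)
    return []


def define_checks(units, leader, checks_on_all_units=False):
    """Map every unit name to the checks it would define."""
    return {
        unit: checks_for_unit(unit == leader, checks_on_all_units)
        for unit in units
    }


units = ["landscape-server/%d" % i for i in range(4)]

# Definitions as collected once, shortly after deploy, with /0 as leader.
collected = define_checks(units, leader="landscape-server/0")

# Leadership later moves to /2, but the collected definitions are not
# refreshed, so /2 still has no checks for these services.
new_leader = "landscape-server/2"
unmonitored = collected[new_leader] == []

# With the hypothetical option enabled, every unit defines the checks,
# so a leadership change leaves no unit without a definition.
collected_all = define_checks(
    units, leader="landscape-server/0", checks_on_all_units=True
)
```

In this model, `unmonitored` is true for the leader-only case, which is the gap the report describes; with the flag enabled, `collected_all` has both checks on all four units.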