We have correlated the apt update of the hacluster components with the start of the issue, when pacemaker loses quorum.

history.log:

Start-Date: 2019-05-09 11:56:50
Commandline: apt-get --assume-yes --option=Dpkg::Options::=--force-confold install crmsh corosync pacemaker ipmitool libmonitoring-plugin-perl python3-requests-oauthlib python3-libmaas
Install: corosync:amd64 (2.4.3-0ubuntu1), resource-agents:amd64 (1:4.1.0~rc1-1ubuntu1, automatic), libcmap4:amd64 (2.4.3-0ubuntu1, automatic), libb-hooks-op-check-perl:amd64 (0.22-1, automatic), python3-argcomplete:amd64 (1.8.1-1ubuntu1, automatic), python3-async-timeout:amd64 (2.0.0-1, automatic), libmodule-runtime-perl:amd64 (0.016-1, automatic), libmath-calc-units-perl:amd64 (1.07-1, automatic), libparams-validate-perl:amd64 (1.29-1, automatic), python3-terminaltables:amd64 (3.1.0-2, automatic), libxml2-utils:amd64 (2.9.4+dfsg1-6.1ubuntu1.2, automatic), openhpid:amd64 (3.6.1-3.1build1, automatic), pacemaker:amd64 (1.1.18-0ubuntu1.1), libfreeipmi16:amd64 (1.4.11-1.1ubuntu4.1, automatic), python3-pymongo-ext:amd64 (3.6.1+dfsg1-1, automatic), libtry-tiny-perl:amd64 (0.30-1, automatic), pacemaker-resource-agents:amd64 (1.1.18-0ubuntu1.1, automatic), liblrmd1:amd64 (1.1.18-0ubuntu1.1, automatic), libdevel-callchecker-perl:amd64 (0.007-2build1, automatic), python3-libmaas:amd64 (0.6.0-0ubuntu1), libquorum5:amd64 (2.4.3-0ubuntu1, automatic), libcrmcommon3:amd64 (1.1.18-0ubuntu1.1, automatic), libesmtp6:amd64 (1.0.6-4.3build1, automatic), freeipmi-common:amd64 (1.4.11-1.1ubuntu4.1, automatic), libcrmcluster4:amd64 (1.1.18-0ubuntu1.1, automatic), libmodule-implementation-perl:amd64 (0.09-1, automatic), python3-colorclass:amd64 (2.2.0-2, automatic), libtotem-pg5:amd64 (2.4.3-0ubuntu1, automatic), libpe-rules2:amd64 (1.1.18-0ubuntu1.1, automatic), liblrm2:amd64 (1.0.12-7build1, automatic), libpengine10:amd64 (1.1.18-0ubuntu1.1, automatic), libtransitioner2:amd64 (1.1.18-0ubuntu1.1, automatic), librdmacm1:amd64
(17.1-1ubuntu0.1, automatic), libqb0:amd64 (1.0.1-1ubuntu1, automatic), python3-pymongo:amd64 (3.6.1+dfsg1-1, automatic), libdynaloader-functions-perl:amd64 (0.003-1, automatic), libltdl7:amd64 (2.4.6-2, automatic), ipmitool:amd64 (1.8.18-5ubuntu0.1), libconfig-tiny-perl:amd64 (2.23-1, automatic), libplumb2:amd64 (1.0.12-7build1, automatic), libnet1:amd64 (1.1.6+dfsg-3.1, automatic), python3-requests-oauthlib:amd64 (0.8.0-0.1), libparams-classify-perl:amd64 (0.015-1, automatic), libstatgrab10:amd64 (0.91-1build1, automatic), libplumbgpl2:amd64 (1.0.12-7build1, automatic), libsub-name-perl:amd64 (0.21-1build1, automatic), python3-multidict:amd64 (4.1.0-1, automatic), libvotequorum8:amd64 (2.4.3-0ubuntu1, automatic), crmsh:amd64 (3.0.1-3ubuntu1), python3-bson-ext:amd64 (3.6.1+dfsg1-1, automatic), python3-aiohttp:amd64 (3.0.1-1, automatic), python3-yarl:amd64 (1.1.0-1, automatic), libdbus-glib-1-2:amd64 (0.110-2, automatic), libopenhpi3:amd64 (3.6.1-3.1build1, automatic), python3-bson:amd64 (3.6.1+dfsg1-1, automatic), libcfg6:amd64 (2.4.3-0ubuntu1, automatic), libcrmservice3:amd64 (1.1.18-0ubuntu1.1, automatic), libpils2:amd64 (1.0.12-7build1, automatic), libstonith1:amd64 (1.0.12-7build1, automatic), libcib4:amd64 (1.1.18-0ubuntu1.1, automatic), pacemaker-cli-utils:amd64 (1.1.18-0ubuntu1.1, automatic), libmonitoring-plugin-perl:amd64 (0.39-1), libtimedate-perl:amd64 (2.3000-2, automatic), python3-gridfs:amd64 (3.6.1+dfsg1-1, automatic), python3-tz:amd64 (2018.3-2, automatic), cluster-glue:amd64 (1.0.12-7build1, automatic), openipmi:amd64 (2.0.22-1.1ubuntu2.1, automatic), xsltproc:amd64 (1.1.29-5ubuntu0.1, automatic), libopenipmi0:amd64 (2.0.22-1.1ubuntu2.1, automatic), libcpg4:amd64 (2.4.3-0ubuntu1, automatic), libpe-status10:amd64 (1.1.18-0ubuntu1.1, automatic), python-parallax:amd64 (1.0.3-1, automatic), libclass-accessor-perl:amd64 (0.51-1, automatic), libcorosync-common4:amd64 (2.4.3-0ubuntu1, automatic), libstonithd2:amd64 (1.1.18-0ubuntu1.1, automatic), 
pacemaker-common:amd64 (1.1.18-0ubuntu1.1, automatic)
End-Date: 2019-05-09 11:57:18

pacemaker log (the issue seems to have started around May 9th):

Set r/w permissions for uid=115, gid=121 on /var/log/pacemaker.log
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: crm_log_init: Changed active directory to /var/lib/pacemaker/cores
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: get_cluster_type: Detected an active 'corosync' cluster
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: error: sysrq_init: Cannot write to /proc/sys/kernel/sysrq: Permission denied (13)
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: qb_ipcs_us_publish: server name: pacemakerd
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk__ipc_is_authentic_process_active: Could not connect to lrmd IPC: Connection refused
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk__ipc_is_authentic_process_active: Could not connect to cib_ro IPC: Connection refused
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk__ipc_is_authentic_process_active: Could not connect to crmd IPC: Connection refused
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk__ipc_is_authentic_process_active: Could not connect to attrd IPC: Connection refused
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk__ipc_is_authentic_process_active: Could not connect to pengine IPC: Connection refused
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk__ipc_is_authentic_process_active: Could not connect to stonith-ng IPC: Connection refused
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: corosync_node_name: Unable to get node name for nodeid 2130706433
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: notice: get_node_name: Could not obtain a node name for corosync nodeid 2130706433
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: crm_get_peer: Created entry bf4a27d3-7b65-4b3b-a31f-d0c3cf3a923e/0x55d4e382f570 for node (null)/2130706433 (1 total)
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: crm_get_peer: Node 2130706433 has uuid 2130706433
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: crm_update_peer_proc: cluster_connect_cpg: Node (null)[2130706433] - corosync-cpg is now online
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: warning: cluster_connect_quorum: Quorum lost
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: corosync_node_name: Unable to get node name for nodeid 2130706433
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: notice: get_node_name: Defaulting to uname -n for the local corosync node name
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: crm_get_peer: Node 2130706433 is now known as juju-264ace-19-lxd-0
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk_quorum_notification: Quorum still lost | membership=4 members=1
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: notice: crm_update_peer_state_iter: Node juju-264ace-19-lxd-0 state is now member | nodeid=2130706433 previous=unknown source=pcmk_quorum_notification
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk_cpg_membership: Node 2130706433 joined group pacemakerd (counter=0.0, pid=32767, unchecked for rivals)
May 09 11:57:17 [36525] juju-264ace-19-lxd-0 pacemakerd: info: pcmk_cpg_membership: Node 2130706433 still member of group pacemakerd (peer=juju-264ace-19-lxd-0:36525, counter=0.0, at least once)

We have applied the following fix on the AODH cluster nodes so far:

echo 'APT::Periodic::Unattended-Upgrade "0";' > /etc/apt/apt.conf.d/90hacluster
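
The correlation between the apt run and the quorum loss was made by reading the two logs side by side. A minimal Python sketch of how that check could be repeated on other nodes is below. It assumes the log formats shown above; the function names `apt_window` and `quorum_events` and the 60-second slack are hypothetical and not part of apt or pacemaker. The pacemaker timestamps carry no year, so the sketch borrows it from the apt Start-Date, and it relies on English month abbreviations.

```python
from datetime import datetime, timedelta

APT_FMT = "%Y-%m-%d %H:%M:%S"


def apt_window(history_text):
    """Return (start, end) of the first apt transaction found in history.log text."""
    start = end = None
    for line in history_text.splitlines():
        key, _, value = line.partition(":")
        # Collapse runs of whitespace between the date and the time.
        value = " ".join(value.split())
        if key == "Start-Date" and start is None:
            start = datetime.strptime(value, APT_FMT)
        elif key == "End-Date" and end is None:
            end = datetime.strptime(value, APT_FMT)
    return start, end


def quorum_events(pacemaker_text, start, end, slack=timedelta(seconds=60)):
    """Return (timestamp, line) pairs mentioning 'Quorum' inside the apt window.

    The slack value is an arbitrary assumption, used to catch messages logged
    shortly after the transaction ends.
    """
    hits = []
    for line in pacemaker_text.splitlines():
        if "Quorum" not in line:
            continue
        try:
            # Pacemaker lines start with e.g. "May 09 11:57:17" (15 characters).
            stamp = datetime.strptime(
                f"{start.year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # Not a timestamped log line.
        if start <= stamp <= end + slack:
            hits.append((stamp, line))
    return hits
```

Feeding it the contents of /var/log/apt/history.log and /var/log/pacemaker.log from a node would list any quorum messages logged while the packages were being installed. An empty result would suggest that the quorum loss on that node is not tied to the apt run.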