very slow juju unit removals for nova compute charm
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Nova Compute Charm | New | Undecided | Unassigned | | |
Bug Description
Juju charm version
nova-compute-kvm nova-compute jujucharms 314 ubuntu
Deleting the only two units on a hyperconverged node (not including subordinate charms):
juju remove-unit nova-compute-kvm/#
juju remove-unit ceph-osd-ssd/#
All other charms are removed within 45 minutes or so, but nova-compute is taking longer than 3 hours to remove.
This cloud has a total of 43 compute nodes.
The nodes have 40 cores and about 1 TB of RAM, so node performance is not a concern.
I was able to capture the entire removal process in the unit's Juju logs and have attached them for review.
----
From what I can tell, the ceph-mon relation (unit ceph-mon/6), for example, triggers more than once.
On the nova-compute unit:
2021-03-29 13:40:40 DEBUG juju.machinelock machinelock.go:172 machine lock acquired for nova-compute-kvm/52 uniter (run relation-changed (96; unit: ceph-mon/6) hook)
2021-03-29 13:40:40 DEBUG juju.worker.
2021-03-29 13:40:45 DEBUG juju.worker.
2021-03-29 13:41:12 DEBUG juju.worker.
2021-03-29 13:41:21 DEBUG juju.machinelock machinelock.go:186 machine lock released for nova-compute-kvm/52 uniter (run relation-changed (96; unit: ceph-mon/6) hook)
2021-03-29 15:29:08 DEBUG juju.worker.
2021-03-29 15:29:08 DEBUG juju.machinelock machinelock.go:162 acquire machine lock for nova-compute-kvm/52 uniter (run relation-changed (96; unit: ceph-mon/6) hook)
2021-03-29 15:29:08 DEBUG juju.machinelock machinelock.go:172 machine lock acquired for nova-compute-kvm/52 uniter (run relation-changed (96; unit: ceph-mon/6) hook)
2021-03-29 15:29:08 DEBUG juju.worker.
2021-03-29 15:29:10 DEBUG juju.worker.
2021-03-29 15:29:32 DEBUG juju.worker.
2021-03-29 15:29:38 DEBUG juju.machinelock machinelock.go:186 machine lock released for nova-compute-kvm/52 uniter (run relation-changed (96; unit: ceph-mon/6) hook)
2021-03-29 19:21:37 DEBUG juju.worker.
2021-03-29 19:21:37 DEBUG juju.machinelock machinelock.go:162 acquire machine lock for nova-compute-kvm/52 uniter (run relation-departed (96; unit: ceph-mon/6, departee: nova-compute-
2021-03-29 19:21:37 DEBUG juju.machinelock machinelock.go:172 machine lock acquired for nova-compute-kvm/52 uniter (run relation-departed (96; unit: ceph-mon/6, departee: nova-compute-
2021-03-29 19:21:37 DEBUG juju.worker.
2021-03-29 19:21:38 DEBUG juju.worker.
2021-03-29 19:21:39 DEBUG juju.worker.
2021-03-29 19:21:41 DEBUG juju.machinelock machinelock.go:186 machine lock released for nova-compute-kvm/52 uniter (run relation-departed (96; unit: ceph-mon/6, departee: nova-compute-
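To make the timing in an excerpt like the one above easier to read, here is a minimal Python sketch that pairs the "machine lock acquired" and "machine lock released" lines and reports how long each hook held the lock. The function name `lock_hold_times` and the regular expression are my own assumptions based only on the line format shown in this report; other Juju versions may log differently.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the excerpt in this report:
# "<date> <time> DEBUG juju.machinelock machinelock.go:NNN machine lock
#  acquired|released for <unit> uniter (run <hook-name> ..."
LOCK_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) DEBUG juju\.machinelock \S+ "
    r"machine lock (?P<event>acquired|released) for (?P<unit>\S+) uniter "
    r"\(run (?P<hook>[\w-]+)"
)


def lock_hold_times(lines):
    """Return a list of (hook, acquired_at, seconds_held) tuples.

    Lines that do not match the assumed format (including the
    "acquire machine lock for" request lines) are skipped.
    """
    pending = {}   # (unit, hook) -> timestamp of the last "acquired" line
    results = []
    for line in lines:
        m = LOCK_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        key = (m["unit"], m["hook"])
        if m["event"] == "acquired":
            pending[key] = ts
        elif key in pending:
            start = pending.pop(key)
            results.append((m["hook"], start, (ts - start).total_seconds()))
    return results
```

Feeding it the lines of a unit log (for example, `lock_hold_times(open(path))` with a path of your choosing) gives one tuple per completed hook run. Comparing the hold times with the gaps between consecutive `acquired_at` values may help show whether the time is spent inside the hooks or waiting between them.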
On ceph unit 6, I see a good number of relation-changed events:
ubuntu@
20050
2021-03-26 15:44:16 DEBUG jujuc server.go:211 running hook tool "relation-get" for ceph-mon/
--
zgrep relation-joined unit-ceph-
9327
2021-03-26 14:16:50 DEBUG jujuc server.go:211 running hook tool "relation-get" for ceph-mon/
On the actual nova-compute node, we have these counts for the various hooks that are run:
fgrep -c relation-joined *.log
unit-ceph-
unit-clamav-
unit-filebeat-
unit-hw-
unit-landscape-
unit-lldpd-
unit-neutron-
unit-nova-
unit-nrpe-
unit-nrpe-
unit-ntp-
relation-changed
unit-neutron-
unit-nova-
unit-nrpe-
unit-nrpe-
unit-ntp-
relation-departed
unit-nova-
unit-nrpe-
unit-nrpe-
unit-ntp-
config-changed
unit-nova-
unit-nrpe-
unit-nrpe-
unit-ntp-832.log:0
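The per-hook counts above were gathered with one `fgrep -c` (or `zgrep`) run per hook name. As a convenience, the following Python sketch collects all four counts per log file in a single pass. It is only an illustration: the helper names `count_hooks_in_lines` and `count_hooks` are hypothetical, and, like `fgrep -c`, it counts lines containing the hook name as a plain substring, not actual hook executions.

```python
import gzip
from collections import Counter

# Hook names counted in this report.
HOOKS = ("relation-joined", "relation-changed", "relation-departed", "config-changed")


def count_hooks_in_lines(lines, hooks=HOOKS):
    """Count, per hook name, the lines that contain it (fgrep -c semantics)."""
    counts = Counter({hook: 0 for hook in hooks})
    for line in lines:
        for hook in hooks:
            if hook in line:
                counts[hook] += 1
    return counts


def count_hooks(path, hooks=HOOKS):
    """Same count for a log file; rotated .gz logs are opened with gzip."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", errors="replace") as fh:
        return count_hooks_in_lines(fh, hooks)
```

Running `count_hooks` over each unit log in the Juju log directory would give a table comparable to the lists above, which makes it easier to spot which unit sees a disproportionate number of relation events.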
tags: | added: scaleback |
I have attached the nova log which contains the removal logs.