remove HA etcd application in error state
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Etcd Charm | Triaged | Medium | Unassigned | |
Bug Description
With a 3-unit etcd HA deployment, `juju remove-application` removed 2 units and left the last one in an error state.
etcd 3.1.10 error 1 etcd jujucharms 434 ubuntu
etcd/2* error idle 2/lxd/1 10.244.245.235 2379/tcp hook failed: "cluster-
2019-07-05 05:11:39 DEBUG cluster- [... repeated DEBUG output from the cluster- hook, truncated in the original paste ...]
2019-07-05 05:11:39 ERROR juju.worker.
2019-07-05 05:11:39 DEBUG juju.machinelock machinelock.go:180 machine lock released for uniter (run relation-broken (3) hook)
summary:
- remove etcd application in error state
+ remove HA etcd application in error state
We've never encountered this, but I see how it could happen. The last unit is trying to unregister itself from a cluster that no longer exists. That happens here: https://github.com/charmed-kubernetes/layer-etcd/blob/aca040b46ac80e97da8ea3135b46216cf6bb854c/reactive/etcd.py#L598
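For illustration, a guarded handler along these lines would let the final unit tolerate the missing cluster instead of failing the hook. This is a minimal sketch, not the charm's actual code: the hook name, the hard-coded member ID, and the direct etcdctl call are assumptions standing in for whatever reactive/etcd.py does at that line.

```
# Hypothetical sketch of a defensive cluster-relation-broken handler.
# Names and values are illustrative, not taken from layer-etcd.
import subprocess

from charms.reactive import hook
from charmhelpers.core.hookenv import log


@hook('cluster-relation-broken')
def unregister():
    """Best-effort removal of this member when the peer relation goes away."""
    member_id = '8e9e05c52164694d'  # placeholder; the charm tracks the real ID
    try:
        # Ask the (possibly already torn down) cluster to drop this member.
        subprocess.check_call(['etcdctl', 'member', 'remove', member_id])
    except subprocess.CalledProcessError as err:
        # On the last unit of a removed application there is no cluster left
        # to answer, so log and return instead of leaving the unit in error.
        log('member remove failed, assuming cluster is gone: {}'.format(err))
```

The point of the sketch is only the try/except around the unregister call: when the whole application is being removed, a failed member removal on the final unit is expected and should not error the hook.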