etcd remains unhealthy after unit removal
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Etcd Charm | Fix Released | Undecided | Berkay Tekin Öz |
Bug Description
Removing any unit (leader or not) from etcd leaves etcd stuck in an unhealthy state. The main cause seems to be that the etcd peer list is not updated when a unit is removed, so the removed units remain in the cluster as dangling, unreachable peers.
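To make the suspected cause concrete, here is a minimal sketch of how dangling peers could be spotted: compare the members the cluster still has registered against the units that still exist. This is not the charm's actual code; the function name, unit names, and peer URLs are all hypothetical and chosen only for illustration.

```python
def find_dangling_peers(registered_members, live_units):
    """Return the registered members that no longer map to a live unit.

    registered_members: dict of member name -> peer URL (hypothetical shape).
    live_units: iterable of unit names that still exist after removal.
    """
    live = set(live_units)
    return sorted(name for name in registered_members if name not in live)


# Hypothetical data: three members are registered, but etcd/2 was removed.
registered = {
    "etcd/0": "https://10.0.0.10:2380",
    "etcd/1": "https://10.0.0.11:2380",
    "etcd/2": "https://10.0.0.12:2380",
}
live = ["etcd/0", "etcd/1"]

print(find_dangling_peers(registered, live))  # ['etcd/2']
```

Any member reported here would still be contacted by the remaining peers, which matches the unreachable-peer symptom described above.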
Steps to reproduce:
1. Deploy easyrsa with `juju deploy cs:~containers/
2. Deploy etcd with `juju deploy cs:~containers/
3. Relate etcd and easyrsa with `juju add-relation etcd easyrsa`
4. Add 2 more etcd units with `juju add-unit -n 2 etcd`
5. Remove a unit from etcd with `juju remove-unit etcd/2`
Some related logs can be seen below:
unit-etcd-1: 21:02:16 INFO unit.etcd/
unit-etcd-1: 21:02:18 ERROR unit.etcd/
unit-etcd-1: 21:02:18 ERROR unit.etcd/
unit-etcd-1: 21:02:18 ERROR unit.etcd/
unit-etcd-1: 21:02:18 ERROR unit.etcd/
unit-etcd-1: 21:02:18 WARNING unit.etcd/
tags: added: backport-needed
tags: removed: backport-needed
Changed in charm-etcd:
status: Fix Committed → Fix Released
PR: https://github.com/charmed-kubernetes/layer-etcd/pull/196