Etcd Charm

Provide an action to recover from a majority failure

Bug #1842332 reported by Andrea Ieri on 2019-09-02

This bug affects 2 people

Affects	Status	Importance	Assigned to	Milestone
Etcd Charm	In Progress	Medium	Justin Clark	Etcd Charm 1.29

Bug Description

An HA etcd cluster can normally be scaled down to a single node by simply removing the extra units. However, if a majority of the units has to be force-removed, the relation-departed hooks never get a chance to run, and the surviving unit(s) will not accept new cluster members.
To recover from this situation, etcd has to be restarted once on the surviving unit with the force-new-cluster option set to true. This should be wrapped in an action.

Example: let's assume we have a 3-node etcd cluster where etcd/0 is functional, while etcd/1 and etcd/2 are unrecoverable. To bring the cluster back to health, an operator needs to do the following:

1. juju remove-unit --force etcd/1
2. juju remove-unit --force etcd/2
3. vim /var/snap/etcd/common/etcd.conf.yml # set 'force-new-cluster:' to true
4. service snap.etcd.etcd restart
5. vim /var/snap/etcd/common/etcd.conf.yml # set 'force-new-cluster:' to false
6. juju add-unit -n2 etcd

Steps 3 to 6 should be performed by an action.
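
As a rough illustration of what such an action might run on the surviving unit, here is a minimal shell sketch of steps 3 to 5. It is not the charm's actual implementation. The function names set_force_new_cluster and recover_single_node are hypothetical, and the sketch assumes that force-new-cluster is a top-level key at the start of a line in etcd.conf.yml. Step 6 (juju add-unit) is a juju client command and is left to the operator here.

```sh
#!/bin/sh
# Hypothetical helper (not part of the etcd charm): set the top-level
# force-new-cluster key in an etcd config file to the given value.
# Usage: set_force_new_cluster <config-file> <true|false>
set_force_new_cluster() {
    fnc_conf="$1"
    fnc_value="$2"
    if grep -q '^force-new-cluster:' "$fnc_conf"; then
        # Key present: rewrite it via a temp file, then copy the result back
        # so the original file keeps its ownership and permissions.
        fnc_tmp="$(mktemp)"
        sed "s/^force-new-cluster:.*/force-new-cluster: ${fnc_value}/" \
            "$fnc_conf" > "$fnc_tmp"
        cat "$fnc_tmp" > "$fnc_conf"
        rm -f "$fnc_tmp"
    else
        # Key absent: append it.
        printf 'force-new-cluster: %s\n' "$fnc_value" >> "$fnc_conf"
    fi
}

# Hypothetical wrapper for steps 3 to 5 of the manual procedure.
# The default path and service name are the ones quoted in this report.
recover_single_node() {
    rc_conf="${1:-/var/snap/etcd/common/etcd.conf.yml}"
    set_force_new_cluster "$rc_conf" true      # step 3
    service snap.etcd.etcd restart             # step 4
    rc_status=$?
    # Step 5: always reset the flag so a later restart does not
    # force a new cluster again.
    set_force_new_cluster "$rc_conf" false
    return "$rc_status"
}
```

Sourcing this file only defines the functions; nothing is changed until recover_single_node is called as root on the surviving unit, after the failed units have been force-removed.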

George Kraft (cynerva) on 2020-06-26

Changed in charm-etcd:
importance:	Undecided → Medium
status:	New → Triaged

Tim Van Steenburgh (tvansteenburgh) on 2020-08-04

Changed in charm-etcd:
assignee:	nobody → Justin Clark (justinclark)
status:	Triaged → In Progress

Adam Dyess (addyess) wrote on 2023-08-02:

Adding a link to the PR which was started to address this
https://github.com/charmed-kubernetes/layer-etcd/pull/177

Changed in charm-etcd:
milestone:	none → 1.28

Adam Dyess (addyess) on 2023-08-02

Changed in charm-etcd:
milestone:	1.28 → 1.28+ck1

Adam Dyess (addyess) on 2023-09-19

Changed in charm-etcd:
milestone:	1.28+ck1 → 1.29
