Comment 0 for bug 1962023

Alexander Balderson (asbalderson) wrote :

On a deployment of OpenStack with HA Vault, two etcd units are stuck in the "Errored with 0 known peers" status, and their logs report:

2022-02-22 05:48:20 WARNING unit.etcd/0.update-status logger.go:60 Error: open /var/snap/etcd/common/server.crt: no such file or directory
2022-02-22 05:48:20 ERROR unit.etcd/0.juju-log server.go:327 ['/snap/bin/etcd.etcdctl', 'cluster-health']
2022-02-22 05:48:20 ERROR unit.etcd/0.juju-log server.go:327 {'ETCDCTL_API': '2', 'ETCDCTL_CA_FILE': '/var/snap/etcd/common/ca.crt', 'ETCDCTL_CERT_FILE': '/var/snap/etcd/common/server.crt', 'ETCDCTL_KEY_FILE': '/var/snap/etcd/common/server.key'}
2022-02-22 05:48:20 ERROR unit.etcd/0.juju-log server.go:327 b''
2022-02-22 05:48:20 ERROR unit.etcd/0.juju-log server.go:327 None
2022-02-22 05:48:20 WARNING unit.etcd/0.juju-log server.go:327 Notice: Unit failed cluster-health check
2022-02-22 05:48:20 WARNING unit.etcd/0.update-status logger.go:60 open /var/snap/etcd/common/server.crt: no such file or directory
2022-02-22 05:48:20 ERROR unit.etcd/0.juju-log server.go:327 ['/snap/bin/etcd.etcdctl', 'member', 'list']
2022-02-22 05:48:20 ERROR unit.etcd/0.juju-log server.go:327 {'ETCDCTL_API': '2', 'ETCDCTL_CA_FILE': '/var/snap/etcd/common/ca.crt', 'ETCDCTL_CERT_FILE': '/var/snap/etcd/common/server.crt', 'ETCDCTL_KEY_FILE': '/var/snap/etcd/common/server.key'}
2022-02-22 05:48:20 ERROR unit.etcd/0.juju-log server.go:327 b''
2022-02-22 05:48:20 ERROR unit.etcd/0.juju-log server.go:327 None
2022-02-22 05:48:20 INFO unit.etcd/0.juju-log server.go:327 Invoking reactive handler: reactive/etcd.py:139:set_app_version
2022-02-22 05:48:20 INFO unit.etcd/0.juju-log server.go:327 Invoking reactive handler: reactive/etcd.py:153:prepare_tls_certificates
2022-02-22 05:48:21 INFO unit.etcd/0.juju-log server.go:327 Invoking reactive handler: reactive/etcd.py:264:set_db_ingress_address
2022-02-22 05:48:21 INFO unit.etcd/0.juju-log server.go:327 Invoking reactive handler: reactive/etcd.py:271:send_cluster_connection_details
2022-02-22 05:48:21 INFO unit.etcd/0.juju-log server.go:327 Invoking reactive handler: hooks/relations/tls-certificates/requires.py:79:joined:certificates
2022-02-22 05:48:21 INFO unit.etcd/0.juju-log server.go:327 status-set: active: Errored with 0 known peers

The third unit (etcd/2) reports that its registration is failing, with a similar error:

2022-02-22 05:50:53 WARNING unit.etcd/2.update-status logger.go:60 client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint https://10.246.164.239:2379 exceeded header timeout
2022-02-22 05:50:53 WARNING unit.etcd/2.update-status logger.go:60
2022-02-22 05:50:53 ERROR unit.etcd/2.juju-log server.go:327 ['/snap/bin/etcd.etcdctl', '--endpoint', 'https://10.246.164.239:2379', 'member', 'list']
2022-02-22 05:50:53 ERROR unit.etcd/2.juju-log server.go:327 {'ETCDCTL_API': '2', 'ETCDCTL_CA_FILE': '/var/snap/etcd/common/ca.crt', 'ETCDCTL_CERT_FILE': '/var/snap/etcd/common/server.crt', 'ETCDCTL_KEY_FILE': '/var/snap/etcd/common/server.key'}
2022-02-22 05:50:53 ERROR unit.etcd/2.juju-log server.go:327 b''
2022-02-22 05:50:53 ERROR unit.etcd/2.juju-log server.go:327 None
2022-02-22 05:50:53 INFO unit.etcd/2.juju-log server.go:327 etcdctl.register failed, will retry
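
For triage, here is a minimal sketch of a check that could be run on each etcd unit to see which of the TLS files referenced in the etcdctl environment above are actually present. It is not part of the charm: the helper name missing_tls_files is hypothetical, and the paths are simply copied from the log lines in this report.

```python
import os

# Environment copied from the etcdctl invocations logged above.
ETCDCTL_ENV = {
    "ETCDCTL_API": "2",
    "ETCDCTL_CA_FILE": "/var/snap/etcd/common/ca.crt",
    "ETCDCTL_CERT_FILE": "/var/snap/etcd/common/server.crt",
    "ETCDCTL_KEY_FILE": "/var/snap/etcd/common/server.key",
}


def missing_tls_files(env):
    """Return {variable: path} for each *_FILE entry whose path is not a file.

    Hypothetical helper for illustration only; the charm does not ship it.
    """
    return {
        name: path
        for name, path in env.items()
        if name.endswith("_FILE") and not os.path.isfile(path)
    }


if __name__ == "__main__":
    # Print one line per missing file; print nothing if all are present.
    for name, path in missing_tls_files(ETCDCTL_ENV).items():
        print(f"{name}: {path} is missing")
```

If this reports server.crt as missing on etcd/0, that would match the "no such file or directory" warning in the first log excerpt and suggest the unit never received its certificate.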