when using etcd snap under charm, root can't run cluster health check

Bug #1809386 reported by Tim Van Steenburgh
This bug affects 2 people
Affects: Etcd Charm
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

Opened by afreiberger on 2018-10-16 21:00:34+00:00 at https://github.com/juju-solutions/layer-etcd/issues/140

------------------------------------------------------------

I'm not sure if this is expected behavior, but when running the etcd snap under the cs:etcd charm, etcdctl commands run as root (via sudo) fail because the CA cert is not trusted in the root environment. The ubuntu user on the same unit, however, can query cluster-health just fine.
```
ubuntu@juju-7e2a4a-15-lxd-7:/var/snap/etcd/common$ sudo etcd.etcdctl cluster-health
failed to check the health of member dfcad738516bf405 on https://10.10.0.115:2379: Get https://10.10.0.115:2379/health: x509: certificate signed by unknown authority
member dfcad738516bf405 is unreachable: [https://10.10.0.115:2379] are all unreachable
failed to check the health of member ffa62da5d402d749 on https://10.10.0.116:2379: Get https://10.10.0.116:2379/health: x509: certificate signed by unknown authority
member ffa62da5d402d749 is unreachable: [https://10.10.0.116:2379] are all unreachable
cluster is unhealthy
ubuntu@juju-7e2a4a-15-lxd-7:/var/snap/etcd/common$ etcd.etcdctl cluster-health
2018-10-16 20:57:11.980201 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
2018-10-16 20:57:11.982259 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
member dfcad738516bf405 is healthy: got healthy result from https://10.10.0.115:2379
member ffa62da5d402d749 is healthy: got healthy result from https://10.10.0.116:2379
cluster is healthy
```

====================== COMMENTS ============================

Comment created by paulgear on 2018-10-18 06:06:55+00:00

This seems to be caused by an environmental issue:
```
ubuntu@juju-0f73fb-stg-is-kubernetes-71:~$ sudo etcdctl cluster-health
sudo: unable to resolve host juju-0f73fb-stg-is-kubernetes-71
failed to check the health of member 4336bf307e4efba on https://10.25.127.55:2379: Get https://10.25.127.55:2379/health: x509: certificate signed by unknown authority
member 4336bf307e4efba is unreachable: [https://10.25.127.55:2379] are all unreachable
failed to check the health of member 12905b2a1e7bb8a7 on https://10.25.127.54:2379: Get https://10.25.127.54:2379/health: x509: certificate signed by unknown authority
member 12905b2a1e7bb8a7 is unreachable: [https://10.25.127.54:2379] are all unreachable
failed to check the health of member f02319eaab88c2a8 on https://10.25.127.53:2379: Get https://10.25.127.53:2379/health: x509: certificate signed by unknown authority
member f02319eaab88c2a8 is unreachable: [https://10.25.127.53:2379] are all unreachable
cluster is unhealthy
ubuntu@juju-0f73fb-stg-is-kubernetes-71:~$ sudo -i
sudo: unable to resolve host juju-0f73fb-stg-is-kubernetes-71
root@juju-0f73fb-stg-is-kubernetes-71:~# etcdctl cluster-health
member 4336bf307e4efba is healthy: got healthy result from https://10.25.127.55:2379
member 12905b2a1e7bb8a7 is healthy: got healthy result from https://10.25.127.54:2379
member f02319eaab88c2a8 is healthy: got healthy result from https://10.25.127.53:2379
cluster is healthy
```
It also shows up as incorrect charm status output: after an upgrade to this charm, status shows "Errored with 0 known peers", even though the `etcdctl` output above shows the cluster is clearly healthy.
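
Since plain `sudo` fails while a `sudo -i` login shell succeeds, one way to narrow down which variable matters is to compare the two environments. The following is only a sketch: `env_diff` is a hypothetical helper name, the `/tmp` file names in the usage comments are placeholders, and it assumes each variable fits on one line of `env` output.

```shell
# env_diff FILE1 FILE2
# Print the names of variables that are set in only one of two `env` dumps,
# or that have different values in each. Hypothetical helper, not part of
# the charm or of etcdctl.
env_diff() (
    # Fixed collation so sort and comm agree on ordering.
    LC_ALL=C
    export LC_ALL
    a=$(mktemp)
    b=$(mktemp)
    sort "$1" > "$a"
    sort "$2" > "$b"
    # comm -3 keeps lines unique to either file; strip the column indent
    # and the "=value" part so only variable names remain.
    comm -3 "$a" "$b" | sed 's/^[[:space:]]*//; s/=.*//' | sort -u
    rm -f "$a" "$b"
)

# Possible usage on the affected unit (not run here):
#   sudo env    > /tmp/env.sudo
#   sudo -i env > /tmp/env.login
#   env_diff /tmp/env.sudo /tmp/env.login
```

Any proxy-related or `ETCDCTL_`-prefixed names in the result would be the first candidates to look at.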

------------------------------------------------------------

Comment created by paulgear on 2018-10-19 03:09:34+00:00

I straced this and determined that in my case it's definitely the proxy setting per https://github.com/juju-solutions/layer-etcd/issues/122.
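
If a proxy setting is the cause, one thing worth checking is whether the etcd endpoint addresses are exempted from proxying in the failing environment. The sketch below is an assumption about how to check that, not something taken from the charm: `in_no_proxy` is a hypothetical helper, and it only does exact string matching, whereas real HTTP clients may also honor domain suffixes or other forms.

```shell
# in_no_proxy HOST LIST
# Return 0 if HOST appears verbatim in the comma-separated LIST
# (for example the value of $no_proxy), 1 otherwise.
# Exact matching only; hypothetical helper for illustration.
in_no_proxy() {
    # Remove spaces and wrap the list in commas so every entry,
    # including the first and last, is delimited on both sides.
    case ",$(printf '%s' "$2" | tr -d ' ')," in
        *",$1,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage with the endpoints from the original report (not run here):
#   for ip in 10.10.0.115 10.10.0.116; do
#       in_no_proxy "$ip" "${no_proxy:-${NO_PROXY:-}}" \
#           || echo "$ip is not listed in no_proxy"
#   done
```

Running the check both under `sudo` and in a `sudo -i` shell would show whether the exemption list differs between the two.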

tags: added: canonical-bootstack