Activity log for bug #1960656

Date Who What changed Old value New value Message
2022-02-11 16:44:39 Alexander Balderson bug added bug
2022-02-11 16:44:39 Alexander Balderson attachment added juju-crashdump-kubernetes-maas-2022-02-10-22.47.07.tar.gz https://bugs.launchpad.net/bugs/1960656/+attachment/5560413/+files/juju-crashdump-kubernetes-maas-2022-02-10-22.47.07.tar.gz
2022-02-11 16:45:48 Alexander Balderson description
Old value:
During a deployment of latest Kubernetes on baremetal SQA ran into an issue where all 3 etcd units came up and were marked as active/idle, but the syslog shows etcd rejecting every request that came in, including those from 127.0.0.1.

Here is a snip from the end of the log on etcd_2
Feb 10 22:25:34 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.168.186:53202" (error "EOF", ServerName "")
Feb 10 22:26:07 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.169.4:56646" (error "EOF", ServerName "")
Feb 10 22:26:43 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.169.209:45360" (error "EOF", ServerName "")
Feb 10 22:27:28 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.168.187:60292" (error "EOF", ServerName "")
Feb 10 22:27:29 juju-da6364-5-lxd-0 systemd[1]: Started snap.etcd.etcdctl.42388d27-d989-4115-8dfd-291f30bb6b6b.scope.
Feb 10 22:27:29 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "127.0.0.1:52626" (error "tls: first record does not look like a TLS handshake", ServerName "")

As a result 2 of the 3 vault units in the deployment were unable to connect to etcd and start the vault service, blocking the deployment.

The etcd units are running etcd 3.4/stable

I've attached the crashdump, but the testrun can be found at:
https://solutions.qa.canonical.com/testruns/testRun/fb27ca53-2c5c-4ffe-9e59-516242fda696

New value:
During a deployment of latest Kubernetes on baremetal SQA ran into an issue where all 3 etcd units came up and were marked as active/idle, but the syslog shows etcd rejecting every request that came in, including those from 127.0.0.1.

Here is a snip from the end of the log on etcd_2
Feb 10 22:25:34 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.168.186:53202" (error "EOF", ServerName "")
Feb 10 22:26:07 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.169.4:56646" (error "EOF", ServerName "")
Feb 10 22:26:43 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.169.209:45360" (error "EOF", ServerName "")
Feb 10 22:27:28 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.168.187:60292" (error "EOF", ServerName "")
Feb 10 22:27:29 juju-da6364-5-lxd-0 systemd[1]: Started snap.etcd.etcdctl.42388d27-d989-4115-8dfd-291f30bb6b6b.scope.
Feb 10 22:27:29 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "127.0.0.1:52626" (error "tls: first record does not look like a TLS handshake", ServerName "")

As a result 2 of the 3 vault units in the deployment were unable to connect to etcd and start the vault service, blocking the deployment.

The etcd units are running etcd 3.4/stable

I've attached the crashdump, but the testrun can be found at:
https://solutions.qa.canonical.com/testruns/testRun/fb27ca53-2c5c-4ffe-9e59-516242fda696

All instances of this bug SQA hits can be found at:
https://solutions.qa.canonical.com/bugs/bugs/bug/1960656

2022-06-30 13:25:58 Milos bug added subscriber Milos