worker nodes outside of openstack cannot join cluster when Octavia is used as LB for k8s-master
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Openstack Integrator Charm | Triaged | Medium | Unassigned |
Bug Description
Kubernetes 1.17.5
Openstack Train (Stein for Octavia)
When using this bundle to deploy Openstack with Octavia:
https:/
Bare-metal worker nodes do not join the cluster permanently: they join for a moment, then disappear and never come back.
The LB IP given to the masters by Octavia is 172.16.7.191, and it is reachable from the bare-metal worker.
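As a quick check from the bare-metal worker (a minimal sketch; port 443 on the VIP is an assumption, the real port is whatever the server: line in /root/.kube/config shows):
---
# Confirm the Octavia VIP answers on the API port from the bare-metal worker.
# 443 is assumed here; substitute the port from the kubeconfig's server: URL.
curl -k https://172.16.7.191:443/version
---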
Openstack kubernetes bundle - https:/
k8s-worker bare-metal bundle - https:/
Juju status of kubernetes model on openstack controller:
---
Model Controller Cloud/Region Version SLA Timestamp
kubernetes openstack-regionone openstack/RegionOne 2.7.6 unsupported 16:39:35-04:00
App Version Status Scale Charm Store Rev OS Notes
ceph-proxy active 1 ceph-proxy jujucharms 29 ubuntu
containerd active 5 containerd jujucharms 61 ubuntu
easyrsa 3.0.1 active 1 easyrsa jujucharms 296 ubuntu
etcd 3.3.15 active 1 etcd jujucharms 496 ubuntu
flannel 0.11.0 active 5 flannel jujucharms 468 ubuntu
kubernetes-master 1.17.5 active 3 kubernetes-master jujucharms 808 ubuntu exposed
kubernetes-
openstack-
Unit Workload Agent Machine Public address Ports Message
ceph-proxy/0* active idle 0 172.16.7.184 Ready to proxy settings
easyrsa/0* active idle 1 172.16.7.193 Certificate Authority connected.
etcd/0* active idle 2 172.16.7.179 2379/tcp Healthy with 1 known peer
kubernetes-
containerd/0* active idle 172.16.7.203 Container runtime available
flannel/0* active idle 172.16.7.203 Flannel subnet 10.1.89.1/24
kubernetes-master/1 active idle 4 172.16.7.185 6443/tcp Kubernetes master running.
containerd/4 active idle 172.16.7.185 Container runtime available
flannel/4 active idle 172.16.7.185 Flannel subnet 10.1.67.1/24
kubernetes-master/2 active idle 5 172.16.7.187 6443/tcp Kubernetes master running.
containerd/3 active idle 172.16.7.187 Container runtime available
flannel/3 active idle 172.16.7.187 Flannel subnet 10.1.39.1/24
kubernetes-
containerd/1 active idle 172.16.7.192 Container runtime available
flannel/1 active idle 172.16.7.192 Flannel subnet 10.1.38.1/24
kubernetes-
containerd/2 active idle 172.16.7.201 Container runtime available
flannel/2 active idle 172.16.7.201 Flannel subnet 10.1.36.1/24
openstack-
Machine State DNS Inst id Series AZ Message
0 started 172.16.7.184 00028a1f-
1 started 172.16.7.193 aaa825e1-
2 started 172.16.7.179 4de47d67-
3 started 172.16.7.203 20d36010-
4 started 172.16.7.185 9bd706de-
5 started 172.16.7.187 880cb07a-
6 started 172.16.7.192 f4bfbd36-
7 started 172.16.7.201 cb82027c-
8 started 172.16.7.194 507048d1-
Offer Application Charm Rev Connected Endpoint Interface Role
easyrsa easyrsa easyrsa 296 1/1 client tls-certificates provider
etcd etcd etcd 496 1/1 db etcd provider
kubernetes-
kubernetes-
Relation provider Requirer Interface Type Message
ceph-proxy:client kubernetes-
easyrsa:client etcd:certificates tls-certificates regular
easyrsa:client kubernetes-
easyrsa:client kubernetes-
etcd:cluster etcd:cluster etcd peer
etcd:db flannel:etcd etcd regular
etcd:db kubernetes-
kubernetes-
kubernetes-
kubernetes-
kubernetes-
kubernetes-
kubernetes-
kubernetes-
kubernetes-
kubernetes-
openstack-
openstack-
openstack-
---
juju status of bare-metal worker model on MAAS controller:
---
Model Controller Cloud/Region Version SLA Timestamp
k8s-worker jhillman-maas jhillman-maas 2.7.6 unsupported 16:41:01-04:00
SAAS Status Store URL
easyrsa active openstack-regionone admin/kubernete
etcd active openstack-regionone admin/kubernete
kubernetes-
kubernetes-
App Version Status Scale Charm Store Rev OS Notes
containerd active 1 containerd jujucharms 61 ubuntu
flannel 0.11.0 active 1 flannel jujucharms 468 ubuntu
kubernetes-
Unit Workload Agent Machine Public address Ports Message
kubernetes-
containerd/0* active idle 172.16.7.92 Container runtime available
flannel/0* active idle 172.16.7.92 Flannel subnet 10.1.92.1/24
Machine State DNS Inst id Series AZ Message
0 started 172.16.7.92 agrippa bionic default Deployed
Relation provider Requirer Interface Type Message
easyrsa:client kubernetes-
etcd:db flannel:etcd etcd regular
kubernetes-
kubernetes-
kubernetes-
kubernetes-
kubernetes-
---
As can be seen in the bare-metal model, the charm reports the worker as running, but 'kubectl get nodes' only shows the openstack nodes.
---
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
juju-472727-
juju-472727-
---
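One way to catch the transient registration (a minimal sketch, run from anywhere the kubeconfig works) is to watch the node list while the bare-metal worker's kubelet restarts; agrippa should flash in and out of the list:
---
# --watch streams node add/update/delete events as they happen
kubectl get nodes -o wide --watch
---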
The unit log and syslog from the bare-metal worker will be uploaded to the bug.
The syslog shows this message repeating:
---
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
May 11 20:42:36 agrippa kubelet.
---
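The messages above are truncated; the full text can be pulled from the kubelet journal on the worker itself (a sketch, assuming the snap-based kubelet service name used by the kubernetes-worker charm):
---
# On agrippa: show the full kubelet log lines for the window above.
# snap.kubelet.daemon is an assumption based on the charm's snap packaging.
sudo journalctl -u snap.kubelet.daemon --since "20:42" --no-pager | head -n 50
---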
The kubernetes-master can resolve this node name (agrippa); in fact, every host in this environment can resolve both the short name and the FQDN, agrippa.maas.
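To confirm, on a master unit (a minimal check):
---
# Both the short name and the FQDN should resolve to the worker's address
getent hosts agrippa
getent hosts agrippa.maas
---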
A grep for 'agrippa' in the syslog of a kubernetes-master in the kubernetes model on the openstack controller:
---
May 11 20:18:56 juju-472727-
May 11 20:18:56 juju-472727-
May 11 20:18:56 juju-472727-
May 11 20:18:56 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:00 juju-472727-
May 11 20:19:01 juju-472727-
May 11 20:19:01 juju-472727-
May 11 20:19:01 juju-472727-
May 11 20:19:01 juju-472727-
May 11 20:19:01 juju-472727-
May 11 20:19:01 juju-472727-
May 11 20:19:05 juju-472727-
May 11 20:19:05 juju-472727-
May 11 20:19:05 juju-472727-
May 11 20:19:22 juju-472727-
---
The syslog of that kubernetes-master node will be uploaded to the bug.
Using the same openstack deployment, a kubernetes model created with this bundle (which uses kube-api-lb instead of Octavia) does not exhibit the problem: https:/
From a lot of troubleshooting, the only key differences are (see the sketch after this list):
- octavia instead of kube-api-lb (which used address pairs in openstack to allow a floating VIP)
- the relations to openstack-integrator
- the CMR for the api-endpoint going to k8s-master as opposed to kube-api-lb
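For reference, the CMR wiring in the failing case looks roughly like this (a hypothetical sketch; the offer URL and alias are assumptions based on the Offer and SAAS sections of the juju status output above):
---
# On the MAAS controller: consume the offer published by the openstack model
juju consume openstack-regionone:admin/kubernetes.kubernetes-master kubernetes-master
# Point the bare-metal workers' api-endpoint at the masters directly;
# in the working bundle this same relation targets kube-api-lb instead.
juju add-relation kubernetes-worker:kube-api-endpoint kubernetes-master:kube-api-endpoint
---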
I believe the issue lies somehow in how octavia is being configured, or possibly in how the masters are listening. The /root/.kube/config file is identical on the bare-metal workers and the openstack workers, and the bare-metal worker can run kubectl commands successfully with this config.
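Both points are easy to demonstrate on the bare-metal worker (a minimal sketch):
---
# Hash should match the one on the openstack workers
sudo md5sum /root/.kube/config
# API is reachable through the Octavia VIP using that same config
sudo kubectl --kubeconfig=/root/.kube/config get nodes
---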
Changed in charm-openstack-integrator:
importance: High → Medium
Tarball containing the following:
- kubernetes-master-os-syslog: /var/log/syslog from the k8s-master in the kubernetes model on the openstack controller
- unit-kubernetes-master-0.log: /var/log/juju unit log from the k8s-master in the kubernetes model on the openstack controller
- kubernetes-worker-bm-syslog: /var/log/syslog from the k8s-worker in the bare-metal model on MAAS
- unit-kubernetes-worker-bm.log: /var/log/juju unit log from the k8s-worker in the bare-metal model on MAAS