Comment 1 for bug 1903077

George Kraft (cynerva) wrote :

Thanks for the report and reproduction steps. I can reproduce this, although it appears to be a race condition, so it may not reproduce every time.

In my case, on kubernetes-master, both kube-controller-manager and kube-scheduler were failing to reach kube-apiserver with "x509: certificate signed by unknown authority". This happened because build_kubeconfig[1] ran before store_ca[2] and ca_written[3]. The charm did detect the CA change and restart the services, but it restarted them with kubeconfigs that had been rendered with the old CA. On the next hook, build_kubeconfig ran again and rendered new kubeconfigs with the correct CA, but nothing restarted the services, so they kept using the stale ones.
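
To make the ordering problem concrete, here is a simplified Python model of the two hooks. It is only a sketch and not the charm's code: the Node class, its fields, and run_hook are hypothetical stand-ins, and the three handler functions only borrow the names of the real handlers to mimic what each one does.

```python
# Simplified model of the ordering problem; NOT the charm's actual code.
# Node, its fields, and run_hook are hypothetical and exist only for illustration.
from dataclasses import dataclass


@dataclass
class Node:
    relation_ca: str          # CA currently offered to the unit
    ca_on_disk: str           # CA last written to disk
    kubeconfig_ca: str        # CA embedded in the rendered kubeconfig
    service_ca: str           # CA the running service loaded at its last restart
    ca_written: bool = False  # stand-in for the tls_client.ca.written flag


def build_kubeconfig(node):
    # Renders the kubeconfig from whatever CA is on disk right now.
    node.kubeconfig_ca = node.ca_on_disk


def store_ca(node):
    # Writes the new CA to disk and raises the flag if the CA changed.
    if node.ca_on_disk != node.relation_ca:
        node.ca_on_disk = node.relation_ca
        node.ca_written = True


def ca_written(node):
    # Reacts to the flag by restarting services, which reload the kubeconfig.
    if node.ca_written:
        node.service_ca = node.kubeconfig_ca
        node.ca_written = False


def run_hook(node, handlers):
    # Runs the handlers once, in the given order.
    for handler in handlers:
        handler(node)


node = Node(relation_ca="new-ca", ca_on_disk="old-ca",
            kubeconfig_ca="old-ca", service_ca="old-ca")

# First hook: the unlucky ordering, with build_kubeconfig running first.
run_hook(node, [build_kubeconfig, store_ca, ca_written])
after_first = (node.kubeconfig_ca, node.service_ca)

# Next hook: the kubeconfig is corrected, but nothing triggers a restart.
run_hook(node, [build_kubeconfig, store_ca, ca_written])
after_second = (node.kubeconfig_ca, node.service_ca)

print(after_first)   # ('old-ca', 'old-ca')
print(after_second)  # ('new-ca', 'old-ca')
```

After the second hook the kubeconfig carries the new CA while the modelled service still holds the old one, which matches the x509 errors seen above.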

To fix this, the charm's handling of the tls_client.ca.written flag will need to be adjusted to ensure new kubeconfigs are rendered before restarting the services.
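
One possible shape for that adjustment is sketched below in plain Python. This is an assumption about how the fix could look, not the actual patch: on_ca_written and restart_control_plane_services are hypothetical names, the flag set is a stand-in for the charm's real flag handling, and build_kubeconfig here only records that it was called.

```python
# Hypothetical sketch of the adjusted handling; NOT the actual charm patch.
# Only the names build_kubeconfig and tls_client.ca.written come from the charm;
# everything else is illustrative.
calls = []


def build_kubeconfig():
    # Stand-in: the real handler renders kubeconfigs from the CA on disk.
    calls.append("build_kubeconfig")


def restart_control_plane_services():
    # Stand-in: the real charm would restart kube-controller-manager,
    # kube-scheduler, and any other service that reads the kubeconfigs.
    calls.append("restart")


def on_ca_written(flags):
    """Handle a changed CA: render first, then restart, then clear the flag."""
    if "tls_client.ca.written" not in flags:
        return
    build_kubeconfig()                 # kubeconfigs now embed the new CA
    restart_control_plane_services()   # services load the fresh kubeconfigs
    flags.discard("tls_client.ca.written")


flags = {"tls_client.ca.written"}
on_ca_written(flags)
print(calls)  # ['build_kubeconfig', 'restart']
```

The point of the sketch is that rendering and restarting happen inside the same handler, in a fixed order, so the result no longer depends on which handler the framework happens to run first.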

[1]: https://github.com/charmed-kubernetes/charm-kubernetes-master/blob/1467e9ba8332c2959dd8f908aa29cee18f90e540/reactive/kubernetes_master.py#L1912
[2]: https://github.com/charmed-kubernetes/layer-tls-client/blob/9bfaafcd15ecdbfb435fd35c28057372f7d62e2b/reactive/tls_client.py#L19
[3]: https://github.com/charmed-kubernetes/charm-kubernetes-master/blob/1467e9ba8332c2959dd8f908aa29cee18f90e540/reactive/kubernetes_master.py#L1159