dns-provider=auto fails to re-deploy core-dns after changing to dns-provider=none
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Kubernetes Control Plane Charm | Triaged | Medium | Unassigned |
Bug Description
We had a deployment where the customer wanted to change to using their own custom cluster dns service, so we performed the following:
juju config kubernetes-master dns-provider=none (was previously auto)
This deleted the core-dns pod that cdk-addons had deployed in the kube-system namespace, a pod the customer had customized.
Since we did not expect the existing core-dns pod to be deleted, DNS went down for the entire cluster. Before re-deploying a new core-dns pod manually, we tried to get the charm to roll back the change by running:
juju config kubernetes-master dns-provider=auto
However, this did not trigger re-deployment of core-dns, though that was the dns-provider configuration found in /var/snap/
It would be expected that re-adding dns-provider=auto would re-deploy the pod if it doesn't exist in the namespace.
Changed in charm-kubernetes-master:
importance: Undecided → Medium
status: Confirmed → Triaged
Hey Drew, thanks for the report.
What prevents this is kubernetes-master's get_dns_provider function[1] that takes dns-provider=auto to mean "keep using the existing provider," even if it was 'none'. When you go from 'none' to 'auto', it will keep using 'none'. It's by design, but it's not an intuitive design, and could use improvement.
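To make that behaviour concrete, here is a minimal sketch of the logic as described above. It is not the charm's actual code: the name `resolve_dns_provider`, the `remembered` argument, and the fallback to 'core-dns' on a fresh deployment are all assumptions made for illustration.

```python
def resolve_dns_provider(configured, remembered):
    """Hypothetical model of the behaviour described in this comment.

    configured: value of the dns-provider charm config option.
    remembered: provider chosen on the previous run (None on first run).
    """
    if configured == "auto":
        # 'auto' means "keep using the existing provider", even if that
        # provider is 'none'. The 'core-dns' default for a fresh
        # deployment is an assumption of this sketch.
        return remembered or "core-dns"
    # Any explicit value simply replaces the remembered provider.
    return configured


# Replay the sequence from this bug: auto, then none, then back to auto.
remembered = None
history = []
for configured in ["auto", "none", "auto"]:
    remembered = resolve_dns_provider(configured, remembered)
    history.append(remembered)

print(history)  # ['core-dns', 'none', 'none']
```

In this model, the final 'auto' resolves to 'none' because that was the last provider in use, which matches the missing re-deployment reported here.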
> However, this did not trigger re-deployment of core-dns, though that was the dns-provider configuration found in /var/snap/cdk-addons/current/dns-provider file. I think the apply function is not run upon changing of this variable, but maybe only during deployment and charm/k8s version upgrade.
The kubernetes-master leader unit reconfigures cdk-addons and runs cdk-addons.apply on every hook: config-changed, update-status, etc. I don't think that's likely to be an issue here.
I am confused how you could have a value of 'core-dns' in /var/snap/cdk-addons/current/dns-provider after changing dns-provider from 'none' to 'auto'. Given the get_dns_provider function I described above, it should stay 'none'. Can you confirm that you see that on the kubernetes-master leader unit and not one of the followers?
[1]: https://github.com/charmed-kubernetes/charm-kubernetes-master/blob/daa8e11d70adb92e46853229ed88c57f4eeafc21/reactive/kubernetes_master.py#L2618-L2641