Kubernetes master API services constantly restarting on update-status with restart_apiserver_for_encryption_key

Bug #1949807 reported by Diko Parvanov
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| Kubernetes Control Plane Charm | Fix Released | Medium | Adam Dyess | |
| Vault KV Charm Layer | Fix Released | Medium | Adam Dyess | |

Bug Description

kubernetes-master charm 1034
1.21/stable (1.21.6)

Non-leader units constantly restarting snap.kube-apiserver.daemon

Symptoms:

Seeing this appear in the logs every 5 minutes during the update-status hooks on the non-leader units.

```
tracer: set flag layer.vault-kv.app-kv.changed.encryption_key
tracer: ++ queue handler reactive/kubernetes_master.py:3115:restart_apiserver_for_encryption_key
```

Adam Dyess (addyess) wrote :

Looking at layer-vault-kv [1], it seems that when `update-status` hooks run, every unit contacts Vault and collects the app-level key-value data. Each unit also reads its own stored hashes of that data to determine whether a value has already been consumed locally by the charm. Those app-level kv hashes are only updated by the leader [2], not by every unit.

All units should read the same k-v store [3] on every hook, as well as their own unit's hash.

I believe it's possible that the non-leaders are reading a different hash value than the hash generated from the value in the app storage, which causes them to believe there is a change.

This layer tries to keep up with a shared k-v store and hashes the values to see whether they have "changed". When a non-leader wakes, it reads the KV store, sees that the `encryption_key` hash doesn't match the previously stored hash, and declares it "changed".
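The change detection described above can be modelled in a few lines. This is a simplified sketch, not the layer's actual code: `value_hash`, `changed_keys`, the in-memory dictionaries, and the example secret are all hypothetical stand-ins, and it assumes that a unit with no stored hash treats every key as changed.

```python
import hashlib
import json


def value_hash(value):
    """Return a stable hash of a JSON-serializable value (illustrative only)."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def changed_keys(app_kv, stored_hashes):
    """Return the keys whose current hash differs from the unit's stored hash."""
    return sorted(
        key for key, value in app_kv.items()
        if stored_hashes.get(key) != value_hash(value)
    )


# Hypothetical in-memory stand-in for the shared app data in Vault.
app_kv = {"encryption_key": "example-secret"}

# Per-unit hash storage: only the leader (assumed here to be unit 7) wrote one.
app_hashes = {
    7: {"encryption_key": value_hash("example-secret")},
    8: {},  # non-leader: no hash was ever written
    9: {},  # non-leader: no hash was ever written
}

for unit in (7, 8, 9):
    print(unit, changed_keys(app_kv, app_hashes[unit]))
```

Under these assumptions, unit 7 reports no changed keys, while units 8 and 9 report `encryption_key` as changed on every run, which would queue the restart handler each time.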

Can the kv-store in Vault be interrogated?
Yes:

```
vault kv get charm-kubernetes-master/kv/app
vault kv get charm-kubernetes-master/kv/app-hashes/7
vault kv get charm-kubernetes-master/kv/app-hashes/8
vault kv get charm-kubernetes-master/kv/app-hashes/9
```

Result:
```
ubuntu@juju-f7d8a7-lma-4:~$ for i in app app-hashes/{7,8,9}; do vault kv get -address http://172.16.100.4:8200 charm-kubernetes-master/kv/${i}; done
========= Data =========
Key Value
--- -----
encryption_key s[...]j
========= Data =========
Key Value
--- -----
encryption_key c[...]1
No value found at charm-kubernetes-master/kv/app-hashes/8
No value found at charm-kubernetes-master/kv/app-hashes/9
```

As somewhat expected, the non-leaders don't write their hashes, so they always appear "changed".

Workaround:
```
HASH=$(vault kv get -field=encryption_key charm-kubernetes-master/kv/app-hashes/7)
vault kv put charm-kubernetes-master/kv/app-hashes/8 encryption_key=$HASH
vault kv put charm-kubernetes-master/kv/app-hashes/9 encryption_key=$HASH
```
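For deployments with different unit numbers, the same commands can be generated rather than typed by hand. The helper below is a hypothetical convenience, not part of the charm or layer. It assumes the `charm-<application>/kv/app-hashes/<unit>` path layout seen in this report and that the caller knows which unit holds a valid hash (unit 7 here). It only builds the command strings and does not contact Vault.

```python
def workaround_commands(app, leader_unit, follower_units, key="encryption_key"):
    """Build vault CLI commands that copy the leader's stored hash to other units.

    The path layout is inferred from the single example in this report.
    """
    base = f"charm-{app}/kv/app-hashes"
    commands = [f"HASH=$(vault kv get -field={key} {base}/{leader_unit})"]
    commands += [
        f"vault kv put {base}/{unit} {key}=$HASH" for unit in follower_units
    ]
    return commands


# Reproduces the three commands above for units 7 (source), 8 and 9 (targets).
for line in workaround_commands("kubernetes-master", 7, [8, 9]):
    print(line)
```

Review the printed commands before running them in a shell that has access to Vault.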

[1] https://github.com/juju-solutions/layer-vault-kv/blob/e22c18b133070ce354cebbda864a5aa8a4b60398/lib/charms/layer/vault_kv.py#L101
[2] https://github.com/juju-solutions/layer-vault-kv/blob/e22c18b133070ce354cebbda864a5aa8a4b60398/reactive/vault_kv.py#L55
[3] https://github.com/juju-solutions/layer-vault-kv/blob/e22c18b133070ce354cebbda864a5aa8a4b60398/reactive/vault_kv.py#L49

Adam Dyess (addyess)
description: updated
Przemyslaw Lal (przemeklal) wrote :

The workaround described in [0] fixed the issue, and the affected kubernetes-master units stopped restarting.

[0] https://bugs.launchpad.net/charm-layer-vault-kv/+bug/1949807/comments/1

George Kraft (cynerva) wrote :
Changed in charm-layer-vault-kv:
milestone: none → 1.22+ck3
Changed in charm-kubernetes-master:
milestone: none → 1.22+ck3
importance: Undecided → Medium
Changed in charm-layer-vault-kv:
importance: Undecided → Medium
Changed in charm-kubernetes-master:
status: New → Fix Committed
Changed in charm-layer-vault-kv:
status: New → Fix Committed
tags: added: backport-needed
Changed in charm-kubernetes-master:
assignee: nobody → Adam Dyess (addyess)
Changed in charm-layer-vault-kv:
assignee: nobody → Adam Dyess (addyess)
George Kraft (cynerva) wrote :
Changed in charm-kubernetes-master:
milestone: 1.22+ck3 → 1.23
Changed in charm-layer-vault-kv:
milestone: 1.22+ck3 → 1.23
tags: removed: backport-needed
George Kraft (cynerva)
Changed in charm-kubernetes-master:
status: Fix Committed → Fix Released
Changed in charm-layer-vault-kv:
status: Fix Committed → Fix Released