Comment 7 for bug 1896542

George Kraft (cynerva) wrote:

Regarding the earlier failures from comment #3, which predate vault's certificate caching: it looks like test_sans failed to get certificate information from kubeapi-load-balancer/0:

Traceback (most recent call last):
  File "/home/ubuntu/k8s-validation/jobs/integration/validation.py", line 1117, in test_sans
    await retry_async_with_timeout(
  File "/home/ubuntu/k8s-validation/jobs/integration/utils.py", line 198, in retry_async_with_timeout
    if await func(*args):
  File "/home/ubuntu/k8s-validation/jobs/integration/validation.py", line 1099, in all_certs_in_place
    certs = await get_server_certs()
  File "/home/ubuntu/k8s-validation/jobs/integration/validation.py", line 1079, in get_server_certs
    action = await juju_run(
  File "/home/ubuntu/k8s-validation/jobs/integration/utils.py", line 578, in juju_run
    raise JujuRunError(unit, cmd, result)
integration.utils.JujuRunError: `openssl s_client -connect 127.0.0.1:443 </dev/null 2>/dev/null | openssl x509 -text` failed on kubeapi-load-balancer/0:

unable to load certificate
140264670332224:error:0909006C:PEM routines:get_name:no start line:../crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE

It looks like this can happen when the 127.0.0.1:443 endpoint is down, i.e. while nginx or kube-apiserver is restarting: openssl s_client produces no certificate output, so the piped openssl x509 has nothing to parse. Note that this "Expecting: TRUSTED CERTIFICATE" output is specific to focal; on jammy, openssl outputs something different for the same failure.
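For what it's worth, the x509 half of that pipeline fails the same way on any empty input, so the error is consistent with s_client simply getting no connection. A quick local repro (no broken endpoint needed; the exact error text varies with the openssl version, as noted above):

```python
import subprocess

# When the endpoint is down, `openssl s_client` writes nothing to stdout,
# so the second half of the pipeline sees empty input. Piping empty input
# straight into `openssl x509` reproduces the failure in isolation.
proc = subprocess.run(
    ["openssl", "x509", "-text"],
    input=b"",
    capture_output=True,
)
print(proc.returncode)       # non-zero: x509 can't load a cert from nothing
print(proc.stderr.decode())  # "unable to load certificate" (wording and the
                             # "Expecting: ..." suffix differ by openssl version)
```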

I think we need to update test_sans to handle nginx and kube-apiserver restarts gracefully. Maybe just catch this JujuRunError inside the retry loop and try again instead of failing the test outright.
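A minimal sketch of that approach. The helper name and signature here are hypothetical (this is not the project's retry_async_with_timeout), and JujuRunError is stubbed in so the sketch is self-contained:

```python
import asyncio


class JujuRunError(Exception):
    """Stand-in for integration.utils.JujuRunError."""


async def get_certs_with_retry(get_server_certs, retries=5, delay=10):
    """Retry cert collection while the endpoint is briefly down.

    A JujuRunError from the openssl pipeline most likely means nginx or
    kube-apiserver is restarting, so sleep and retry instead of letting
    the error fail the test immediately.
    """
    for attempt in range(retries):
        try:
            return await get_server_certs()
        except JujuRunError:
            if attempt == retries - 1:
                raise  # endpoint never came back; surface the real error
            await asyncio.sleep(delay)
```

The same effect could also be had by widening the existing retry_async_with_timeout loop in test_sans to swallow JujuRunError; either way the key point is that a transient connection failure should count as "not ready yet" rather than a hard failure.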