Regarding the earlier failures from comment #3 which predate vault's certificate caching, it looks like test_sans failed to get certificate information from kubeapi-load-balancer/0:
Traceback (most recent call last):
  File "/home/ubuntu/k8s-validation/jobs/integration/validation.py", line 1117, in test_sans
    await retry_async_with_timeout(
  File "/home/ubuntu/k8s-validation/jobs/integration/utils.py", line 198, in retry_async_with_timeout
    if await func(*args):
  File "/home/ubuntu/k8s-validation/jobs/integration/validation.py", line 1099, in all_certs_in_place
    certs = await get_server_certs()
  File "/home/ubuntu/k8s-validation/jobs/integration/validation.py", line 1079, in get_server_certs
    action = await juju_run(
  File "/home/ubuntu/k8s-validation/jobs/integration/utils.py", line 578, in juju_run
    raise JujuRunError(unit, cmd, result)
integration.utils.JujuRunError: `openssl s_client -connect 127.0.0.1:443 </dev/null 2>/dev/null | openssl x509 -text` failed on kubeapi-load-balancer/0:
unable to load certificate
140264670332224:error:0909006C:PEM routines:get_name:no start line:../crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE
It looks like this can happen when the 127.0.0.1:443 endpoint is down, i.e. when nginx or kube-apiserver are restarting. Note that this "Expecting: TRUSTED CERTIFICATE" output is specific to focal; on jammy, openssl outputs something different.
I think we need to update test_sans to handle nginx and kube-apiserver restarts gracefully. Maybe just catch this JujuRunError and retry.
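One possible shape for that fix is a small retry wrapper around the cert-fetching coroutine. This is only a sketch: `JujuRunError` is stubbed here as a plain exception (the real one lives in `integration.utils`), and the `get_server_certs` stub and the `retry_on_juju_run_error` helper name are hypothetical, not code from the validation suite.

```python
import asyncio

class JujuRunError(Exception):
    """Stand-in for integration.utils.JujuRunError (hypothetical stub)."""

async def retry_on_juju_run_error(func, retries=5, delay=10):
    """Retry an async callable when it raises JujuRunError, e.g. while
    nginx or kube-apiserver are restarting and 127.0.0.1:443 is down."""
    for attempt in range(retries):
        try:
            return await func()
        except JujuRunError:
            if attempt == retries - 1:
                raise  # still failing after the last attempt
            await asyncio.sleep(delay)

# Hypothetical stand-in for get_server_certs() that fails twice
# (endpoint down) before succeeding, to exercise the retry path.
calls = {"n": 0}

async def get_server_certs():
    calls["n"] += 1
    if calls["n"] < 3:
        raise JujuRunError("unable to load certificate")
    return "CERT"

print(asyncio.run(retry_on_juju_run_error(get_server_certs, delay=0)))
# → CERT
```

In test_sans this would wrap the `get_server_certs()` call inside `all_certs_in_place`, so a transient `JujuRunError` during a service restart is retried instead of failing the test outright.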