Comment 1 for bug 1916927

George Kraft (cynerva) wrote :

Thanks for the report. Slightly different symptom this time, as the "Reaching out to nginx.netpolicy with restrictions" message is only logged once:

2021-02-25-07:06:50 root DEBUG Reaching out to nginx.netpolicy with no restrictions
2021-02-25-07:07:28 root DEBUG Reaching out to nginx.netpolicy with no restrictions
2021-02-25-07:07:41 root DEBUG Reaching out to nginx.netpolicy with restrictions
2021-02-25-07:36:00 root ERROR [localhost] Command failed: ...

Looks like the test is hanging here: https://github.com/charmed-kubernetes/jenkins/blob/9f180e2be0d209a6b82be93bda8f9623cd133bf8/jobs/integration/validation.py#L543-L552

At 07:09:03, kubernetes-master/2 acquires a machine lock for action 108 and never releases it:

2021-02-25 07:09:03 DEBUG juju.machinelock machinelock.go:172 machine lock acquired for kubernetes-master/2 uniter (run action 108)
2021-02-25 07:09:03 DEBUG juju.worker.uniter.operation executor.go:132 preparing operation "run action 108" for kubernetes-master/2
2021-02-25 07:09:03 DEBUG juju.worker.uniter.operation executor.go:132 executing operation "run action 108" for kubernetes-master/2
2021-02-25 07:09:03 DEBUG juju.worker.uniter.runner runner.go:288 juju-run action is running

That action is definitely holding things up. Unfortunately, I'm not able to find which action that is or what command it ran. That info doesn't appear to be collected in the crashdump.
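
For anyone checking other crashdumps for the same pattern, here is a rough sketch of a script that scans a unit log for machine locks that were acquired but never released. The "acquired" wording is copied from the log lines above. The "released" wording is an assumption on my part about what the matching line looks like, and find_unreleased_locks is a made-up helper name, not something from Juju or the test suite.

```python
import re

# The "acquired" wording comes from the log excerpt in this comment.
# The "released" wording is an ASSUMPTION about the matching release line.
LOCK_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*"
    r"machine lock (?P<event>acquired|released) for (?P<holder>.+)$"
)


def find_unreleased_locks(lines):
    """Return {holder: acquire timestamp} for locks with no later release."""
    held = {}
    for line in lines:
        match = LOCK_RE.match(line.strip())
        if not match:
            continue  # not a machine lock line
        holder = match.group("holder")
        if match.group("event") == "acquired":
            held[holder] = match.group("ts")
        else:
            # Forget the lock once its release is seen.
            held.pop(holder, None)
    return held


if __name__ == "__main__":
    import sys

    # Usage: python find_locks.py <path to a unit log from the crashdump>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for holder, ts in find_unreleased_locks(log).items():
            print(f"{ts} still held: {holder}")
```

This only shows which lock is stuck, not which action or command is behind it, so it doesn't solve the missing-info problem above.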