While testing Juju 3.2.2, SQA saw "ERROR connection is shut down" while deploying Kubernetes in this run: https://oil-jenkins.canonical.com/artifacts/2d4ae78c-d4f1-462d-8b47-0c2d6189cc24/index.html
The crashdump can be found here: https://oil-jenkins.canonical.com/artifacts/2d4ae78c-d4f1-462d-8b47-0c2d6189cc24/generated/generated/juju_openstack_controller/juju-crashdump-controller-2023-08-16-10.34.29.tar.gz
Unit          Workload  Agent  Machine  Public address  Ports  Message
controller/0  unknown   lost   0        10.244.33.200          agent lost, see 'juju show-status-log controller/0'
controller/1  unknown   lost   1        10.244.33.187          agent lost, see 'juju show-status-log controller/1'
controller/2  unknown   lost   2        10.244.33.248          agent lost, see 'juju show-status-log controller/2'
From unit-controller-0.log:
2023-08-16 10:14:14 DEBUG unit.controller/0.juju-log server.go:325 Operator Framework 1.5.3 up and running.
2023-08-16 10:14:15 DEBUG juju.worker.uniter.remotestate watcher.go:768 got leadership change for controller/0: leader
2023-08-16 10:14:15 DEBUG juju.worker.dependency engine.go:618 "leadership-tracker" manifold worker stopped: error while controller/0 waiting for controller leadership release: error blocking on leadership release: lease manager stopped
stack trace:
lease manager stopped
github.com/juju/juju/api/agent/leadership.(*client).BlockUntilLeadershipReleased:57: error blocking on leadership release
github.com/juju/juju/worker/leadership.(*Tracker).loop:140: error while controller/0 waiting for controller leadership release
2023-08-16 10:14:15 ERROR juju.worker.dependency engine.go:695 "leadership-tracker" manifold worker returned unexpected error: error while controller/0 waiting for controller leadership release: error blocking on leadership release: lease manager stopped
2023-08-16 10:14:15 DEBUG juju.worker.uniter runlistener.go:130 juju-exec listener stopping
2023-08-16 10:14:15 DEBUG juju.worker.uniter runlistener.go:149 juju-exec listener stopped
2023-08-16 10:14:18 DEBUG juju.worker.dependency engine.go:580 "leadership-tracker" manifold worker started at 2023-08-16 10:14:18.67507489 +0000 UTC
2023-08-16 10:14:20 DEBUG unit.controller/0.juju-log server.go:325 Emitting Juju event update_status.
2023-08-16 10:14:20 DEBUG juju.worker.dependency engine.go:618 "log-sender" manifold worker stopped: sending log message: write tcp 253.248.0.1:53980->253.248.0.1:17070: write: broken pipe
stack trace:
write tcp 253.248.0.1:53980->253.248.0.1:17070: write: broken pipe
github.com/juju/juju/api.(*DeadlineStream).WriteJSON:93:
github.com/juju/juju/api/logsender.(*writer).WriteLog:99: sending log message
github.com/juju/juju/worker/logsender.New.func1:69:
2023-08-16 10:14:20 ERROR juju.worker.dependency engine.go:695 "log-sender" manifold worker returned unexpected error: sending log message: write tcp 253.248.0.1:53980->253.248.0.1:17070: write: broken pipe
2023-08-16 10:14:23 DEBUG juju.worker.dependency engine.go:580 "log-sender" manifold worker started at 2023-08-16 10:14:23.770404603 +0000 UTC
2023-08-16 10:14:29 INFO juju.worker.uniter.operation runhook.go:186 ran "update-status" hook (via hook dispatching script: dispatch)
2023-08-16 10:14:36 DEBUG juju.machinelock machinelock.go:206 created rotating log file "/var/log/juju/machine-lock.log" with max size 10 MB and max backups 5
2023-08-16 10:14:36 DEBUG juju.machinelock machinelock.go:190 machine lock "machine-lock" released for controller/0 uniter (run update-status hook)
2023-08-16 10:14:37 DEBUG juju.worker.uniter.operation executor.go:118 lock released for controller/0
2023-08-16 10:14:37 ERROR juju.worker.uniter agent.go:33 resolver loop error: executing operation "run update-status hook" for controller/0: writing state: connection is shut down
2023-08-16 10:14:37 DEBUG juju.worker.uniter agent.go:22 [AGENT-STATUS] failed: resolver loop error
2023-08-16 10:14:37 ERROR juju.worker.uniter agent.go:36 updating agent status: connection is shut down
2023-08-16 10:14:37 INFO juju.worker.uniter uniter.go:338 unit "controller/0" shutting down: executing operation "run update-status hook" for controller/0: writing state: connection is shut down
2023-08-16 10:14:37 DEBUG juju.worker.uniter.remotestate watcher.go:1092 starting secrets rotation watcher
2023-08-16 10:14:37 INFO juju.worker.logger logger.go:136 logger worker stopped
From unit-controller-1.log:
2023-08-16 10:14:32 INFO juju.worker.logger logger.go:136 logger worker stopped
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "api-config-watcher" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "s3-caller" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "agent" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "log-sender" manifold worker completed successfully
2023-08-16 10:14:32 INFO juju.worker.uniter uniter.go:338 unit "controller/1" shutting down: catacomb 0xc005911680 is dying
2023-08-16 10:14:32 DEBUG juju.worker.uniter runlistener.go:130 juju-exec listener stopping
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "metric-sender" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "charm-dir" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "metric-spool" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "metric-collect" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.uniter runlistener.go:149 juju-exec listener stopped
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "migration-minion" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "leadership-tracker" manifold worker stopped: leadership failure: error making a leadership claim: connection is shut down
stack trace:
github.com/juju/juju/rpc.init:14: connection is shut down
github.com/juju/juju/rpc.(*Conn).Call:178:
github.com/juju/juju/api.(*state).APICall:1256:
github.com/juju/juju/api/agent/leadership.(*client).bulkClaimLeadership:89: error making a leadership claim
github.com/juju/juju/worker/leadership.(*Tracker).refresh:187: leadership failure
github.com/juju/juju/worker/leadership.(*Tracker).loop:147:
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "migration-inactive-flag" manifold worker stopped: connection is shut down
stack trace:
github.com/juju/juju/rpc.init:14: connection is shut down
github.com/juju/juju/rpc.(*Conn).Call:178:
github.com/juju/juju/api.(*state).APICall:1256:
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "meter-status" manifold worker stopped: connection is shut down
stack trace:
github.com/juju/juju/rpc.init:14: connection is shut down
github.com/juju/juju/rpc.(*Conn).Call:178:
github.com/juju/juju/api.(*state).APICall:1256:
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "api-address-updater" manifold worker stopped: connection is shut down
stack trace:
github.com/juju/juju/rpc.init:14: connection is shut down
github.com/juju/juju/rpc.(*Conn).Call:178:
github.com/juju/juju/api.(*state).APICall:1256:
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "secret-drain-worker" manifold worker stopped: connection is shut down
stack trace:
github.com/juju/juju/rpc.init:14: connection is shut down
github.com/juju/juju/rpc.(*Conn).Call:178:
github.com/juju/juju/api.(*state).APICall:1256:
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "logging-config-updater" manifold worker stopped: connection is shut down
stack trace:
github.com/juju/juju/rpc.init:14: connection is shut down
github.com/juju/juju/rpc.(*Conn).Call:178:
github.com/juju/juju/api.(*state).APICall:1256:
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "hook-retry-strategy" manifold worker stopped: connection is shut down
stack trace:
github.com/juju/juju/rpc.init:14: connection is shut down
github.com/juju/juju/rpc.(*Conn).Call:178:
github.com/juju/juju/api.(*state).APICall:1256:
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "migration-fortress" manifold worker completed successfully
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "uniter" manifold worker stopped: connection is shut down
stack trace:
github.com/juju/juju/rpc.init:14: connection is shut down
github.com/juju/juju/rpc.(*Conn).Call:178:
github.com/juju/juju/api.(*state).APICall:1256:
2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "api-caller" manifold worker completed successfully
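For triage, the "manifold worker stopped" lines in the excerpts above can be grouped by their root cause. The following is a minimal sketch, not part of Juju or juju-crashdump: the function name summarize_stopped_workers and the regular expression are assumptions based only on the log format shown in this report, and the sample lines are copied from the excerpts above.

```python
import re
from collections import defaultdict

# Matches dependency-engine lines of the form seen in this report, e.g.:
#   2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "uniter" manifold worker stopped: connection is shut down
# The pattern is an assumption derived from these excerpts only.
STOPPED_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ juju\.worker\.dependency '
    r'\S+ "(?P<worker>[^"]+)" manifold worker stopped: (?P<reason>.*)$'
)


def summarize_stopped_workers(lines):
    """Group stopped manifold workers by the last element of their error chain."""
    by_cause = defaultdict(list)
    for line in lines:
        match = STOPPED_RE.match(line.strip())
        if match is None:
            continue  # stack traces, "completed successfully", other loggers
        # Errors in these logs are chained as "outer: inner: root cause";
        # keep only the final part.
        cause = match.group("reason").rsplit(": ", 1)[-1]
        by_cause[cause].append((match.group("ts"), match.group("worker")))
    return dict(by_cause)


if __name__ == "__main__":
    # Sample lines copied from the unit logs quoted in this report.
    sample = [
        '2023-08-16 10:14:20 DEBUG juju.worker.dependency engine.go:618 "log-sender" manifold worker stopped: sending log message: write tcp 253.248.0.1:53980->253.248.0.1:17070: write: broken pipe',
        '2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "leadership-tracker" manifold worker stopped: leadership failure: error making a leadership claim: connection is shut down',
        '2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:618 "uniter" manifold worker stopped: connection is shut down',
        '2023-08-16 10:14:32 DEBUG juju.worker.dependency engine.go:603 "api-caller" manifold worker completed successfully',
    ]
    for cause, hits in summarize_stopped_workers(sample).items():
        print(cause)
        for ts, worker in hits:
            print(f"  {ts}  {worker}")
```

To run it against a full unit log from the crashdump, pass the open file object (an iterable of lines) to summarize_stopped_workers instead of the sample list.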
We know there is a short outage when the controller enters HA, as the Dqlite node is rebound to a different IP address.
We are investigating mitigations.