As the code suggests, this is case 1 and it seems that there were two concurrent bootstrap attempts.
elif lead_cluster_state_uuid != cluster_state_uuid:
# this may mean 2 things:
# 1) the units have diverged, which it's bad and we do stop.
# 2) cluster_state_uuid could not be retrieved because it
# hasn't been bootstrapped, mysqld is stopped, etc.
The question is how is that even possible with a single leader?
unit-mysql-0 - never elected as a leader
unit-mysql-1 - elected as a once leader at 2017-11-14 17:27:07, never resigned
unit-mysql-2 - first leader ever for this app, elected as a leader once at 2017-11-14 17:17:00, resigned at 2017-11-14 17:21:33
unit-mysql-2.log
# there is no leader-elected hook implemented in percona-cluster so this is fine
post-install queued leader-elected event
2017-11-14 17:16:33 INFO juju.worker.uniter resolver.go:104 found queued "leader-elected" hook
2017-11-14 17:21:27 DEBUG juju.worker.leadership tracker.go:230 notifying mysql/2 ticket of impending loss of mysql leadership
2017-11-14 17:21:27 DEBUG juju.worker.leadership tracker.go:215 mysql/2 waiting for mysql leadership release
2017-11-14 17:21:27 DEBUG juju.worker.leadership tracker.go:269 mysql/2 is not mysql leader
- 2017-11-14 17:27:03 mysql/2 -> leader-settings-changed execution began by agent
- 2017-11-14 17:27:06 mysql/2 -> unit agent reported that this is not a leader:
DEBUG juju.worker.leadership tracker.go:269 mysql/2 is not mysql leader
I can confirm that alitvinov and I have seen exactly that behavior a few weeks ago on stable/17.08.
Leader UUID ('0dde54c4- c961-11e7- 9775-2f72db67cb da') != Unit UUID ('b3e1b7e8- c95f-11e7- 91fa-56c8581eb6 16')
In our case all leader-get invocations returned the same result (so the Juju leader bucket is consistent across units).
http:// paste.ubuntu. com/25962876/
As the code suggests, this is case 1 and it seems that there were two concurrent bootstrap attempts.
elif lead_cluster_ state_uuid != cluster_state_uuid:
# this may mean 2 things:
# 1) the units have diverged, which it's bad and we do stop.
# 2) cluster_state_uuid could not be retrieved because it
# hasn't been bootstrapped, mysqld is stopped, etc.
The question is how is that even possible with a single leader?
render_ config_ restart_ on_changed -> bootstrap_pxc
https:/ /github. com/openstack/ charm-percona- cluster/ blob/stable/ 17.08/hooks/ percona_ hooks.py# L323-L327
render_ config_ restart_ on_changed( clustered, hosts,
bootstrap =not bootstrapped)
if is_leader():
log("Leader unit - bootstrap required=%s" % (not bootstrapped),
DEBUG)
https:/ /github. com/openstack/ charm-percona- cluster/ blob/stable/ 17.08/hooks/ percona_ hooks.py# L221-L224
bootstrap_ pxc()
notify_ bootstrapped( )
# NOTE(dosaboy): this will not actually do anything if no cluster
# relation id exists yet.
If both units have passed is_leader gate than they were both leaders at the same time.
===
Each time a new leader is elected there will be a resignation operation by the operation resolver in uniter:
https:/ /github. com/juju/ juju/blob/ juju-2. 2.6/worker/ uniter/ leadership/ resolver. go#L56- L60
Normally, a current leader will keep its lease (lease renewal).
https:/ /github. com/juju/ juju/blob/ juju-2. 2.6/worker/ leadership/ tracker. go#L249
===
Looking at the logs:
unit-mysql-0 - never elected as a leader
unit-mysql-1 - elected as a once leader at 2017-11-14 17:27:07, never resigned
unit-mysql-2 - first leader ever for this app, elected as a leader once at 2017-11-14 17:17:00, resigned at 2017-11-14 17:21:33
unit-mysql-2.log
# there is no leader-elected hook implemented in percona-cluster so this is fine
post-install queued leader-elected event
2017-11-14 17:16:33 INFO juju.worker.uniter resolver.go:104 found queued "leader-elected" hook
2017-11-14 17:16:33 DEBUG juju.worker.uniter agent.go:17 [AGENT-STATUS] executing: running leader-elected hook
2017-11-14 17:16:35 INFO juju-log Unknown hook leader-elected - skipping.
2017-11-14 17:17:00 INFO juju.worker. uniter. operation runhook.go:113 ran "leader-elected" hook uniter. operation executor.go:100 committing operation "run leader-elected hook"
2017-11-14 17:17:00 DEBUG juju.worker.
...
2017-11-14 17:17:00 INFO juju.worker. uniter. operation runhook.go:113 ran "leader-elected" hook uniter. operation executor.go:100 committing operation "run leader-elected hook"
2017-11-14 17:17:00 DEBUG juju.worker.
...
2017-11-14 17:17:28 DEBUG juju-log Leader unit - bootstrap required=True
2017-11-14 17:17:32 DEBUG config-changed Unknown operation bootstrap-pxc.
2017-11-14 17:17:43 DEBUG juju-log Bootstrap PXC Succeeded
...
2017-11-14 17:21:27 DEBUG juju.worker. leadership tracker.go:230 notifying mysql/2 ticket of impending loss of mysql leadership leadership tracker.go:215 mysql/2 waiting for mysql leadership release leadership tracker.go:269 mysql/2 is not mysql leader
2017-11-14 17:21:27 DEBUG juju.worker.
2017-11-14 17:21:27 DEBUG juju.worker.
2017-11-14 17:21:33 DEBUG juju.worker. uniter. operation executor.go:100 committing operation "resign leadership" uniter. operation executor.go:69 running operation run leader- settings- changed hook
2017-11-14 17:21:33 DEBUG juju.worker.
unit-mysql-1.log:
2017-11-14 17:21:24 INFO juju.worker. leadership tracker.go:185 mysql/1 promoted to leadership of mysql
2017-11-14 17:27:07 INFO juju.worker. uniter. operation runhook.go:113 ran "leader-elected" hook
...
2017-11-14 17:27:09 DEBUG juju-log Leader unit - bootstrap required=True
2017-11-14 17:27:24 DEBUG juju-log Bootstrap PXC Succeeded
...
2017-11-14 17:29:48 DEBUG juju-log cluster:28: Bootstrap PXC Succeeded
2017-11-14 17:29:48 DEBUG juju-log cluster:28: Notifying peers that percona is bootstrapped (uuid=0dde54c4- c961-11e7- 9775-2f72db67cb da)
...
2017-11-14 17:29:32 DEBUG juju-log cluster:28: Leader unit - bootstrap required=True
2017-11-14 17:29:48 DEBUG juju-log cluster:28: Bootstrap PXC Succeeded
2017-11-14 17:30:02 DEBUG juju-log cluster:28: Notifying peers that percona is bootstrapped (uuid=0dde54c4- c961-11e7- 9775-2f72db67cb da)
2017-11-14 17:30:30 DEBUG juju-log cluster:28: Notifying peers that percona is bootstrapped (uuid=0dde54c4- c961-11e7- 9775-2f72db67cb da)
2017-11-14 17:32:13 DEBUG juju-log cluster:28: Notifying peers that percona is bootstrapped (uuid=0dde54c4- c961-11e7- 9775-2f72db67cb da)
====
So there was never a renewal on the unit-mysql-2:
mysql/2: 2017-11-14 17:13:32 DEBUG worker.uniter.jujuc server.go:178 running hook tool "leader-set"
mysql/2: 2017-11-14 17:13:34 DEBUG worker.uniter.jujuc server.go:178 running hook tool "leader-set"
mysql/2: 2017-11-14 17:13:35 DEBUG worker.uniter.jujuc server.go:178 running hook tool "leader-set"
# All leader-set hook tool invocations for the old leader mysql/2 happened after the above
- 2017-11-14 17:17:00 mysql/2 -> leader
- 2017-11-14 17:21:24 mysql/1 -> promoted to leader
- 2017-11-14 17:21:33 mysql/2 -> resigned leadership settings- changed operation started (not execution)
- 2017-11-14 17:21:33 mysql/2 -> leader-
- 2017-11-14 17:27:03 mysql/2 -> leader- settings- changed execution began by agent
- 2017-11-14 17:27:06 mysql/2 -> unit agent reported that this is not a leader: leadership tracker.go:269 mysql/2 is not mysql leader
DEBUG juju.worker.
- 2017-11-14 17:27:07 mysql/1 -> leader-elected
< ?
- 2017-11-14 17:27:11 mysql/2 -> leader- settings- changed hook hook execution ended, operation committed by uniter uniter. operation executor.go:100 committing operation "run leader- settings- changed hook"
DEBUG juju.worker.
# All leader-set hook tool invocations for the new leader mysql/1 happened after the above
mysql/1 2017-11-14 17:29:48 DEBUG worker.uniter.jujuc server.go:178 running hook tool "leader-set"
mysql/1 2017-11-14 17:30:02 DEBUG worker.uniter.jujuc server.go:178 running hook tool "leader-set"
mysql/1 2017-11-14 17:30:31 DEBUG worker.uniter.jujuc server.go:178 running hook tool "leader-set"
mysql/1 2017-11-14 17:32:13 DEBUG worker.uniter.jujuc server.go:178 running hook tool "leader-set"