clock updated concurrently on a non-ha controller

Bug #1900724 reported by Laurent Sesquès
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Joseph Phillips
Milestone: 2.8.7

Bug Description

This is on a non-ha controller, recently upgraded from 2.7.6 to 2.8.3.

2020-10-12 22:19:05 ERROR juju.core.raftlease store.go:277 command Command(ver: 1, op: setTime, new time: 0002-10-02 08:39:05.716668447 +0000 UTC): clock was updated concurrently, retry
2020-10-13 04:11:06 ERROR juju.core.raftlease store.go:277 command Command(ver: 1, op: setTime, new time: 0002-10-02 14:22:47.694097189 +0000 UTC): clock was updated concurrently, retry

[...]

2020-10-20 14:39:57 ERROR juju.core.raftlease store.go:277 command Command(ver: 1, op: setTime, new time: 0001-01-01 00:00:01.015763373 +0000 UTC): clock was updated concurrently, retry

I think that this should not be possible in a non-ha controller.

description: updated
Pen Gale (pengale)
Changed in juju:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.8.7
Revision history for this message
Pen Gale (pengale) wrote :

Triaging and adding to milestone for further investigation.

It's possible that we are firing off a worker twice, which might be responsible for other raft oddities. This is definitely worth digging into!

Revision history for this message
Joseph Phillips (manadart) wrote :

Got a reproducer upgrading from 2.8 HEAD to develop HEAD.

Changed in juju:
status: Triaged → In Progress
Changed in juju:
status: In Progress → Incomplete
Revision history for this message
Joseph Phillips (manadart) wrote :

This should be a transient error.

When the global clock updater advances time via the Raft store, it sends what it thinks is the old global time in the command.

If this time does not match the Raft FSM last known time, the concurrent update error is returned. The mechanism protects against concurrent updates, which is reflected in the error message, but that doesn't *quite* match what is happening here.

It actually results in the updater syncing its last known time with the FSM and retrying with a back-off.

From what I can tell, this is possible after restarting controller machine agents and restoring the Raft FSM from a snapshot, in which case the updater holds a different last known time than the FSM.
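
To illustrate the mechanism, here is a minimal Go sketch of the updater-side loop, assuming hypothetical store/updater types; it is not the actual juju/juju raftlease code:

package clockupdater

import (
	"errors"
	"fmt"
	"time"
)

// errConcurrentUpdate mirrors the "clock was updated concurrently, retry"
// condition from the logs above.
var errConcurrentUpdate = errors.New("clock was updated concurrently, retry")

// store abstracts the Raft lease store (hypothetical interface): Advance
// applies a setTime command that only succeeds when oldTime matches the
// FSM's last known global time.
type store interface {
	Advance(oldTime, newTime time.Time) error
	GlobalTime() time.Time
}

// updater holds the global clock updater's last known global time.
type updater struct {
	store    store
	lastTime time.Time
}

// advance tries to push the global clock forward by d. On a concurrent
// update error it re-syncs its last known time from the FSM and retries
// with an increasing back-off, which is the transient behaviour described.
func (u *updater) advance(d time.Duration, maxRetries int) error {
	backoff := 50 * time.Millisecond
	for i := 0; i < maxRetries; i++ {
		newTime := u.lastTime.Add(d)
		if err := u.store.Advance(u.lastTime, newTime); err != nil {
			if !errors.Is(err, errConcurrentUpdate) {
				return err
			}
			u.lastTime = u.store.GlobalTime() // sync with the FSM
			time.Sleep(backoff)
			backoff *= 2
			continue
		}
		u.lastTime = newTime
		return nil
	}
	return fmt.Errorf("global clock not advanced after %d attempts", maxRetries)
}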

Changed in juju:
status: Incomplete → In Progress
Revision history for this message
Joseph Phillips (manadart) wrote :

This is a pertinent log capture following a restart.
https://pastebin.canonical.com/p/DjnWdt5rsW/

I can see some issues here.

1. The lease manager makes claims against what appears to be an empty FSM *before* it has completed restoring from its snapshot; you can see the "no leasing found, claiming.." entries.

2. The initial clock advancement, made by a worker instantiated against an empty FSM, uses an old time of 0001-01-01 00:00:00 +0000 UTC, which is incorrect following the FSM restoration mentioned above (see the sketch below).

3. Raft leader election occurs, the relevant workers bounce, and lease claim time-outs follow.

The logs show convergence and success following this jitter, but we don't really want large models spammed with time-out errors every time there is a Raft leader election.
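
To make point 2 concrete, here is a small self-contained Go sketch, with a hypothetical command and FSM shape rather than the actual raftlease implementation, showing how a restored snapshot plus a zero-time updater yields the error:

package main

import (
	"fmt"
	"time"
)

// setTimeCommand carries the sender's believed old time and the new time
// (hypothetical shape for illustration).
type setTimeCommand struct {
	OldTime, NewTime time.Time
}

// fsm holds the replicated global clock.
type fsm struct {
	globalTime time.Time
}

// apply enforces the compare-and-set: the command only lands when the
// sender's old time matches the FSM's last known time.
func (f *fsm) apply(c setTimeCommand) error {
	if !c.OldTime.Equal(f.globalTime) {
		return fmt.Errorf("clock was updated concurrently, retry")
	}
	f.globalTime = c.NewTime
	return nil
}

func main() {
	// Simulate a restart: the FSM is restored from a snapshot with a
	// non-zero global time, while the freshly started updater was built
	// against an empty FSM and still holds the zero time.
	restored := &fsm{globalTime: time.Date(2, time.October, 2, 8, 39, 5, 0, time.UTC)}
	zero := time.Time{} // 0001-01-01 00:00:00 +0000 UTC

	err := restored.apply(setTimeCommand{OldTime: zero, NewTime: zero.Add(time.Second)})
	fmt.Println(err) // clock was updated concurrently, retry
}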

Revision history for this message
Joseph Phillips (manadart) wrote :

I've proposed https://github.com/juju/juju/pull/12248 in light of this ticket.

It doesn't change the fundamentals, but I think it uses better nomenclature.

Ian Booth (wallyworld)
Changed in juju:
assignee: nobody → Joseph Phillips (manadart)
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released