worker/lease manager_expire_test.go:206: ExpireSuite.TestClaim_ExpiryInFuture is flaky
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Canonical Juju |
Fix Released
|
Medium
|
John A Meinel | ||
2.5 |
Fix Released
|
Medium
|
John A Meinel |
Bug Description
Running under:
go test -race -count=200 -failfast -test.timeout=20m -check.v
I got this test to fail:
FAIL: manager_
[LOG] 0:00.000 CRITICAL juju.clock timer 2073-03-03 02:00:00.000000005 -0840 -0840 not ready: 1h0m0s
[LOG] 0:00.000 TRACE lease_test [] next expire decided on 1h0m0s 2073-03-03 02:00:00.000000005 -0840 -0840
[LOG] 0:00.000 TRACE lease_test [] redis/0 asked for lease redis, no lease found, claiming for 1m0s
[LOG] 0:00.001 TRACE lease_test [] redis/0 claiming lease redis for 1m0s successful
[LOG] 0:00.001 TRACE lease_test [] ensuring we wake up before 1m0s at 2073-03-03 01:01:00.000000005 -0840 -0840
[LOG] 0:00.001 TRACE lease_test [] tick at 2073-03-03 01:01:03.000000004 -0840 -0840, running expiry checks
[LOG] 0:00.001 TRACE lease_test [] expire looking for leases to expire
[LOG] 0:00.001 WARNING lease_test [] reporting error to main loop: Store method called after test complete: Refresh []
[LOG] 0:00.001 TRACE lease_test [] exiting main loop with error: Store method called after test complete: Refresh []
fixture_
c.Check(err, jc.ErrorIsNil)
... value *errors.Err = &errors.
... error stack:
In a successful run the log is supposed to look like:
PASS: manager_
[LOG] 0:00.000 CRITICAL juju.clock timer 2073-03-03 02:00:00.000000005 -0840 -0840 not ready: 1h0m0s
[LOG] 0:00.000 TRACE lease_test [] next expire decided on 1h0m0s 2073-03-03 02:00:00.000000005 -0840 -0840
[LOG] 0:00.000 TRACE lease_test [] redis/0 asked for lease redis, no lease found, claiming for 1m0s
[LOG] 0:00.000 TRACE lease_test [] redis/0 claiming lease redis for 1m0s successful
[LOG] 0:00.000 TRACE lease_test [] ensuring we wake up before 1m0s at 2073-03-03 01:01:00.000000005 -0840 -0840
[LOG] 0:00.000 TRACE lease_test [] tick at 2073-03-03 01:01:03.000000006 -0840 -0840, running expiry checks
[LOG] 0:00.000 TRACE lease_test [] expire looking for leases to expire
[LOG] 0:00.000 TRACE lease_test [] checking expiry on 1 leases
[LOG] 0:00.000 DEBUG lease_test [] expired 1 leases: redis
[LOG] 0:00.000 TRACE lease_test [] evaluating 0 blocks
[LOG] 0:00.001 TRACE lease_test [] next expire decided on 1h0m0s 2073-03-03 02:01:03.000000006 -0840 -0840
[LOG] 0:00.001 TRACE lease_test [] exiting main loop with error: catacomb 0xc0000e0a00 is dying
START: <autogenerated>:1: ExpireSuite.
It seems there is some sort of race with the Done() condition/Refresh is being called at an unexpected time.
Given that both of them do seem to be waking up at 01:01:03 and it is expecting to Refresh and then expire, it seems odd that it would sometimes claim that Refresh shouldn't be called.
Needs investigation.
Changed in juju: | |
status: | Fix Committed → Fix Released |
I don't think this is a production issue, just a flaky test, as such it shouldn't block 2.5.1