Journal recovery failure on global timestamp mismatch
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Akiban Persistit | Fix Released | Critical | Peter Beaman | |
Bug Description
A customer system fails to start due to this Exception:
Caused by: com.persistit.
at com.persistit.
Upon further investigation we discovered a record in the journal with an invalid VolumeHandle:
3,712,000,036,776 58,869,008,137 IT ( 59) handle 00735 volume 2147483647 treeName information_
The volume handle refers to the newly added lock volume. No tree or volume records related to that volume should appear in the journal. As a result, RecoveryManager wrongly concluded that the journal is corrupt; apart from this one erroneous record, it is not.
Almost certainly this bug was induced by support for the new Exchange#lock mechanism.
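To illustrate the invariant described above (no tree or volume records for the lock volume should reach the journal), here is a minimal sketch. It is not the actual Persistit code or the committed fix: `JournalHandleFilter`, `TreeRecord`, `LOCK_VOLUME_ID`, `shouldJournal` and `journalable` are hypothetical names, and treating `Integer.MAX_VALUE` as the lock volume's identifier is an assumption based only on the value 2147483647 shown in the bad IT record.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are Persistit's real API. */
final class JournalHandleFilter {

    /** Assumption: the lock volume is identified by 2147483647, as in the bad IT record. */
    static final int LOCK_VOLUME_ID = Integer.MAX_VALUE;

    /** Simplified stand-in for an IT (tree handle) journal record. */
    static final class TreeRecord {
        final int handle;
        final int volumeId;
        final String treeName;

        TreeRecord(int handle, int volumeId, String treeName) {
            this.handle = handle;
            this.volumeId = volumeId;
            this.treeName = treeName;
        }
    }

    /** A record may be journaled only if it does not belong to the lock volume. */
    static boolean shouldJournal(TreeRecord record) {
        return record.volumeId != LOCK_VOLUME_ID;
    }

    /** Returns the records that are safe to write, preserving their order. */
    static List<TreeRecord> journalable(List<TreeRecord> records) {
        List<TreeRecord> result = new ArrayList<>();
        for (TreeRecord record : records) {
            if (shouldJournal(record)) {
                result.add(record);
            }
        }
        return result;
    }
}
```

With a guard of this kind on the write path, recovery would never encounter a handle that points at a volume absent from the journal. A matching check on the recovery side could ignore such records instead of declaring the journal corrupt.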
Related branches
- Akiban Build User: Needs Fixing
- Nathan Williams: Approve
Diff: 81 lines (+32/-2)
4 files modified
src/main/java/com/persistit/JournalManager.java (+3/-1)
src/main/java/com/persistit/Persistit.java (+4/-1)
src/main/java/com/persistit/RecoveryManager.java (+4/-0)
src/test/java/com/persistit/ExchangeLockTest.java (+21/-0)
Changed in akiban-persistit: | |
assignee: | nobody → Peter Beaman (pbeaman) |
Changed in akiban-persistit: | |
status: | Confirmed → Fix Committed |
Changed in akiban-persistit: | |
milestone: | none → 3.2.6 |
Changed in akiban-persistit: | |
status: | Fix Committed → Fix Released |
I had this occur to me earlier in the day, too. I saved the contents of /tmp/akiban_server before blowing it away and starting from scratch. Let me know if you need that.