1.23.2.1, mongo: document is larger than capped size
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
juju-core | Fix Released | High | Ian Booth |
1.23 | Fix Released | High | Ian Booth |
1.24 | Fix Released | High | Ian Booth |
Bug Description
Nothing too special here: just a local environment running a few Ubuntu services, where I was testing the landscape-client charm.
machine-0: 2015-05-14 00:50:02 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: document is larger than capped size 1659012 > 1048576
machine-0: 2015-05-14 00:51:03 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: document is larger than capped size 1659092 > 1048576
machine-0: 2015-05-14 00:52:12 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: document is larger than capped size 1659192 > 1048576
machine-0: 2015-05-14 00:53:21 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: document is larger than capped size 1659240 > 1048576
machine-0: 2015-05-14 00:54:21 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: document is larger than capped size 1659320 > 1048576
machine-0: 2015-05-14 00:55:30 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: document is larger than capped size 1659372 > 1048576
machine-0: 2015-05-14 00:56:31 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: document is larger than capped size 1659452 > 1048576
machine-0: 2015-05-14 00:57:40 ERROR juju.worker.resumer resumer.go:69 cannot resume transactions: document is larger than capped size 1659552 > 1048576
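For context, that error is MongoDB rejecting a single document that exceeds a capped collection's size limit. Below is a minimal sketch that reproduces the same class of failure against a throwaway capped collection; the dial address, database, and collection names are made up for illustration and are not from this environment:

```go
package main

import (
	"fmt"
	"log"
	"strings"

	"gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
)

func main() {
	// Assumption: a throwaway local mongod, not a juju deployment.
	session, err := mgo.Dial("127.0.0.1:27017")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	c := session.DB("scratch").C("capped_demo")
	// A 1 MiB capped collection, mirroring the 1048576 limit in the log above.
	if err := c.Create(&mgo.CollectionInfo{Capped: true, MaxBytes: 1048576}); err != nil {
		log.Fatal(err)
	}

	// A document comfortably larger than the cap is rejected outright.
	big := strings.Repeat("x", 1600000)
	err = c.Insert(bson.M{"payload": big})
	fmt.Println("insert error:", err)
}
```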
Dump of the environment is attached. Here is the stat dump requested by menn0 as well; it didn't yield too many results:
MongoDB shell version: 2.6.3
connecting to: 127.0.0.
{
    "ns" : "juju.txns.log",
    "count" : 38600,
    "size" : 4742664,
    "nindexes" : 1,
    "userFlags" : 0,
    "capped" : true,
    "max" : NumberLong(
    "ok" : 1
}
bye
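For reference, the same numbers can be pulled programmatically: db.txns.log.stats() in the shell is a wrapper around the collStats command. A minimal sketch with mgo, where the dial address and the lack of authentication are assumptions about the environment:

```go
package main

import (
	"fmt"
	"log"

	"gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
)

func main() {
	// Assumption: mongod reachable locally without auth.
	session, err := mgo.Dial("127.0.0.1:37017")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// collStats is the command behind db.txns.log.stats().
	var stats bson.M
	if err := session.DB("juju").Run(bson.D{{Name: "collStats", Value: "txns.log"}}, &stats); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("count=%v size=%v capped=%v max=%v\n",
		stats["count"], stats["size"], stats["capped"], stats["max"])
}
```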
Changed in juju-core:
assignee: nobody → Menno Smits (menno.smits)
tags: added: mongodb
Changed in juju-core:
status: New → Triaged
milestone: none → 1.25.0
importance: Undecided → High
Changed in juju-core:
assignee: nobody → Ian Booth (wallyworld)
status: Triaged → Fix Committed
Changed in juju-core:
status: Fix Committed → Fix Released
In the initial paste (private link): https://pastebin.canonical.com/131827/

log line attempted (677k) over max size(10k), printing beginning and end ... update juju.txns.stash query: { _id: { c: "lease", id: "trusty-server-leadership" }, n: "d5b7100b" } update: { $set: { txn-queue: [ "55352def91a7b10f460007b4_127104b4", "55352e1091a7b10f46000802_01edfd48", ...
That looks like something having severe problems with conflicting transactions on the trusty-server-leadership document, enough so that it ends up with hundreds (thousands?) of txn-queue ids. These are transactions that want to be applied to the document but cannot be cleaned up for some reason.
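One rough way to hunt for documents that have accumulated a huge txn-queue, sketched with mgo under the assumption of direct, unauthenticated access to juju's mongod; the collection names here are taken from the pasted log line and may differ in a real deployment:

```go
package main

import (
	"fmt"
	"log"

	"gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
)

func main() {
	// Assumption: direct, unauthenticated access to juju's mongod.
	session, err := mgo.Dial("127.0.0.1:37017")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	db := session.DB("juju")
	for _, name := range []string{"lease", "txns.stash"} {
		// "txn-queue.100": {$exists: true} matches documents whose
		// txn-queue array has more than 100 entries.
		iter := db.C(name).Find(bson.M{"txn-queue.100": bson.M{"$exists": true}}).
			Select(bson.M{"_id": 1, "txn-queue": 1}).Iter()
		var doc struct {
			ID       interface{}   `bson:"_id"`
			TxnQueue []interface{} `bson:"txn-queue"`
		}
		for iter.Next(&doc) {
			fmt.Printf("%s %v: %d queued txns\n", name, doc.ID, len(doc.TxnQueue))
		}
		if err := iter.Close(); err != nil {
			log.Printf("%s: %v", name, err)
		}
	}
}
```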
Also on the document:

txn-revno: -37441

I'm pretty sure txn-revno is supposed to stay a positive value, and negative values are used to toggle some sort of boolean state:
https://github.com/go-mgo/mgo/blob/v2/txn/flusher.go#L481

Remove operations try to invert the revno, as do Insert operations (Insert because it should start at -1), and Update operations are supposed to just increment it.
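To make that sign convention concrete, here is a tiny illustrative helper that follows the description above; this is only my reading of it, not the actual logic in go-mgo/mgo's flusher.go:

```go
package main

import "fmt"

// alive reports whether a revno denotes a document that currently exists;
// a negative revno is the "boolean toggle" marking a missing/removed doc.
func alive(revno int64) bool {
	return revno > 0
}

// next sketches how each operation kind would evolve the revno under that
// reading: Insert flips a negative "missing" revno positive, Update bumps
// it, Remove inverts the sign so the document reads as gone.
func next(revno int64, op string) int64 {
	switch op {
	case "insert":
		return -revno + 1
	case "update":
		return revno + 1
	case "remove":
		return -revno
	}
	return revno
}

func main() {
	r := int64(-1) // a document that has never existed
	for _, op := range []string{"insert", "update", "update", "remove"} {
		r = next(r, op)
		fmt.Printf("%-6s -> revno %d (alive=%v)\n", op, r, alive(r))
	}
}
```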
So maybe one of the txns is trying to Remove the document, and a bunch of others are trying to Update it?
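For what it's worth, this is the shape of conflict being described: independent clients queueing a Remove and repeated Updates against the same document through mgo/txn. A minimal sketch with made-up names and a throwaway mongod, not an attempt to reproduce juju's actual leadership code:

```go
package main

import (
	"log"

	"gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
	"gopkg.in/mgo.v2/txn"
)

func main() {
	// Assumption: a throwaway local mongod, not a real juju deployment.
	session, err := mgo.Dial("127.0.0.1:27017")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	db := session.DB("example")
	runner := txn.NewRunner(db.C("txns"))

	// One client tries to remove the leadership doc...
	removeOps := []txn.Op{{
		C:      "lease",
		Id:     "trusty-server-leadership",
		Assert: txn.DocExists,
		Remove: true,
	}}

	// ...while another keeps updating it. If transactions like these keep
	// colliding and the resumer can't flush them, the document's txn-queue
	// grows without bound.
	updateOps := []txn.Op{{
		C:      "lease",
		Id:     "trusty-server-leadership",
		Assert: txn.DocExists,
		Update: bson.M{"$set": bson.M{"holder": "unit-0"}},
	}}

	if err := runner.Run(removeOps, "", nil); err != nil {
		log.Printf("remove txn: %v", err)
	}
	if err := runner.Run(updateOps, "", nil); err != nil {
		log.Printf("update txn: %v", err)
	}
}
```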