Bazaar

fetch for --2a formats seems to be fragmenting

Bug #402645 reported by John A Meinel on 2009-07-21

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Bazaar	Fix Released	Critical	John A Meinel	Bazaar 2.0rc1 "no worries"

Bug Description

Split out from bug #402114.

While debugging the log file for branching over http, we saw entries like this:
384.270 creating new compressed block on-the-fly in 0.000s 11139 bytes => 365 bytes
384.271 stripping trailing bytes from groupcompress block 11046 => 275
384.274 creating new compressed block on-the-fly in 0.001s 11139 bytes => 2757 bytes
384.277 creating new compressed block on-the-fly in 0.000s 11046 bytes => 269 bytes
384.278 creating new compressed block on-the-fly in 0.000s 11139 bytes => 917 bytes
384.281 creating new compressed block on-the-fly in 0.001s 11046 bytes => 2832 bytes
384.298 creating new compressed block on-the-fly in 0.003s 11139 bytes => 5331 bytes
384.306 creating new compressed block on-the-fly in 0.000s 11046 bytes => 782 bytes
384.309 creating new compressed block on-the-fly in 0.002s 11139 bytes => 2386 bytes

It would appear that the stream code believes the optimal ordering is to interleave the bytes from the two groups (one of 11139 bytes and one of 11046 bytes). In doing so it fragments both groups, so that rather than having 2 medium-sized groups we end up with 9+ groups.
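A minimal sketch of why interleaving fragments the groups, assuming a simplified model in which every switch between source groups in the stream forces a new compressed block. The key names and the `count_output_blocks` helper are hypothetical and are not taken from bzrlib.

```python
from itertools import groupby


def count_output_blocks(stream_order, group_of):
    """Count runs of consecutive keys that come from the same source group.

    Simplified model (an assumption): each run becomes one output block.
    """
    return sum(1 for _ in groupby(stream_order, key=group_of.__getitem__))


# Hypothetical layout: keys a0..a4 live in group "A", keys b0..b3 in group "B".
group_of = {f"a{i}": "A" for i in range(5)}
group_of.update({f"b{i}": "B" for i in range(4)})

# The stream alternates between the two groups on every key.
interleaved = ["a0", "b0", "a1", "b1", "a2", "b2", "a3", "b3", "a4"]
# The stream finishes one group before starting the next.
grouped = ["a0", "a1", "a2", "a3", "a4", "b0", "b1", "b2", "b3"]

interleaved_blocks = count_output_blocks(interleaved, group_of)
grouped_blocks = count_output_blocks(grouped, group_of)
print(interleaved_blocks, grouped_blocks)
```

Under this model the same nine keys yield nine blocks when interleaved and two when read one group at a time.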

The fetch order should be "unordered", which should cause it to be:
    source_keys = self._get_io_ordered_source_keys(locations,
        unadded_keys, source_result)

Which, in theory, would be one group at a time.
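As a rough illustration of what "I/O order" means here, the sketch below sorts keys by where their bytes live on disk. It is not the bzrlib implementation of `_get_io_ordered_source_keys`; the `io_ordered_keys` helper and the shape of `locations` (key mapped to a `(group_start, offset_in_group)` pair) are assumptions made for the example.

```python
def io_ordered_keys(locations):
    """Return keys sorted by (group_start, offset_in_group).

    Keys stored in the same group end up adjacent, so each group
    is read as one contiguous run.
    """
    return sorted(locations, key=lambda key: locations[key])


# Hypothetical on-disk positions: (start of the group, offset inside it).
locations = {
    "b1": (20000, 40),
    "a0": (0, 0),
    "b0": (20000, 0),
    "a1": (0, 120),
}

ordered = io_ordered_keys(locations)
print(ordered)
```

With these positions, both keys from the group starting at 0 come out before either key from the group starting at 20000.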

One possibility is that the ordering isn't 'unordered' but 'groupcompress' order, and that topo_sort not being 100% stable causes the two implementations to sort differently. (So the fetch that created the repository thought one sort was optimal, and the new fetch thinks a slightly different order is optimal.)
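To illustrate how two topological sorts of the same graph can disagree, here is a small sketch using a generic Kahn-style sort with a pluggable tie-break. It is not bzrlib's topo_sort; the `pick` parameter and the four-node graph are hypothetical.

```python
def topo_sort(parents, pick):
    """Kahn-style topological sort of an acyclic graph.

    `parents` maps node -> list of parent nodes.
    `pick` chooses one node among those whose parents are all emitted;
    different choices give different, equally valid orders.
    """
    remaining = {node: set(ps) for node, ps in parents.items()}
    order = []
    while remaining:
        ready = [node for node, ps in remaining.items() if not ps]
        node = pick(ready)
        order.append(node)
        del remaining[node]
        for ps in remaining.values():
            ps.discard(node)
    return order


def is_valid(order, parents):
    """Check that every node appears after all of its parents."""
    position = {node: i for i, node in enumerate(order)}
    return all(position[p] < position[n] for n, ps in parents.items() for p in ps)


# Hypothetical history: two branches off "root" that are later merged.
parents = {"root": [], "left": ["root"], "right": ["root"], "merge": ["left", "right"]}

order_a = topo_sort(parents, min)  # breaks ties alphabetically
order_b = topo_sort(parents, max)  # breaks ties the other way
print(order_a, order_b)
```

Both orders respect the parent relationships, yet "left" and "right" swap places. If the repository was laid out using one order and a later fetch requests the other, the requested order no longer matches the on-disk grouping.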

Related branches

lp://staging/~jameinel/bzr/2.0b1-402645-fragmentation

Merged into lp://staging/~bzr/bzr/trunk-old

Martin Pool: Approve on 2009-08-26

Robert Collins (lifeless) on 2009-08-14

Changed in bzr:
importance:	High → Critical

Robert Collins (lifeless) wrote on 2009-08-14:

Targeting for 2.0, because unpacking on fetch will lead to rather bad disk usage, and our users' propensity to look under the hood makes it important not to do that.

Changed in bzr:
milestone:	none → 2.0

Martin Pool (mbp) on 2009-08-23

Changed in bzr:
status:	Triaged → Confirmed

John A Meinel (jameinel) wrote on 2009-08-24:

I'm investigating

Changed in bzr:
assignee:	nobody → John A Meinel (jameinel)
status:	Confirmed → In Progress

Martin Pool (mbp) wrote on 2009-08-26:

There is a merge proposal that needs review: https://code.edge.launchpad.net/~jameinel/bzr/2.0b1-402645-fragmentation/+merge/10621

John A Meinel (jameinel) wrote on 2009-08-31:

The fetching is set to 'unordered' in the 2.0 branch right now, which at least avoids fragmentation.
However, there may be a different fix in 2.0 final based on bug #402652.

Changed in bzr:
status:	In Progress → Fix Released
