A little bit of background for the gcc-linaro case. In bzr-2.4 we start requesting all the tagged revisions as heads, though this turns out not to help as much as we might like.
The first request is for 2690 tips, of which only 64 are found (lots of tagged revs that aren't actually in the history).
We then get enough data back in that one request to step 4 times (64-78 revs per step). The next request is for 84 tips, which gives us back 16k of compressed data, enough to make 32 more steps (at about 80 revs each time).
Seems pretty good so far.
The number of search tips starts dropping pretty rapidly after that point, though we seem to still do a decent job at filling in extra history. Overall I only see 52 Repository.get_parent_map RPC calls, though while walking the graph we make 71,595 steps. We are walking 10-at-a-time from step 1300, and one-at-a-time from step 6534 on. So there are a *lot* of steps that we do one-by-one on the graph. Which makes some sense, since gcc's history gets pretty linear.
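To make the step-vs-RPC split concrete, here is a rough sketch (assumed names, not the actual graph-searcher code in bzrlib) of why 71,595 local steps can collapse into only 52 round trips: every response is cached, the searcher keeps stepping through the cache, and it only goes back to the server when a tip's parents aren't known yet. The server's habit of returning extra ancestry beyond the keys asked for is what keeps the cache ahead of the walk.

def walk_ancestry(tips, cache, rpc_get_parent_map):
    # 'rpc_get_parent_map' is assumed to behave like the smart verb: it
    # returns parents for the requested keys *plus* some additional
    # ancestry (up to a size budget), which is what lets the client keep
    # stepping from the cache without another round trip.
    steps = rpcs = 0
    seen = set()
    pending = set(tips)
    while pending:
        missing = pending - cache.keys()
        if missing:
            cache.update(rpc_get_parent_map(missing))  # one round trip
            rpcs += 1
        next_pending = set()
        for tip in pending:
            steps += 1
            seen.add(tip)
            for parent in cache.get(tip, ()):
                if parent not in seen:
                    next_pending.add(parent)
        pending = next_pending
    return seen, steps, rpcs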
I'm still trying to sort out where the time is spent, given how few RPCs we seem to actually be making. With my extra debugging, it is taking 335s (5.6min) to find all the revisions we want to transmit.
One bit that showed up as a point of concern is the _get_parent_map_rpc code, which always walks the entire cached dict to formulate the next query. That looks like it is going to be O(N^2) in nature, since by the end of all this walking we'll have something like 100k entries in the cache. However, lsprof shows that *by far* the bulk of the time is in the actual RPC.
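To spell out the O(N^2) worry (simplified, made-up names, not the real remote.py internals): if every call re-derives its "what I already have" summary by scanning the whole cache, then 52 calls against a cache that grows to ~100k entries re-scan millions of entries in total, even though each individual scan is cheap. Keeping the summary sets up to date incrementally as parent maps arrive would make each call pay only for the new entries.

def build_recipe_quadratic(cache):
    # O(len(cache)) on *every* call; by the end the cache holds ~100k
    # entries, so the scans add up across 52 calls.
    start_set = set(cache)
    result_parents = set()
    for parents in cache.values():
        result_parents.update(parents)
    stop_keys = result_parents.difference(start_set)
    return start_set, stop_keys

class IncrementalRecipe(object):
    # Possible alternative: update the sets as each parent map arrives,
    # so formulating the next query is O(new entries), not O(cache).
    def __init__(self):
        self.start_set = set()
        self.stop_keys = set()

    def add_parent_map(self, parent_map):
        for key, parents in parent_map.items():
            self.start_set.add(key)
            self.stop_keys.discard(key)
            for parent in parents:
                if parent not in self.start_set:
                    self.stop_keys.add(parent)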
I think, though, that the logic about start and stop revisions is causing us harm. Specifically, this is the time spent in each get_parent_map rpc call:
2.3, 0.7, 0.6, 0.8, 1.0, 0.8, 1.0, 1.3, 1.4, 1.8, 1.5, 2.1, 2.0, 3.1, 3.2, 3.7, 2.9, 2.8, 4.2, 4.9, 6.1, 5.0, 5.5, 5.3, 5.5, 5.5, 5.8, 5.9, 6.4, 6.2, 7.0, 6.6, 7.1, 7.0, 7.5, 7.4, 7.7, 8.0, 8.2, 8.2, 8.6, 8.9, 9.1, 8.9, 9.1, 9.4, 9.5, 11.6, 11.1, 10.0, 10.4, 13.8
If you graph that, it follows an almost completely straight line (R^2 = 0.95), with each request taking 0.2s longer than the previous one.
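As a sanity check on those numbers (plain arithmetic, assuming a ~0.7s base per request): a constant ~0.2s increase per request makes the total time grow quadratically with the number of requests, and by itself accounts for the bulk of the 335s.

n, base, step = 52, 0.7, 0.2
total = sum(base + step * i for i in range(n))
# 52*0.7 + 0.2*(52*51/2) = 36.4 + 265.2 ~= 302s, most of the 335s measured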
I'll need to profile the server side and see what might be causing this.
I have another .bzr.log where I was logging the call sizes. And I noticed that "key_count" gets really large for the recipe. The final request looks like this:

327.535  Doing _get_parent_map_rpc, 110260 cache size, 5 start_set, 110256 result_parents, 1 stop_keys, 110255 included_keys, 1 keys
327.535  keys: set(['svn-v4:138bc75d-0d04-0410-961f-82ee72b054a4:trunk:1152'])
         body: 'svn-v4:138bc75d-0d04-0410-961f-82ee72b054a4:branches/gcc-4_6-branch:175472 ...\n110260'
327.535  hpss call w/body: 'Repository.get_parent_map', '+branch/gcc-linaro/', 'include-missing:', 'svn-v4:138bc75d-0d04-0410-961f-82ee72b054a4:trunk:1152' ('svn-v4:138bc75d-0d04'...)
327.535               378 bytes
337.595  result: ('ok',)
337.595           2526 body bytes read

So that is only 5 start keys, but 110256 keys in the search set. Does that mean we are starting the search over the *entire ancestry* from scratch on each request? ...ugh. This does not seem to be the most efficient way to do a get_parent_map call.
I believe the attempt is to succinctly describe the revisions you already know about, and have the remote side give you information about revisions outside of that set.
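Roughly, I'd expect the recipe to amount to something like this (illustrative encoding only, not the exact wire format; the real call is the one in the log excerpt above): a handful of start and stop keys plus a key count stand in for the ~110k revisions the client already knows about, and the server is expected to leave that set out of its answer.

def encode_recipe(start_keys, stop_keys, key_count):
    # 5 start keys and 1 stop key fit in a few hundred bytes (the 378-byte
    # body above) even though they stand for 110,256 known revisions.
    lines = [
        ' '.join(sorted(start_keys)),
        ' '.join(sorted(stop_keys)),
        str(key_count),
    ]
    return '\n'.join(lines).encode('ascii')

The catch is that to honour the recipe the server presumably has to re-expand it (walk from the start keys down to the stop keys) on every request, so the per-request cost grows with the size of the already-known set, which would fit the steadily climbing RPC times above.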
I believe that if we can avoid the N^2 behavior, we are already good about only sending 52 requests. And for me, many of the early requests complete in less than a second. So that would be ~1min rather than 5.5min for the discovery phase.
We could still try to shortcut the entire discovery phase by noticing the target repository is empty, and I'll consider adding keywords for it. But if we can make this API better, then we can still get good gains when fetching a lot of content into a mostly-but-not-completely empty repository.