My current thought is this. We are trying to tell the server about the revisions we've already seen, so it doesn't send us those again.
However, locality says that if I'm currently looking at X, then really only revisions 'near' X are going to be encountered again. This isn't strictly true, because of history shortcutting (daggy fixes, in particular, mean that you'll skip a lot of history in one half of the search, which you'll run into again after a long time in the other half of the search.)
While it isn't strictly true, it could certainly limit the effort tremendously.
Having greatest-distance-from-origin (generation count), would be another way to limit this. Instead of walking all parents equally, you'd favor parents that have higher generation number, presuming that parents with low generation number are skipping all of the interesting revisions. You could do a little bit of both, but you could come up with better batches this way. (So if you had 100 parents to search, and 2 of them are at GDFO 1000, and the other 98 of them are at GDFO 10000, you can skip searching the 2 low ones for now.)
Anyway, I'm going to try something that just generates a fake 'local' search, and see how that performs.
My current thought is this. We are trying to tell the server about the revisions we've already seen, so it doesn't send us those again.
However, locality says that if I'm currently looking at X, then really only revisions 'near' X are going to be encountered again. This isn't strictly true, because of history shortcutting (daggy fixes, in particular, mean that you'll skip a lot of history in one half of the search, which you'll run into again after a long time in the other half of the search.)
While it isn't strictly true, it could certainly limit the effort tremendously.
Having greatest- distance- from-origin (generation count), would be another way to limit this. Instead of walking all parents equally, you'd favor parents that have higher generation number, presuming that parents with low generation number are skipping all of the interesting revisions. You could do a little bit of both, but you could come up with better batches this way. (So if you had 100 parents to search, and 2 of them are at GDFO 1000, and the other 98 of them are at GDFO 10000, you can skip searching the 2 low ones for now.)
Anyway, I'm going to try something that just generates a fake 'local' search, and see how that performs.