A significant part of the slowness (but not the space-inefficiency) is that blobs only enter the idmap when there's a content_changed event. If an InventoryFile shows up with a new revision but without new content, its blob never gets an idmap entry, so the file can be rehashed in every subsequent commit until the content actually changes.
Hacking the fallback in _tree_to_objects to forcibly cache any misses makes things several times faster on long, complex histories.
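
To illustrate the idea of caching misses in the fallback, here is a minimal standalone sketch. It is not the actual _tree_to_objects code: BlobIdCache, lookup, read_content and cache_misses are hypothetical names invented for this example, and the dict only stands in for the idmap. The one part taken from Git itself is the blob id, which is the SHA-1 of a "blob <length>\0" header followed by the content.

```python
import hashlib


def git_blob_sha(data: bytes) -> str:
    """Return the Git blob id (hex SHA-1) for raw file content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class BlobIdCache:
    """Hypothetical stand-in for the idmap, keyed by (file_id, revision)."""

    def __init__(self):
        self._shas = {}
        self.hash_count = 0  # number of times content was actually hashed

    def lookup(self, file_id, revision, read_content, cache_misses=True):
        key = (file_id, revision)
        try:
            return self._shas[key]
        except KeyError:
            pass
        # Fallback path: nothing cached for this key, so hash the content.
        self.hash_count += 1
        sha = git_blob_sha(read_content())
        if cache_misses:
            # The "hack": remember the result so later commits skip the rehash.
            self._shas[key] = sha
        return sha


# Same file and revision seen in five consecutive commits.
cache = BlobIdCache()
for _commit in range(5):
    cache.lookup("file-id", "rev-2", lambda: b"hello\n")
print(cache.hash_count)  # 1 with cache_misses=True
```

With cache_misses=False the same loop hashes the content five times, which is the behaviour described above for a file whose revision changed but whose content did not.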