Need a way to remove unreferenced data from a repository
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Bazaar | Confirmed | Low | Unassigned |
Breezy | Triaged | Medium | Unassigned |
Bug Description
Repositories can accumulate data which is no longer referenced by any branch. This can happen because of uncommit, pull --overwrite, push --overwrite, or removing branch directories. The data is semantically meaningless and can't normally be seen.
In some respects this is useful as it makes these operations less destructive and they can be undone. (At the moment the user must find and re-branch the revision manually, but it could be made more automatic.)
However, the data uses space, and people might want to remove it. mpe says:
"""I think in the medium term there must be a way to clean out unused (unreferenced?) revisions in a shared repo. Simply in terms of space you need to be able to do it, otherwise over time your repo will fill up with lots of junk. And there might be people who have a legal requirement that all code from branch x is physically removed."""
One approach is a command to copy a repository, including branches within it, and only including revision data referenced by those branches. Another approach would be to "vacuum" or "garbage-collect" the repository in-place.
This is only safely possible if all in-use revisions are referenced by branches which can be located at the time of copying, i.e. if they're physically inside the repository directory.
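Both approaches come down to the same reachability computation: start from the tip of every branch that can be located, follow parent links, and treat everything not visited as unreferenced. The following is a minimal sketch of that idea on a plain dictionary, not on the real Bazaar or Breezy repository API. The names `reachable_revisions`, `unreferenced_revisions` and `copy_referenced`, and the `parents` mapping from revision id to parent ids, are all hypothetical and exist only for illustration.

```python
from collections import deque


def reachable_revisions(parents, branch_tips):
    """Return the revision ids reachable from any branch tip via parent links.

    `parents` is a hypothetical stand-in for the repository's revision graph:
    a dict mapping each stored revision id to a list of its parent ids.
    """
    seen = set()
    queue = deque(branch_tips)
    while queue:
        rev = queue.popleft()
        # Skip revisions already visited, and parents that are not stored
        # in this repository at all.
        if rev in seen or rev not in parents:
            continue
        seen.add(rev)
        queue.extend(parents[rev])
    return seen


def unreferenced_revisions(parents, branch_tips):
    """In-place "vacuum" view: the revisions that no located branch references."""
    return set(parents) - reachable_revisions(parents, branch_tips)


def copy_referenced(parents, branch_tips):
    """Copy view: a new graph holding only the referenced revisions."""
    keep = reachable_revisions(parents, branch_tips)
    return {rev: list(parents[rev]) for rev in parents if rev in keep}


# Example: "r3b" was left behind by something like an uncommit or an
# overwriting pull, so no branch tip points at it any more.
example_parents = {"r1": [], "r2": ["r1"], "r3": ["r2"], "r3b": ["r2"]}
example_tips = ["r3"]
garbage = unreferenced_revisions(example_parents, example_tips)
```

The sketch also shows why the safety condition matters: a branch whose tip is missing from `branch_tips` (for example, one stored outside the repository directory) would have its revisions counted as unreferenced and removed.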
Changed in bzr:
importance: Untriaged → Low
status: Unconfirmed → Confirmed
tags: added: check-for-breezy
tags: added: gc; removed: check-for-breezy
Changed in brz:
importance: Undecided → Low
status: New → Triaged
importance: Low → Medium
The Launchpad mirroring service should also reflect data removal done in the source branch. Filed bug 58650 about this.