[master] bzr holds whole files in memory; raises MemoryError on large files
Bug #109114 reported by
Martin Pool
This bug affects 60 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Bazaar | Confirmed | High | Unassigned |
Breezy | Triaged | Medium | Unassigned |
Bug Description
Bazaar reads each source file completely into memory when committing, so files comparable in size to the machine's virtual memory can't be committed. Attempting to commit such a file currently raises a MemoryError.
We also hold too many copies of the file in memory, which John is working on. However, even once that is fixed, the file size will still be limited by available memory. Fixing this bug requires some way of splitting large files into fragments when they are detected.
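To illustrate the difference between whole-file and fragmented reads, here is a minimal sketch in plain Python. It is not Bazaar's code: the names `iter_chunks`, `sha1_and_size`, and `CHUNK_SIZE` are hypothetical, and the 4 MiB chunk size is an arbitrary assumption. It computes a SHA-1 and a size (the kind of values the commit path records for a text) while holding only one chunk in memory at a time.

```python
import hashlib
import io

# Hypothetical chunk size; any bounded value keeps peak memory small.
CHUNK_SIZE = 4 * 1024 * 1024


def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive byte chunks from a binary file object."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def sha1_and_size(fileobj, chunk_size=CHUNK_SIZE):
    """Return (hex SHA-1, total size) without loading the whole file."""
    digest = hashlib.sha1()
    size = 0
    for chunk in iter_chunks(fileobj, chunk_size):
        digest.update(chunk)  # hash state is updated incrementally
        size += len(chunk)
    return digest.hexdigest(), size


if __name__ == "__main__":
    # Small in-memory stand-in for a large file on disk.
    sample = io.BytesIO(b"x" * 10_000 + b"tail")
    print(sha1_and_size(sample, chunk_size=1024))
```

Hashing is the easy part to stream; storing and delta-compressing a text in fragments would need support in the repository format itself, which this sketch does not attempt.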
Related branches
lp://staging/~jelmer/brz/get-full
- Martin Packman: Approve
- Diff: 236 lines (+18/-90), 7 files modified
breezy/bzr/groupcompress.py (+0/-18)
breezy/bzr/knit.py (+0/-15)
breezy/bzr/versionedfile.py (+0/-30)
breezy/bzr/vf_repository.py (+11/-12)
breezy/plugins/weave_fmt/repository.py (+5/-3)
breezy/tests/test_tuned_gzip.py (+1/-3)
breezy/tuned_gzip.py (+1/-9)
Changed in bzr:
importance: Undecided → Medium
status: Unconfirmed → Confirmed
description: updated
Changed in bzr:
assignee: John A Meinel (jameinel) → nobody
status: In Progress → Triaged
Changed in bzr:
status: Triaged → Confirmed
Changed in bzr:
status: Confirmed → In Progress
importance: Medium → High
assignee: nobody → Martin Pool (mbp)
Changed in bzr:
assignee: Martin Pool (mbp) → nobody
Changed in bzr:
status: In Progress → Confirmed
Changed in brz:
status: New → Triaged
importance: Undecided → Medium
I think the file doesn't have to be comparable in size to the VM: I have a 300MB binary file, and Python uses up 2GB before crashing with an out-of-memory error.
This is a problem for us because we currently store lots of graphics in CVS.
bzr: ERROR: exceptions.MemoryError:
Traceback (most recent call last):
Python\Lib\site-packages\bzrlib\commands.py", line 817, in run_bzr_catch_errors
Python\Lib\site-packages\bzrlib\commands.py", line 779, in run_bzr
Python\Lib\site-packages\bzrlib\commands.py", line 477, in run_argv_aliases
  **all_cmd_args)
Python\Lib\site-packages\bzrlib\builtins.py", line 2283, in run
  reporter, revprops=properties)
Python\Lib\site-packages\bzrlib\decorators.py", line 165, in write_locked
Python\Lib\site-packages\bzrlib\workingtree_4.py", line 246, in commit
  commit(self, message, revprops, *args, **kwargs)
Python\Lib\site-packages\bzrlib\decorators.py", line 165, in write_locked
Python\Lib\site-packages\bzrlib\mutabletree.py", line 207, in commit
  revprops, *args, **kwargs)
Python\Lib\site-packages\bzrlib\commit.py", line 300, in commit
  _update_builder_with_changes()
Python\Lib\site-packages\bzrlib\commit.py", line 607, in _update_builder_with_changes
  _populate_from_inventory(specific_files)
Python\Lib\site-packages\bzrlib\commit.py", line 679, in _populate_from_inventory
Python\Lib\site-packages\bzrlib\commit.py", line 731, in _record_entry
Python\Lib\site-packages\bzrlib\repository.py", line 2133, in record_entry_contents
  snapshot(self._new_revision_id, path, previous_entries, tree, self)
Python\Lib\site-packages\bzrlib\inventory.py", line 438, in snapshot
Python\Lib\site-packages\bzrlib\inventory.py", line 453, in _snapshot_into_revision
  _snapshot_text(previous_entries, work_tree, commit_builder)
Python\Lib\site-packages\bzrlib\inventory.py", line 724, in _snapshot_text
  byte_lines, self.text_sha1, self.text_size)
Python\Lib\site-packages\bzrlib\repository.py", line 2180, in modified_file_text
  _add_text_to_weave(file_id, new_lines, file_parents.keys())
Python\Lib\site-packages\bzrlib\repository.py", line 2196, in _add_text_to_weave
  le.add_lines(self._new_revision_id, parents, new_lines)
Python\Lib\site-packages\bzrlib\versionedfile.py", line 148...
File "C:\Program Files\PyGTK\
return run_bzr(argv)
File "C:\Program Files\PyGTK\
ret = run(*run_argv)
File "C:\Program Files\PyGTK\
return self.run(
File "C:\Program Files\PyGTK\
reporter=
File "C:\Program Files\PyGTK\
return unbound(self, *args, **kwargs)
File "C:\Program Files\PyGTK\
result = WorkingTree3.
File "C:\Program Files\PyGTK\
return unbound(self, *args, **kwargs)
File "C:\Program Files\PyGTK\
revprops=
File "C:\Program Files\PyGTK\
self.
File "C:\Program Files\PyGTK\
self.
File "C:\Program Files\PyGTK\
parent_id, definitely_changed, existing_ie)
File "C:\Program Files\PyGTK\
path, self.work_tree)
File "C:\Program Files\PyGTK\
ie.
File "C:\Program Files\PyGTK\
work_tree, commit_builder)
File "C:\Program Files\PyGTK\
self.
File "C:\Program Files\PyGTK\
self.file_id, file_parents, get_content_
File "C:\Program Files\PyGTK\
self.
File "C:\Program Files\PyGTK\
versionedfi
File "C:\Program Files\PyGTK\