Smart server leaks memory each commit

Bug #494406 reported by Gareth White
This bug affects 1 person
Affects   Status         Importance   Assigned to      Milestone
Bazaar    Fix Released   Medium       John A Meinel
2.0       Fix Released   Undecided    John A Meinel

Bug Description

When running a smart server, bzr.exe leaks about 200-400 KB per commit. To reproduce:
* start bzr serve --allow-writes
* bzr init bzr://localhost/testbranch
* bzr checkout --lightweight bzr://localhost/testbranch testcheckout
* cd testcheckout
* bzr commit -m "test" --unchanged
* (repeat previous command)

Every time you repeat the "bzr commit", the memory usage of bzr.exe increases by about 200-400 KB. I've tested this with both Bazaar 2.0.1 and 2.1.0b3 with Python 2.5 on Windows XP SP3.
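
For anyone wanting to quantify the growth, here is a minimal measurement sketch (not part of the original report): it repeats the --unchanged commit from the checkout and samples the resident memory of the bzr.exe serve process after each commit. psutil and the hard-coded server_pid are illustrative assumptions.

# Illustrative measurement loop (assumes a reasonably recent psutil and
# that server_pid is the PID of the running `bzr serve` process).
import subprocess
import psutil

server_pid = 1234                       # hypothetical PID of `bzr serve`
server = psutil.Process(server_pid)

for i in range(100):
    subprocess.check_call(
        ['bzr', 'commit', '-m', 'test', '--unchanged'],
        cwd='testcheckout')
    rss_kb = server.memory_info().rss // 1024
    print('commit %d: server RSS %d KB' % (i + 1, rss_kb))

If the reported RSS climbs steadily across the whole run rather than settling down, that points at a real leak rather than allocator or garbage-collector noise.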


Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 494406] [NEW] Smart server leaks memory each commit

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gareth White wrote:
> Public bug reported:
>
> When running a smart server, bzr.exe leaks about 200-400 KB per commit. To reproduce:
> * start bzr serve --allow-writes
> * bzr init bzr://localhost/testbranch
> * bzr checkout --lightweight bzr://localhost/testbranch testcheckout
> * cd testcheckout
> * bzr commit -m "test" --unchanged
> * (repeat previous command)
>
> Every time you repeat the "bzr commit", the memory usage of bzr.exe
> increases by about 200-400 KB. I've tested this with both Bazaar 2.0.1
> and 2.1.0b3 with Python 2.5 on Windows XP SP3.
>
> ** Affects: bzr
> Importance: Undecided
> Status: New
>

200-400KB could easily be 'noise' that will be cleaned up when the
Python garbage collector runs. It depends more on whether the leak is
sustained after, say, 100 commits (which would then amount to 20-40MB or
so). Also note that while you are committing repeatedly, the size of the
repository is *also* increasing, so it isn't as if the first run is
dealing with the same content as the last run.

 importance: medium
 status: incomplete
 tags: memory

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksf8c8ACgkQJdeBCYSNAAP7jgCgyhHzmJNeOjxbLIbNvdunEzwN
/hQAoJ/jx/0Bs+pnGSmmzDGy6SM12L8s
=higI
-----END PGP SIGNATURE-----

Changed in bzr:
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Gareth White (gwhite-deactivatedaccount) wrote :

The memory usage *linearly* increases with the number of commits. I just did a test involving 999 "unchanged" commits to an empty shared repository and the server memory usage reached 500,008 KB. Since it started at 14,564 KB, this represents an average increase of about 485 KB per commit.

That was with 2.0.1 on the server and client. With 2.1.0b3 it went from 13,996 to 279,840 KB after 999 commits (280 KB per commit). This was on a separate empty shared repository.

Are you able to reproduce this behaviour?

Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 494406] Re: Smart server leaks memory each commit

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gareth White wrote:
> The memory usage *linearly* increases with the number of commits. I just
> did a test involving 999 "unchanged" commits to an empty shared
> repository and the server memory usage reached 500,008 KB. Since it
> started at 14,564 this represents an average of 485 KB increase per
> commit.
>
> That was with 2.0.1 on the server and client. With 2.1.0b3 it went from
> 13,996 to 279,840 KB after 999 commits (280 KB per commit). This was on
> a separate empty shared repository.
>
> Are you able to reproduce this behaviour?
>

So it wasn't something that was linear at all. I did run a loop with
~700 --unchanged commits. I watched it a bit at the beginning and it was
pretty much fixed memory consumption at about 84MB. When I came back
later, it had jumped to 237MB (295MB peak).

I haven't found any specific cause for this, though I suspect something
with the autopacking code. (Given that it was steady most of the time,
and then jumped at some point.)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkshfiIACgkQJdeBCYSNAAOlzwCgi8JFfbcGFx76Xp8MZqZU72/q
OzIAn22qwV9CWANSUIGZWt3VjN3+gmLT
=9UGZ
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John A Meinel wrote:
> Gareth White wrote:
>> The memory usage *linearly* increases with the number of commits. I just
>> did a test involving 999 "unchanged" commits to an empty shared
>> repository and the server memory usage reached 500,008 KB. Since it
>> started at 14,564 this represents an average of 485 KB increase per
>> commit.
>
>> That was with 2.0.1 on the server and client. With 2.1.0b3 it went from
>> 13,996 to 279,840 KB after 999 commits (280 KB per commit). This was on
>> a separate empty shared repository.
>
>> Are you able to reproduce this behaviour?
>
>
> So it wasn't something that was linear at all. I did run a loop with
> ~700 --unchanged commits. I watched it a bit at the beginning and it was
> pretty much fixed memory consumption at about 84MB. When I came back
> later, it had jumped to 237MB (295MB peak).
>
> I haven't found any specific cause for this, though I suspect something
> with the autopacking code. (Given that it was steady most of the time,
> and then jumped at some point.)
>
> John
> =:->

I also wanted to comment that the memory does indeed seem to have
'leaked'. At least, a memory dump from meliae only shows about 12MB of
referenced memory, which is a far cry from the 250MB reported by the OS.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkshgGwACgkQJdeBCYSNAAO+ywCdEfN2d2Ef1YeyoElLfZzpILJO
PjEAoNkoqjtNcJ0YrD+lHdKN/YahBBFz
=aCsH
-----END PGP SIGNATURE-----
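
For reference, the kind of meliae dump mentioned above can be taken from inside the server process and summarized roughly as follows (a sketch; the dump filename is illustrative, and the dump has to be triggered from within the bzr serve process, e.g. via a debug hook or an interactive session in that process):

from meliae import scanner, loader

# Dump every live Python object in this process to disk, then load the
# dump and summarize it by object type.
scanner.dump_all_objects('bzr-serve.json')

om = loader.load('bzr-serve.json')
print(om.summarize())   # per-type counts and total referenced bytes

If the summary accounts for far less memory than the OS reports for the process, as observed here, the excess is being held outside of tracked Python objects (for example by a C extension).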

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John A Meinel wrote:
> John A Meinel wrote:
>> Gareth White wrote:
>>> The memory usage *linearly* increases with the number of commits. I just
>>> did a test involving 999 "unchanged" commits to an empty shared
>>> repository and the server memory usage reached 500,008 KB. Since it
>>> started at 14,564 this represents an average of 485 KB increase per
>>> commit.
>>> That was with 2.0.1 on the server and client. With 2.1.0b3 it went from
>>> 13,996 to 279,840 KB after 999 commits (280 KB per commit). This was on
>>> a separate empty shared repository.
>>> Are you able to reproduce this behaviour?
>
>> So it wasn't something that was linear at all. I did run a loop with
>> ~700 --unchanged commits. I watched it a bit at the beginning and it was
>> pretty much fixed memory consumption at about 84MB. When I came back
>> later, it had jumped to 237MB (295MB peak).
>
>> I haven't found any specific cause for this, though I suspect something
>> with the autopacking code. (Given that it was steady most of the time,
>> and then jumped at some point.)
>
>> John
>> =:->
>
> I also wanted to comment that the memory does indeed seem to have
> 'leaked'. At least, a memory dump from meliae only shows about 12MB of
> referenced memory, which is a far cry from the 250MB reported by the OS.
>
> John
> =:->

Further info.

I can reproduce this with a 'bzr://' server running bzr.dev and a
simple Python loop around "wt.commit()" (sketched after this message).
I'm pretty darn sure that I'm seeing the memory increase only when the
server tries to autopack.

If everything is done in the local process, I *don't* see the memory go up.

Specifically, if I hold the lock for all commits and loop over
'wt.commit()', starting at 33.6MB (42MB peak), at the end of 1000
commits I was at 50.7MB peak and 42.5MB resident.

Doing the same thing against the server seemed to result in about 240MB consumed.

Note that I also encountered bug #495588 (commit() in a loop slows down
if you hold a write lock, because get_parent_map() suddenly transmits
large amounts of data on each commit). However, the leak can also be
demonstrated with the command-line client, which opens a new connection
for each commit. So whatever is leaking memory is triggered by the RPCs
themselves, not by a cache being held and then 'lost'.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksimQUACgkQJdeBCYSNAAN2ZwCgw1Kgc3nRtqGuZSyUdyF7T0nF
sIwAnA4p+B7/rO8lxeRe+GSmZCvY44BG
=jfpo
-----END PGP SIGNATURE-----
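
The "simple Python loop around wt.commit()" described above looks roughly like the sketch below (reconstructed from the description, not the actual script). Pointing it at a standalone local branch exercises the in-process case; pointing it at the lightweight checkout of bzr://localhost/testbranch from the reproduction steps exercises the smart-server case.

from bzrlib import trace, workingtree

trace.enable_default_logging()
wt = workingtree.WorkingTree.open('testcheckout')

# Hold the write lock for the whole run, as described above.
wt.lock_write()
try:
    for i in xrange(1000):
        # allow_pointless=True is the library equivalent of --unchanged
        wt.commit(message='test %d' % i, allow_pointless=True)
finally:
    wt.unlock()

trace.debug_memory()   # report WorkingSize/PeakWorking of this process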

Revision history for this message
Gareth White (gwhite-deactivatedaccount) wrote :

So we're in agreement that there appears to be some sort of leak in the server process, but we're seeing different symptoms. I'm definitely seeing a steady increase in memory usage each commit but you're seeing it only during autopack.

Just to rule out other variables, can you confirm you don't see the per-commit leak in "bzr serve" when using the standalone Windows 2.0.1 installer? i.e. http://launchpad.net/bzr/2.0/2.0.1/+download/bzr-2.0.1-1-setup.exe

If it doesn't happen on bzr.dev then it could be a bug that's already been fixed or some other difference in how our environments are configured.

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gareth White wrote:
> So we're in agreement that there appears to be some sort of leak in the
> server process, but we're seeing different symptoms. I'm definitely
> seeing a steady increase in memory usage each commit but you're seeing
> it only during autopack.
>
> Just to rule out other variables, can you confirm you don't see the per-
> commit leak in "bzr serve" when using the standalone Windows 2.0.1
> installer? i.e.
> http://launchpad.net/bzr/2.0/2.0.1/+download/bzr-2.0.1-1-setup.exe
>
> If it doesn't happen on bzr.dev then it could be a bug that's already
> been fixed or some other difference in how our environments are
> configured.
>

Memory fluctuates a bit, but it does steadily go up. I may be seeing it
per commit; it certainly didn't look like that when I was first testing,
but I appear to be seeing it now. Though IIRC the original operating
memory was also significantly higher, so it is possible that there was a
pool being re-used because of something I had done.

John
=;->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksmw8wACgkQJdeBCYSNAAP+AgCeJ3W/StrAnHMCT2t13bDkuZ8q
3DYAn1Eh/mSAUpTtobbQGx+pB9tLyH2l
=hN3z
-----END PGP SIGNATURE-----

Revision history for this message
Gareth White (gwhite-deactivatedaccount) wrote :

Let me know if you need me to provide any more information to make this a confirmed bug. Since the leak is fairly small I'll work around it by restarting the server once a week or so.

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gareth White wrote:
> Let me know if you need me to provide any more information to make this
> a confirmed bug. Since the leak is fairly small I'll work around it by
> restarting the server once a week or so.
>

I can reproduce it fairly easily here. I just haven't been able to track
down a cause.

I also just realized that we could be leaking refcounted objects.
Specifically, if we had the refcount wrong on PyString or PyInt objects,
they wouldn't show up in the garbage collector, but they also wouldn't
get collected if the refcount was too high. I'm going to try poking at
that and see what I get (a sketch of that kind of check follows this
message).

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksr6+wACgkQJdeBCYSNAAMiXgCeIqTcf6t3pnmyvOSFbAEK1eMo
TE0AoLKQOQlvVXGkrAY9E0IkiWpV2EQ2
=i5Oe
-----END PGP SIGNATURE-----
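
The refcount check alluded to above can be approximated with something like the following sketch: it simply watches whether a string passed through the encoder accumulates extra references, which the garbage collector would never report.

import sys
from bzrlib import bencode

# If the C encoder increfs a PyString without a matching decref, the
# object never becomes garbage (so gc/meliae see nothing), but its
# refcount keeps climbing.
payload = 'some payload string'
before = sys.getrefcount(payload)
for i in xrange(1000):
    bencode.bencode([payload])
after = sys.getrefcount(payload)
print('refcount delta: %d' % (after - before))   # > 0 suggests a leaked reference

(As it turned out, the leak found below was a malloc'd C buffer rather than a Python reference, so this particular check comes back clean, but it is a cheap way to rule the hypothesis in or out.)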

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gareth White wrote:
> Let me know if you need me to provide any more information to make this
> a confirmed bug. Since the leak is fairly small I'll work around it by
> restarting the server once a week or so.
>

So I think I've tracked it down to a leak in the bzrlib.bencode.bencode
functionality. Specifically:

>>> from bzrlib import trace, bencode
>>> trace.enable_default_logging()
>>> l = ['a string', {'a': 'dict'}, 234234, 'a really big string'*10000]
>>> trace.debug_memory()
WorkingSize 10636KB PeakWorking 10636KB
>>> for i in xrange(200):
... enc = bencode.bencode(l)
...
>>> trace.debug_memory()
WorkingSize 48996KB PeakWorking 48996KB

So memory went from about 11MB to 49MB.

Note that I don't see the same result for the bdecoder:

>>> for i in xrange(200):
... m = bencode.bdecode(enc)
...
>>> trace.debug_memory()
WorkingSize 49212KB PeakWorking 49212KB

As expected, bdecode only increased the memory by roughly the size of
the new 'm' object.

Will dig further.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksr7b8ACgkQJdeBCYSNAAMC3ACg0ZDcHegIVZykbAIqY/CKo0Gz
dzQAoKcFZbAPvnRKYFy0RyFqEG6L16Xe
=g2RX
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

This is strictly an issue with the bencode.Encoder class. I haven't
found the bug, as I see it using realloc and free where I would expect
to need it. But doing:

from bzrlib import _bencode_pyx as bencode_pyx

bigstr = 'this is a fairly big string' * 10000
e = bencode_pyx.Encoder()
for i in xrange(10):
    e.process(bigstr)   # appends bigstr into the Encoder's internal buffer
del e

Doing this repeatedly, I see memory consumption grow by about 2.7MB each
time, which is exactly the "e.size" value...

I tracked it down. For whatever reason I hadn't noticed it before, but
the Encoder class was using __del__ as its destructor, when it is
supposed to use __dealloc__:
=== modified file 'bzrlib/_bencode_pyx.pyx'
--- bzrlib/_bencode_pyx.pyx 2009-06-05 01:48:32 +0000
+++ bzrlib/_bencode_pyx.pyx 2009-12-18 21:26:50 +0000
@@ -261,7 +261,7 @@
         self.maxsize = maxsize
         self.tail = p
 
-    def __del__(self):
+    def __dealloc__(self):
         free(self.buffer)
         self.buffer = NULL
         self.maxsize = 0

__dealloc__ is the C-level deallocation hook; it always runs and doesn't
interact with the Python garbage collector.

For whatever reason, __del__ never gets called on this class, possibly
because it is an extension type. I'll also note that none of the members
of Encoder are 'object' types, which means it doesn't have to
participate in GC the way other classes do.

Anyway, the above patch fixes the leak when looping over Encoders, and
it appears to fix the server memory leak as well (a quick verification
sketch follows this message).

 status fixcommitted

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksr+uQACgkQJdeBCYSNAANpAACfa43sEiAst4cT3h31dkD8lQy3
i3EAn1fQqBV28TDz1P8CpWakYNyXyOHj
=rxMZ
-----END PGP SIGNATURE-----
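
A quick way to verify the fix, in the same spirit as the session earlier in this thread (a sketch; it assumes a bzrlib build with the patched _bencode_pyx extension):

from bzrlib import trace
from bzrlib import _bencode_pyx as bencode_pyx

trace.enable_default_logging()
bigstr = 'this is a fairly big string' * 10000

trace.debug_memory()                # baseline WorkingSize/PeakWorking
for i in xrange(200):
    e = bencode_pyx.Encoder()
    e.process(bigstr)
    del e                           # with __dealloc__, the C buffer is freed here
trace.debug_memory()                # should be roughly unchanged after the fix

Before the patch the working set grows steadily across this loop; with __dealloc__ in place it should stay essentially flat.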

Changed in bzr:
status: Incomplete → Fix Committed
John A Meinel (jameinel)
Changed in bzr:
assignee: nobody → John A Meinel (jameinel)
milestone: none → 2.1.0b4
status: Fix Committed → Fix Released