Andrew and I poked at this a bit, and found that installing python-gmpy is actually a good way to reduce startup overhead. (Public-key crypto does a lot of large-integer math, and gmpy is about 6x faster than Python's builtin pow().)
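For context, the hot path here is three-argument pow(), i.e. modular exponentiation. A minimal sketch of the comparison, assuming gmpy2 (the maintained successor to python-gmpy) as the GMP binding; the operand sizes and timing harness are hypothetical, not the actual Launchpad benchmark:

```python
import time

def timed_modexp(pow_fn, base, exp, mod, reps=10):
    """Run reps modular exponentiations; return (result, elapsed seconds)."""
    start = time.time()
    for _ in range(reps):
        result = pow_fn(base, exp, mod)
    return int(result), time.time() - start

# Hypothetical RSA-like operands: common public exponent, ~1024-bit modulus.
base, exp, mod = 0x10001, 2**1023 + 3, 2**1024 - 105

builtin_result, builtin_time = timed_modexp(pow, base, exp, mod)

try:
    import gmpy2  # GMP-backed; successor to the python-gmpy mentioned above
    gmpy_result, gmpy_time = timed_modexp(gmpy2.powmod, base, exp, mod)
    assert gmpy_result == builtin_result  # same math, faster GMP code path
except ImportError:
    pass  # without gmpy2 installed, only the builtin path runs
```

The point is that gmpy is a drop-in speedup: the results are bit-identical, only the underlying bignum implementation changes.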
It also turns out that the perceived scaling isn't as bad on production as it is on the other machines I tested. Specifically, on my local single-CPU instance:
--load=1   time 0.28s
--load=10  time 2.80s
(about linear scaling: 10x the load takes 10x the time)
On qa-staging I also see
5.4s => 13.5s
going from a load of 1 to a load of 10. So a 2.5x scaling for a 10x load factor.
On production, though, I see:
3.0s => 4.0s
So only about a 33% increase at a 10x load factor.
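The arithmetic behind the three comparisons above, as a quick sketch (times in seconds at --load=1 vs --load=10, taken from the measurements quoted):

```python
# (time at load 1, time at load 10) per machine, from the numbers above.
measurements = {
    "local (single CPU)": (0.28, 2.80),
    "qa-staging": (5.4, 13.5),
    "production": (3.0, 4.0),
}

for host, (t1, t10) in measurements.items():
    ratio = t10 / t1                      # slowdown factor at 10x load
    increase = (t10 - t1) / t1 * 100      # same thing as a percentage
    print("%-20s %.1fx time for 10x load (+%d%%)" % (host, ratio, increase))
```

This makes the trend explicit: local is fully serialized (10x), qa-staging partially (2.5x), and production barely notices (about a third more time).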
My guess is that some of the scaling comes from being CPU-bound on my local machine. The fact that bazaar.launchpad.net has more and better CPUs means it can finish one connection fast enough to avoid it affecting another connection.
Note that the load dependence is also much more apparent when the lp-forking-service is active, because the total run time becomes much lower, so a 1s overhead is a much larger fraction of the total.