Comment 17 for bug 423804

John A Meinel (jameinel) wrote:

So I did a bit more debugging, since I was already looking at this code while trying to address the os.write issue.

Anyway, it looks like the ssh process allows us about 1.6MB of in-transit data before it starts blocking.

This was detected by having a canned request return a lot of bulk data, and then having a trivial client that just reads the response 64kB at a time.
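The client side was roughly this shape. This is only a sketch of the test harness, not the actual code: the host, the remote command line, and the reporting format are placeholders, and sending the canned request is elided.

    import subprocess
    import sys
    import time

    CHUNK = 64 * 1024

    # Spawn ssh running the remote smart server.  Host and command are
    # placeholders; sending the canned request itself is omitted here.
    proc = subprocess.Popen(
        ['ssh', 'example-host', 'bzr', 'serve', '--inet'],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    start = time.time()
    total = 0
    while True:
        data = proc.stdout.read(CHUNK)
        if not data:
            break
        total += len(data)
        # Report each chunk as it arrives, for comparison against the
        # server-side write log.
        sys.stderr.write('read %d bytes (%d total) at %.3fs\n'
                         % (len(data), total, time.time() - start))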

My network is still a bit strange, but I can see that the server side claims to have written as many as 30 64kB chunks that the client has not yet reported reading. This was measured by breaking the large response up into 64kB chunks, calling out.write() on each one and timing how long it took, while the local client reported each time it read 64kB.
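The server-side instrumentation was roughly like this (again a sketch, not the actual patch; timed_write is a made-up name, though the log format matches what I quote below):

    import sys
    import time

    CHUNK = 64 * 1024

    def timed_write(out, data):
        # Send the response in 64kB chunks, timing each write.  The
        # flush is inside the timed region so a buffered stream can't
        # hide the blocking.  The stderr messages ride back over ssh's
        # stderr channel and show up interleaved with the client's own
        # reporting.
        for pos in range(0, len(data), CHUNK):
            chunk = data[pos:pos + CHUNK]
            start = time.time()
            out.write(chunk)
            out.flush()
            sys.stderr.write('wrote %d sub bytes in %.3fs\n'
                             % (len(chunk), time.time() - start))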

This used ssh's stderr multiplexing to get the server's messages onto stderr locally. I'm sure the output of the two processes isn't perfectly aligned, but it should be at least close.

Almost all of the write calls look like:

    wrote 65536 sub bytes in 0.000s

i.e. a write takes <= 3ms. Every so often, though, you get:

    wrote 65536 sub bytes in 2.143s

I assume this means that ssh is blocking, waiting for some of its buffer to clear up.

This means we get something less than 2MB of buffering for bzr+ssh (30 x 64kB is about 1.9MB). (I assume this would be different with Launchpad's Twisted implementation.) It may also depend on the window size and other settings.

Though according to Wireshark my peak "Win" is only about 260,000 bytes. (Wireshark also tells me I get a fair number of retransmissions and the like, which I assume is down to something in my network, not anything bzr could do anything about.)

Anyway, it would appear that there isn't much to be done for bzr+ssh, since as near as I can tell ssh is already handling the buffering for us. I do wonder whether it would help to decrease our 1MB internal buffer, so that we aren't so close to the limit of the ssh subprocess. Conceptually, that would let us multiplex a bit more: instead of letting the pipe drain while we build up a huge chunk that then blocks on write, we would keep feeding it steadily. Something like the sketch below.
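A minimal sketch of the idea, with a hypothetical class name and an arbitrary 256kB threshold (nothing here is existing bzrlib API):

    class SmallBufferWriter(object):
        """Hypothetical writer that flushes at a smaller threshold.

        The idea is to keep our own buffering well under ssh's ~1.6MB
        in-transit allowance, so the pipe is fed steadily rather than
        going idle and then being handed one huge chunk that blocks.
        """

        def __init__(self, out, max_buffered=256 * 1024):
            self.out = out
            self.max_buffered = max_buffered
            self._pending = []
            self._pending_len = 0

        def write(self, data):
            self._pending.append(data)
            self._pending_len += len(data)
            if self._pending_len >= self.max_buffered:
                self.flush()

        def flush(self):
            # Hand everything accumulated so far to the underlying
            # stream (the ssh subprocess's stdin) in one write.
            if self._pending:
                self.out.write(b''.join(self._pending))
                del self._pending[:]
                self._pending_len = 0
            self.out.flush()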

Something to consider, at least.