2011-08-10 04:54:32 |
Martin Pool |
description |
Recent haproxy rollout on launchpad has illuminated a bzrlib bug:
20:29 <lifeless> mtaylor: wgrant: so, this is a bzrlib bug.
20:29 <lifeless> not the holding open
20:29 <lifeless> the handling of the disconnects.
20:29 <lifeless> transports are defined stateless.
20:29 <lifeless> so are RPC calls.
20:31 <lifeless> the haproxy stuff is showing this up very visibly, but the bug pre-existed - bzr didn't send keepalives, and if ssh wasn't configured to do so it could naturally happen anyhow
The following traceback shows the carnage which happens when tarmac holds open its connection around the post-merge hook:
Traceback (most recent call last):
File "/usr/bin/tarmac", line 6, in <module>
main()
File "/usr/lib/pymodules/python2.6/tarmac/bin/__init__.py", line 30, in main
registry.run(args)
File "/usr/lib/pymodules/python2.6/tarmac/bin/registry.py", line 60, in run
self._run(args)
File "/usr/lib/pymodules/python2.6/tarmac/bin/registry.py", line 48, in _run
run_bzr(args)
File "/usr/lib/pymodules/python2.6/bzrlib/commands.py", line 1124, in run_bzr
ret = run(*run_argv)
File "/usr/lib/pymodules/python2.6/bzrlib/commands.py", line 689, in run_argv_aliases
return self.run(**all_cmd_args)
File "/usr/lib/pymodules/python2.6/bzrlib/commands.py", line 711, in run
return self._operation.run_simple(*args, **kwargs)
File "/usr/lib/pymodules/python2.6/bzrlib/cleanup.py", line 135, in run_simple
self.cleanups, self.func, *args, **kwargs)
File "/usr/lib/pymodules/python2.6/bzrlib/cleanup.py", line 165, in _do_with_cleanups
result = func(*args, **kwargs)
File "/usr/lib/pymodules/python2.6/tarmac/bin/commands.py", line 356, in run
self._do_merges(branch_url)
File "/usr/lib/pymodules/python2.6/tarmac/bin/commands.py", line 274, in _do_merges
authors=source.authors,
File "/usr/lib/pymodules/python2.6/tarmac/branch.py", line 163, in authors
self.bzr_branch.lock_read()
File "/usr/lib/pymodules/python2.6/bzrlib/remote.py", line 2430, in lock_read
self.repository.lock_read()
File "/usr/lib/pymodules/python2.6/bzrlib/remote.py", line 1021, in lock_read
self._real_repository.lock_read()
File "/usr/lib/pymodules/python2.6/bzrlib/repofmt/pack_repo.py", line 2402, in lock_read
repo.lock_read()
File "/usr/lib/pymodules/python2.6/bzrlib/remote.py", line 1021, in lock_read
self._real_repository.lock_read()
File "/usr/lib/pymodules/python2.6/bzrlib/repofmt/pack_repo.py", line 2403, in lock_read
self._refresh_data()
File "/usr/lib/pymodules/python2.6/bzrlib/repofmt/pack_repo.py", line 2327, in _refresh_data
self._pack_collection.reload_pack_names()
File "/usr/lib/pymodules/python2.6/bzrlib/repofmt/pack_repo.py", line 2040, in reload_pack_names
orig_disk_nodes) = self._diff_pack_names()
File "/usr/lib/pymodules/python2.6/bzrlib/repofmt/pack_repo.py", line 1906, in _diff_pack_names
for index, key, value in self._iter_disk_pack_index():
File "/usr/lib/pymodules/python2.6/bzrlib/btree_index.py", line 996, in iter_all_entries
if not self.key_count():
File "/usr/lib/pymodules/python2.6/bzrlib/btree_index.py", line 1443, in key_count
self._get_root_node()
File "/usr/lib/pymodules/python2.6/bzrlib/btree_index.py", line 940, in _get_root_node
self._get_internal_nodes([0])
File "/usr/lib/pymodules/python2.6/bzrlib/btree_index.py", line 965, in _get_internal_nodes
return self._get_nodes(self._internal_node_cache, node_indexes)
File "/usr/lib/pymodules/python2.6/bzrlib/btree_index.py", line 957, in _get_nodes
found.update(self._get_and_cache_nodes(needed))
File "/usr/lib/pymodules/python2.6/bzrlib/btree_index.py", line 734, in _get_and_cache_nodes
for node_pos, node in self._read_nodes(sorted(nodes)):
File "/usr/lib/pymodules/python2.6/bzrlib/btree_index.py", line 1530, in _read_nodes
bytes = self._transport.get_bytes(self._name)
File "/usr/lib/pymodules/python2.6/bzrlib/transport/remote.py", line 226, in get_bytes
resp, response_handler = self._client.call_expecting_body('get', remote)
File "/usr/lib/pymodules/python2.6/bzrlib/smart/client.py", line 145, in call_expecting_body
method, args, expect_response_body=True)
File "/usr/lib/pymodules/python2.6/bzrlib/smart/client.py", line 79, in _call_and_read_response
readv_body=readv_body, body_stream=body_stream)
File "/usr/lib/pymodules/python2.6/bzrlib/smart/client.py", line 63, in _send_request
encoder.call(method, *args)
File "/usr/lib/pymodules/python2.6/bzrlib/smart/protocol.py", line 1309, in call
self._write_end()
File "/usr/lib/pymodules/python2.6/bzrlib/smart/protocol.py", line 1133, in _write_end
self.flush()
File "/usr/lib/pymodules/python2.6/bzrlib/smart/protocol.py", line 1099, in flush
self._real_write_func(''.join(self._buf))
File "/usr/lib/pymodules/python2.6/bzrlib/smart/medium.py", line 395, in accept_bytes
self._accept_bytes(bytes)
File "/usr/lib/pymodules/python2.6/bzrlib/smart/medium.py", line 977, in _accept_bytes
self._medium._accept_bytes(bytes)
File "/usr/lib/pymodules/python2.6/bzrlib/smart/medium.py", line 794, in _accept_bytes
self._real_medium.accept_bytes(bytes)
File "/usr/lib/pymodules/python2.6/bzrlib/smart/medium.py", line 688, in accept_bytes
self._accept_bytes(bytes)
File "/usr/lib/pymodules/python2.6/bzrlib/smart/medium.py", line 861, in _accept_bytes
osutils.send_all(self._socket, bytes, self._report_activity)
File "/usr/lib/pymodules/python2.6/bzrlib/osutils.py", line 2075, in send_all
sent = sock.send(buffer(bytes, sent_total, MAX_SOCKET_CHUNK))
socket.error: [Errno 32] Broken pipe |
implementation plan:
* catch the socket.error and change it into a higher-level "connection closed" error
* this might need care to cover both internal/paramiko and external ssh clients; external clients might give different errors on Windows
* in the client per-call method, catch this error if it happens while sending the initial request, log a message, and build a new medium(?) object
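A minimal sketch of the plan above, written as standalone Python 3 rather than against the real bzrlib code. The names ConnectionClosed, RetryingClient and medium_factory are hypothetical and only illustrate the shape of the change; the set of errno values treated as a disconnect is also an assumption, and external ssh clients or Windows may report other errors.

```python
import errno
import socket


class ConnectionClosed(Exception):
    """Hypothetical higher-level error; not an existing bzrlib class."""


# errno values treated as "the peer went away". This set is an assumption.
_CLOSED_ERRNOS = {errno.EPIPE, errno.ECONNRESET, errno.ECONNABORTED}


def send_all(sock, data):
    """Send all of data, turning low-level disconnects into ConnectionClosed."""
    view = memoryview(data)
    sent_total = 0
    try:
        while sent_total < len(view):
            sent_total += sock.send(view[sent_total:])
    except socket.error as e:
        if e.errno in _CLOSED_ERRNOS:
            raise ConnectionClosed(str(e))
        raise


class RetryingClient(object):
    """Hypothetical client wrapper that reconnects once per call."""

    def __init__(self, medium_factory, log=print):
        # medium_factory is any callable returning an object with
        # an accept_bytes(bytes) method.
        self._medium_factory = medium_factory
        self._medium = medium_factory()
        self._log = log

    def call(self, request_bytes):
        """Send the initial request, rebuilding the medium on disconnect."""
        try:
            self._medium.accept_bytes(request_bytes)
        except ConnectionClosed as e:
            self._log("connection closed while sending request; "
                      "reconnecting: %s" % e)
            self._medium = self._medium_factory()
            # A second failure propagates to the caller.
            self._medium.accept_bytes(request_bytes)
        return self._medium
```

The retry covers only the send of the initial request, where nothing has been received yet, so repeating the call should be safe if RPC calls are stateless as described in the IRC excerpt.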
To test it automatically:
* make a special medium that raises this error through monkeypatching
* use existing ssh tests that start a real server to check that the right error is raised when the
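One way the monkeypatched medium could look, as a hedged sketch only: SimpleMedium is a stand-in class invented for this example (not a bzrlib class), and the helper names are hypothetical. It uses unittest.mock.patch.object to make the first accept_bytes call fail with the same Errno 32 seen in the traceback, then lets later calls through.

```python
import errno
import socket
from unittest import mock


class SimpleMedium(object):
    """Stand-in for a client medium; records the bytes it is given."""

    def __init__(self):
        self.sent = []

    def accept_bytes(self, data):
        self.sent.append(data)


def fail_once_with_broken_pipe(real_method):
    """Wrap a method so its first call raises EPIPE, later calls succeed."""
    state = {"failed": False}

    def wrapper(self, data):
        if not state["failed"]:
            state["failed"] = True
            raise socket.error(errno.EPIPE, "Broken pipe")
        return real_method(self, data)

    return wrapper


def run_with_disconnect(func):
    """Run func() while SimpleMedium.accept_bytes is monkeypatched."""
    patched = fail_once_with_broken_pipe(SimpleMedium.accept_bytes)
    with mock.patch.object(SimpleMedium, "accept_bytes", patched):
        return func()
```

A test would call the client code under run_with_disconnect and assert that the expected higher-level error is raised, or that the request is resent after a reconnect.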
Interactive tests are probably a good idea, considering the interaction with external systems:
* start a bzr ssh client from a python shell
* kill the server process
* send a new request and check that it logs and reconnects |
|