bzr hangs (deadlock) on push/pull/merge
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Bazaar | Confirmed | Low | Unassigned |
Bug Description
(See also bug 405251 comment 34 (https:/
Interaction between bzr 1.16 and 1.17 often causes deadlocked hangs. In the report referenced above it happened when I pushed from a bzr 1.17 client to a 1.16.1 server; I now have the same problem between a 1.16 client and a 1.17 server doing a merge.
When it hangs, attaching strace to the client shows it blocked forever in recv. The server just polls forever and does nothing else. In the earlier report I tried different protocols to get the push to complete and discovered that bzr+ssh simply starts another server remotely, one that does not listen on a socket. In that case I see the client hanging in recv() and the server hanging in recv() as well.
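The symptom described above (both ends blocked in recv()) can be illustrated with a minimal sketch. This is not the bzr smart protocol and says nothing about the root cause; it only shows what it looks like when each peer waits for the other to send first. The function name `both_sides_wait` and the use of `socket.socketpair()` with a timeout are assumptions made for the illustration.

```python
import socket


def both_sides_wait(timeout=0.2):
    """Return the peers that timed out while waiting for the other to send.

    Hypothetical illustration only: neither side ever sends anything, so
    each recv() could never complete. The timeout turns the would-be
    endless hang into an observable error.
    """
    client, server = socket.socketpair()
    client.settimeout(timeout)
    server.settimeout(timeout)
    stuck = []
    try:
        for name, sock in (("client", client), ("server", server)):
            try:
                sock.recv(1)  # no data was ever written to the peer
            except socket.timeout:
                stuck.append(name)
    finally:
        client.close()
        server.close()
    return stuck


if __name__ == "__main__":
    print(both_sides_wait())
```

Without the timeout, each recv() would block indefinitely, which matches what strace shows on the hanging client.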
I did a kill -QUIT on the hanging client trying to merge and issued a 'bt':
jo@seahorse:
** SIGQUIT received, entering debugger7KB 1KB/s | Fetching
revisions:Inserting stream
** Type 'c' to continue or 'q' to stop the process
** Or SIGQUIT again to quit (and possibly dump core)
> /usr/lib/
-> signal.
(Pdb) bt
/usr/
-> exit_val = bzrlib.
/usr/
-> ret = run_bzr_
/usr/
-> return exception_
/usr/
-> return the_callable(*args, **kwargs)
/usr/
-> ret = run(*run_argv)
/usr/
-> return self.run(
/usr/
-> location, revision, remember, possible_
/usr/
-> other_revision_id, base_revision_id, other_branch, base_branch)
/usr/
-> merger.
/usr/
-> self._maybe_
/usr/
-> target.
/usr/
-> result = unbound(self, *args, **kwargs)
/usr/
-> pb=pb)
/usr/
-> find_ghosts=
/usr/
-> result = unbound(self, *args, **kwargs)
/usr/
-> pb=pb, find_ghosts=
/usr/
-> self.__fetch()
/usr/
-> self._fetch_
/usr/
-> stream, from_format, [])
/usr/
-> return self._locked_
/usr/
-> for substream_type, substream in stream:
/usr/
-> for kind, stream in self._get_
/usr/
-> for bytes in byte_stream:
/usr/
-> self._read_more()
/usr/
-> bytes = self._medium_
/usr/
-> return self._read_
/usr/
-> return self._medium.
/usr/
-> return self._read_
/usr/
-> self._socket.recv, count, self._report_
/usr/
-> bytes = osutils.
/usr/
-> return f(*a, **kw)
> /usr/lib/
-> signal.
(Pdb)
Doing the same on the server killed the server.
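The traces in this report show that bzr registers a SIGQUIT handler in bzrlib/breakin.py, which is why kill -QUIT lands in the debugger. A rough sketch of the same idea for any long-running Python process follows. It is not the bzrlib code: the names `captured`, `_dump_stack` and `install_sigquit_dump` are hypothetical, and the handler records a stack trace instead of starting pdb. SIGQUIT is assumed to be available (Unix-like systems).

```python
import os
import signal
import traceback

# Stack traces recorded by the handler (hypothetical helper state).
captured = []


def _dump_stack(signum, frame):
    """Record where the main thread was when the signal arrived."""
    captured.append("".join(traceback.format_stack(frame)))


def install_sigquit_dump():
    """Replace the default SIGQUIT action (terminate) with a stack dump."""
    signal.signal(signal.SIGQUIT, _dump_stack)


if __name__ == "__main__":
    install_sigquit_dump()
    os.kill(os.getpid(), signal.SIGQUIT)  # same effect as `kill -QUIT <pid>`
    print(captured[-1])
```

A process that has not installed such a handler is terminated by SIGQUIT.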
Changed in bzr:
status: New → Confirmed
Changed in bzr:
importance: Undecided → Low
tags: added: hpss
tags: added: check-for-breezy
!!!!!!!!!!!!!
It now hangs between 2 1.17 versions too!
!!!!!!!!!!!!!
The server bt after a kill -QUIT:
lib/python2.4/site-packages/bzrlib/breakin.py(33)_debug()
signal(signal.SIGQUIT, _debug)
local/bin/bzr(142)?()
commands.main()
local/lib/python2.4/site-packages/bzrlib/commands.py(1108)main()
catch_errors(argv)
local/lib/python2.4/site-packages/bzrlib/commands.py(1120)run_bzr_catch_errors()
to_return_code(run_bzr, argv)
local/lib/python2.4/site-packages/bzrlib/commands.py(835)exception_to_return_code()
local/lib/python2.4/site-packages/bzrlib/commands.py(1030)run_bzr()
local/lib/python2.4/site-packages/bzrlib/commands.py(647)run_argv_aliases()
**all_cmd_args)
local/lib/python2.4/site-packages/bzrlib/builtins.py(4725)run()
local/lib/python2.4/site-packages/bzrlib/smart/server.py(342)serve_bzr()
serve()
local/lib/python2.4/site-packages/bzrlib/smart/server.py(121)serve()
socket.accept()
lib/python2.4/site-packages/bzrlib/breakin.py(33)_debug()
signal(signal.SIGQUIT, _debug)
-bash-3.00$ bzr serve --directory /media/d02/bzr --allow-writes
listening on port: 4155
** SIGQUIT received, entering debugger
** Type 'c' to continue or 'q' to stop the process
** Or SIGQUIT again to quit (and possibly dump core)
> /usr/local/
-> signal.
(Pdb) bt
/usr/
-> exit_val = bzrlib.
/usr/
-> ret = run_bzr_
/usr/
-> return exception_
/usr/
-> return the_callable(*args, **kwargs)
/usr/
-> ret = run(*run_argv)
/usr/
-> return self.run(
/usr/
-> protocol(transport, host, port, inet)
/usr/
-> smart_server.
/usr/
-> conn, client_addr = self._server_
> /usr/local/
-> signal.
(Pdb)
The kill -QUIT with bt on the client at this point:
~/bzr/vp-trunk$ bzr merge bzr://bzr.hosts.itris.nl/vp-trunk
python2.5/site-packages/bzrlib/breakin.py(33)_debug()
signal(signal.SIGQUIT, _debug)
bin/bzr(142)<module>()
commands.main()
lib/python2.5/site-packages/bzrlib/commands.py(1108)main()
catch_errors(argv)
lib/python2.5/site-packages/bzrlib/commands.py(1120)run_bzr_catch_errors()
to_return_code(run_bzr, argv)
lib/python2.5/site-packages/bzrlib/commands.py(835)exception_to_return_code()
lib/python2.5/site-packages/bzrlib/commands.py(1030)run_bzr()
lib/python2.5/site-packages/bzrlib/commands.py(647)run_argv_aliases()
**all_cmd_args)
lib/python2.5/site-packages/bzrlib/builtins.py(3683)run()
transports, pb)
lib/python2.5/site-packages/bzrlib/builtins.py(3790)_get_merger_from_branch()
lib/python2.5/site-packages/bzrlib/merge.py(204)from_revision_ids()
set_other_revision(other, other_branch)
lib/python2.5/site-packages/bzrlib/merge.py(343)set_other_revision()
fetch(other_branch, self.this_branch, self.other_rev_id)
lib/python2. ..
jo@seahorse:
** SIGQUIT received, entering debuggerB/s | Fetching revisions:Inserting
stream
** Type 'c' to continue or 'q' to stop the process
** Or SIGQUIT again to quit (and possibly dump core)
> /usr/lib/
-> signal.
(Pdb) bt
/usr/
-> exit_val = bzrlib.
/usr/
-> ret = run_bzr_
/usr/
-> return exception_
/usr/
-> return the_callable(*args, **kwargs)
/usr/
-> ret = run(*run_argv)
/usr/
-> return self.run(
/usr/
-> location, revision, remember, possible_
/usr/
-> other_revision_id, base_revision_id, other_branch, base_branch)
/usr/
-> merger.
/usr/
-> self._maybe_
/usr/