connection closed while executing maas command

Bug #1965554 reported by Marian Gasparovic
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MAAS
Triaged
Medium
Unassigned
3.3
Triaged
Medium
Unassigned

Bug Description

We found this during one of SQA test runs, it is a one time occurrence, we have not encountered it before
snap 3.1/stable

2022-03-18-12:05:15 root DEBUG [localhost]: maas root tag update-nodes juju add=rg8f6d
2022-03-18-12:05:19 foundationcloudengine.layers.configuremaas INFO Creating vault-2 in noma
2022-03-18-12:05:19 root DEBUG [localhost]: maas root vm-host compose 2 hostname=vault-2 cores=2 memory=4096 storage=40.0 zone=3 'interfaces=eth0:space=oam-space;eth1:space=internal-space'
2022-03-18-12:05:26 root ERROR [localhost] Command failed: maas root vm-host compose 2 hostname=vault-2 cores=2 memory=4096 storage=40.0 zone=3 'interfaces=eth0:space=oam-space;eth1:space=internal-space'
2022-03-18-12:05:26 root ERROR [localhost] STDOUT follows:
Connection was closed cleanly.
2022-03-18-12:05:26 root ERROR [localhost] STDERR follows:
b''

Artifacts
https://oil-jenkins.canonical.com/artifacts/245e2d6e-6723-42a2-8b7b-e2198054aaa8/index.html

Revision history for this message
Alberto Donato (ack) wrote :
Download full text (3.8 KiB)

regiond on the second node has the corresponding traceback, and just before it it seems that there were disconnection from the rackd:

2022-03-18 12:05:26 maasserver.rpc.regionservice: [info] Rack controller 'gbmc77' disconnected.
2022-03-18 12:05:26 RegionServer,1,::ffff:10.245.222.214: [info] RegionServer connection lost (HOST:IPv6Address(type='TCP', host='::ffff:10.245.222.213', port=5250, flowInfo=0, scopeID=0) PEER:IPv6Address(type='TCP', host='::ffff:10.245.222.214', port=40408, flowInfo=0, scopeID=0))
2022-03-18 12:05:26 maasserver.rpc.regionservice: [info] Rack controller 'gbmc77' disconnected.
2022-03-18 12:05:26 RegionServer,1,::ffff:10.245.222.214: [info] RegionServer connection lost (HOST:IPv6Address(type='TCP', host='::ffff:10.245.222.213', port=5251, flowInfo=0, scopeID=0) PEER:IPv6Address(type='TCP', host='::ffff:10.245.222.214', port=40690, flowInfo=0, scopeID=0))
2022-03-18 12:05:26 maasserver.rpc.regionservice: [info] Rack controller 'gbmc77' disconnected.
2022-03-18 12:05:26 RegionServer,1,::ffff:192.168.33.158: [info] RegionServer connection lost (HOST:IPv6Address(type='TCP', host='::ffff:192.168.33.157', port=5253, flowInfo=0, scopeID=0) PEER:IPv6Address(type='TCP', host='::ffff:192.168.33.158', port=40514, flowInfo=0, scopeID=0))
2022-03-18 12:05:26 maasserver.rpc.regionservice: [info] Rack controller 'gbmc77' disconnected.
2022-03-18 12:05:26 RegionServer,1,::ffff:10.246.64.7: [info] RegionServer connection lost (HOST:IPv6Address(type='TCP', host='::ffff:10.246.64.6', port=5252, flowInfo=0, scopeID=0) PEER:IPv6Address(type='TCP', host='::ffff:10.246.64.7', port=48690, flowInfo=0, scopeID=0))
2022-03-18 12:05:26 maasserver: [error] ################################ Exception: Connection was closed cleanly. ################################
2022-03-18 12:05:26 maasserver: [error] Traceback (most recent call last):
  File "/snap/maas/18199/usr/lib/python3/dist-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/snap/maas/18199/lib/python3.8/site-packages/maasserver/utils/views.py", line 284, in view_atomic_with_post_commit_savepoint
    return view_atomic(*args, **kwargs)
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/snap/maas/18199/lib/python3.8/site-packages/maasserver/api/support.py", line 56, in __call__
    response = super().__call__(request, *args, **kwargs)
  File "/snap/maas/18199/usr/lib/python3/dist-packages/django/views/decorators/vary.py", line 20, in inner_func
    response = func(*args, **kwargs)
  File "/snap/maas/18199/usr/lib/python3.8/dist-packages/piston3/resource.py", line 197, in __call__
    result = self.error_handler(e, request, meth, em_format)
  File "/snap/maas/18199/usr/lib/python3.8/dist-packages/piston3/resource.py", line 195, in __call__
    result = meth(request, *args, **kwargs)
  File "/snap/maas/18199/lib/python3.8/site-packages/maasserver/api/support.py", line 308, in dispatch
    return function(self, request, *args, **kwargs)
  File "/snap/maas/18199/lib/python3.8/site-packages/maasserver/api/support.py", line 158, in wrapper
    retu...

Read more...

Revision history for this message
Alberto Donato (ack) wrote :

We should at least investigate adding better debugging information, since the current traceback doesn't provide a way to see where the error originated.

Changed in maas:
status: New → Triaged
importance: Undecided → Medium
milestone: none → next
Revision history for this message
Jerzy Husakowski (jhusakowski) wrote :

We will improve diagnostic logging to capture the reason for disconnection.

Changed in maas:
milestone: next → 3.3.0
Changed in maas:
milestone: 3.3.0 → 3.4.0
Revision history for this message
Marian Gasparovic (marosg) wrote :
Revision history for this message
Marian Gasparovic (marosg) wrote :
Revision history for this message
Jeffrey Chang (modern911) wrote :

Solutions QA hit this more frequently in last 2 weeks like 15 failures, mostly on MAAS 3.3.3 and also during 3.4.0 testing.
Overall failure rate is ~3.5% on 3.3.3, but the highest among all MAAS bugs.

All occurrence could be found from https://solutions.qa.canonical.com/v2/bugs/1965554

Revision history for this message
Jeffrey Chang (modern911) wrote :
Download full text (4.1 KiB)

Post some debug log from 3.4.0-beta3, from https://oil-jenkins.canonical.com/artifacts/6616e90f-975d-4443-9fe7-bbc27fa62bb8/generated/generated/maas/logs-2023-06-15-14.53.09.tgz.
Very close to what we have in #1.

Jun 15 14:52:21 sunset sh[1523069]: 2023-06-15 14:52:21 RegionServer,29,::ffff:10.245.222.215: [info] RegionServer connection lost (HOST:IPv6Address(type='TCP', host='::ffff:10.245.222.204', port=5250, flowInfo=0, scopeID=0) PEER:IPv6Address(type='TCP', host='::ffff:10.245.222.215', port=45764, flowInfo=0, scopeID=0))
Jun 15 14:52:21 sunset sh[1523069]: 2023-06-15 14:52:21 maasserver.ipc: [info] Worker pid:1523454 lost RPC connection to ('tmhdw8', '10.245.222.204', 5250).
Jun 15 14:52:21 sunset sh[1523069]: 2023-06-15 14:52:21 RegionServer,30,::ffff:10.245.222.215: [info] RegionServer connection lost (HOST:IPv6Address(type='TCP', host='::ffff:10.245.222.204', port=5250, flowInfo=0, scopeID=0) PEER:IPv6Address(type='TCP', host='::ffff:10.245.222.215', port=45766, flowInfo=0, scopeID=0))
Jun 15 14:52:21 sunset sh[1523069]: 2023-06-15 14:52:21 maasserver: [error] ################################ Exception: Connection was closed cleanly. ################################
Jun 15 14:52:21 sunset sh[1523069]: 2023-06-15 14:52:21 maasserver: [error] Traceback (most recent call last):
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3/dist-packages/django/core/handlers/base.py", line 181, in _get_response
Jun 15 14:52:21 sunset sh[1523069]: response = wrapped_callback(request, *callback_args, **callback_kwargs)
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3/dist-packages/maasserver/utils/views.py", line 293, in view_atomic_with_post_commit_savepoint
Jun 15 14:52:21 sunset sh[1523069]: return view_atomic(*args, **kwargs)
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3.10/contextlib.py", line 79, in inner
Jun 15 14:52:21 sunset sh[1523069]: return func(*args, **kwds)
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3/dist-packages/maasserver/api/support.py", line 62, in __call__
Jun 15 14:52:21 sunset sh[1523069]: response = super().__call__(request, *args, **kwargs)
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3/dist-packages/django/views/decorators/vary.py", line 20, in inner_func
Jun 15 14:52:21 sunset sh[1523069]: response = func(*args, **kwargs)
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3.10/dist-packages/piston3/resource.py", line 197, in __call__
Jun 15 14:52:21 sunset sh[1523069]: result = self.error_handler(e, request, meth, em_format)
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3.10/dist-packages/piston3/resource.py", line 195, in __call__
Jun 15 14:52:21 sunset sh[1523069]: result = meth(request, *args, **kwargs)
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3/dist-packages/maasserver/api/support.py", line 371, in dispatch
Jun 15 14:52:21 sunset sh[1523069]: return function(self, request, *args, **kwargs)
Jun 15 14:52:21 sunset sh[1523069]: File "/usr/lib/python3/dist-packages/maasserver/api/support.py", line 217, in wrapper
Jun 15 14:52:21 sunset sh[1523069]: return func(self, request, *args...

Read more...

Alberto Donato (ack)
Changed in maas:
milestone: 3.4.0 → 3.4.x
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.