SSL SYSCALL error: EOF detected

Bug #2038346 reported by Moises Emilio Benzan Mora
This bug affects 1 person
Affects: MAAS
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

While running MAAS commands on a deployment, we get the following error with no other indication of something being wrong.

2023-09-30-02:40:07 root DEBUG [localhost]: maas root tags create name=microk8s
2023-09-30-02:40:51 root DEBUG [localhost]: maas root tag update-nodes microk8s add=wpgrx6
2023-09-30-02:40:56 root ERROR [localhost] Command failed: maas root tag update-nodes microk8s add=wpgrx6
2023-09-30-02:40:56 root ERROR 1[localhost] STDOUT follows:
SSL SYSCALL error: EOF detected

On test run: https://solutions.qa.canonical.com/testruns/78a51246-d073-4b6e-b9b3-b334bfd3ffc4/

Artifacts: https://oil-jenkins.canonical.com/artifacts/78a51246-d073-4b6e-b9b3-b334bfd3ffc4/index.html

MAAS Specific Artifacts: https://oil-jenkins.canonical.com/artifacts/78a51246-d073-4b6e-b9b3-b334bfd3ffc4/generated/generated/maas/logs-2023-09-30-02.42.18.tgz

Running on MAAS Version: 3.4.0~rc2-14314-g.53d5ac1f4

Alberto Donato (ack)
summary: - [3.4] SSL SYSCALL error: EOF detected
+ SSL SYSCALL error: EOF detected
Alberto Donato (ack) wrote :

Right before that error, similar errors are reported in regiond.log on one unit (.32) due to disconnection from Postgres:

2023-09-30 02:40:37 provisioningserver.rpc.common: [critical] Unhandled failure dispatching AMP command. This is probably a bug. Please ensure that this error is handled within application code or declared in the signature of the b'GetControllerType' command. [infra3:pid=2038237:cmd=GetControllerType:ask=cf]
 Traceback (most recent call last):
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/internet/asyncioreactor.py", line 271, in _onTimer
     self.runUntilCurrent()
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/internet/base.py", line 991, in runUntilCurrent
     call.func(*call.args, **call.kw)
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 700, in errback
     self._startRunCallbacks(fail)
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 763, in _startRunCallbacks
     self._runCallbacks()
 --- <exception caught here> ---
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
     current.result = callback( # type: ignore[misc]
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/protocols/amp.py", line 1138, in checkKnownErrors
     key = error.trap(*command.allErrors)
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/python/failure.py", line 451, in trap
     self.raiseException()
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
     raise self.value.with_traceback(self.tb)
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 244, in inContext
     result = inContext.theWork() # type: ignore[attr-defined]
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 260, in <lambda>
     inContext.theWork = lambda: context.call( # type: ignore[attr-defined]
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/python/context.py", line 117, in callWithContext
     return self.currentContext().callWithContext(ctx, func, *args, **kw)
   File "/snap/maas/30885/usr/lib/python3/dist-packages/twisted/python/context.py", line 82, in callWithContext
     return func(*args, **kw)
   File "/snap/maas/30885/lib/python3.10/site-packages/provisioningserver/utils/twisted.py", line 856, in callInContext
     return func(*args, **kwargs)
   File "/snap/maas/30885/lib/python3.10/site-packages/provisioningserver/utils/twisted.py", line 203, in wrapper
     result = func(*args, **kwargs)
   File "/snap/maas/30885/lib/python3.10/site-packages/maasserver/utils/orm.py", line 771, in call_within_transaction
     return func_outside_txn(*args, **kwargs)
   File "/snap/maas/30885/lib/python3.10/site-packages/maasserver/utils/orm.py", line 574, in retrier
     return func(*args, **kwargs)
   File "/usr/lib/python3.10/contextlib.py", line 78, in inner
     with self._recreate_cm():
   File "/snap/maas/30885/usr/lib/python3/dist-packages/django/db/transaction.py", line 207, in __enter__
     connection.set_au...


Alberto Donato (ack) wrote :

From the PostgreSQL logs, it does seem the region disconnected around that time:

2023-09-30 02:40:36.138 UTC [159374] maas@maasdb LOG: could not receive data from client: Connection reset by peer

but I don't see anything in the region logs that would indicate why that happened.

Is this an isolated case or has it been observed other times?
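
To check for further occurrences, the timestamp correlation done by hand above (CLI failure at 02:40:56, regiond error at 02:40:37, PostgreSQL disconnect at 02:40:36) could be scripted. The following is only a minimal sketch: the helper names `parse_ts` and `lines_near` and the 30-second window are assumptions for illustration, not part of MAAS or its tooling, and the sample lines are shortened copies of the log lines quoted in this report.

```python
import re
from datetime import datetime, timedelta

# Matches both timestamp styles seen in this report:
# "2023-09-30 02:40:36.138" (PostgreSQL, regiond) and "2023-09-30-02:40:07" (test runner).
TS_RE = re.compile(r"(\d{4}-\d{2}-\d{2})[ -](\d{2}:\d{2}:\d{2})")


def parse_ts(line):
    """Return the first timestamp found in a log line, or None if there is none."""
    m = TS_RE.search(line)
    if m is None:
        return None
    return datetime.strptime(f"{m.group(1)} {m.group(2)}", "%Y-%m-%d %H:%M:%S")


def lines_near(lines, failure_time, window_seconds=60):
    """Return lines stamped at most window_seconds before failure_time."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        ts = parse_ts(line)
        if ts is not None and timedelta(0) <= failure_time - ts <= window:
            hits.append(line)
    return hits


# Time of the failed CLI command, taken from the bug description.
failure = parse_ts("2023-09-30-02:40:56 root ERROR [localhost] Command failed")

# Shortened sample lines from this report (year written out in full).
sample = [
    "2023-09-30 02:40:36.138 UTC [159374] maas@maasdb LOG: could not receive data from client: Connection reset by peer",
    "2023-09-30 02:40:37 provisioningserver.rpc.common: [critical] Unhandled failure dispatching AMP command.",
    "2023-09-30-02:40:07 root DEBUG [localhost]: maas root tags create name=microk8s",
]

for hit in lines_near(sample, failure, window_seconds=30):
    print(hit)
```

With a 30-second window, only the PostgreSQL and regiond lines are selected; the earlier `tags create` call falls outside it. The fractional seconds in the PostgreSQL timestamp are ignored, which is coarse but enough for this kind of triage.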

Changed in maas:
status: New → Incomplete
Moises Emilio Benzan Mora (moisesbenzan) wrote :

Up until now we've only seen this once, but future occurrences could be found here: https://solutions.qa.canonical.com/bugs/2038346

Changed in maas:
status: Incomplete → New
Anton Troyanov (troyanov) wrote :

The SSL SYSCALL error is reported by PostgreSQL, and it can have multiple causes.

Moving it to Incomplete, as this was (and still is) just a single occurrence.

Changed in maas:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for MAAS because there has been no activity for 60 days.]

Changed in maas:
status: Incomplete → Expired