> @rbalint if you can reproduce the problem easily, it would be
> interesting to monitor the received ACPI events via acpi_listen.
I believe the problem is very easy for anyone to reproduce.
What I could not reproduce, on the other hand, is a newer acpid fixing
the problem. I'd like to have the reproduction steps for that experiment.
> What I see during my tests is that acpi_listen is always showing the
> sleep events, meaning that the kernel receives them correctly at least,
> and then the failure happens in the delivery of these sleep events to
> the proper user-space daemon (acpid). So my guess is that something
> wrong is happening in the communication between kernel and user-space to
> deliver these events.
Since only the kernel changed, this may be either a regression in the
kernel or a still-valid change in the kernel's behaviour that acpid
fails to handle.
Please either provide reproduction steps in which a different acpid
fixes the issue, or point at the kernel change to which acpid
should adapt.
I believe this can be found by bisecting the kernel, but I don't have
the setup to do it efficiently myself.
> Just to make sure, when you say "the second hibernation attempt still
> fails" you mean that the system is still up & running (you can still ssh
> on it) and the sleep event is lost / not delivered properly, right?
Yes, exactly. I can log back in to the system after a few seconds.