The workaround identified in comment #9 can be used to bypass this. It delays further services from starting up and attempting to interact with the mlnx cards, which appears to be what causes the kernel hung tasks once the kernel's hung task timeout of 120 seconds is exceeded. I'm not convinced at the moment that managing the systemd service files from the charm is the correct thing to do here. Notably, this is likely to be a general problem on Ubuntu with VFs etc. Increasing the timeout may turn out to be a longer-term solution rather than a workaround, but we need to understand the problem better in order to address it in the right place.
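For anyone trying to confirm the 120-second figure on an affected host, the following is a minimal sketch that reads the kernel's hung task timeout from procfs. It assumes the kernel was built with hung task detection, so that the `kernel.hung_task_timeout_secs` sysctl is exposed; the function name `read_hung_task_timeout` is purely illustrative and is not part of the charm. It only reads the value and changes nothing.

```python
from pathlib import Path

# procfs location of the kernel.hung_task_timeout_secs sysctl.
# Assumption: the running kernel has hung task detection enabled.
HUNG_TASK_TIMEOUT_PATH = Path("/proc/sys/kernel/hung_task_timeout_secs")


def read_hung_task_timeout(path=HUNG_TASK_TIMEOUT_PATH):
    """Return the hung task timeout in seconds, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not an integer.
        return None


if __name__ == "__main__":
    timeout = read_hung_task_timeout()
    if timeout is None:
        print("hung task timeout is not available on this kernel")
    elif timeout == 0:
        print("hung task detection is disabled")
    else:
        print(f"hung task timeout: {timeout}s")
```

A value of 0 means the check is disabled. This is only the kernel-side threshold and may not be the same timeout as the one adjusted by the workaround in comment #9.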