Third-party Mellanox OFED 5.8-3.0.7.1 fails on kernels above 5.15.0-82-generic

Bug #2037533 reported by torel
This bug affects 1 person
Affects: Kernel SRU Workflow
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Description:
NVIDIA Mellanox MLNX_OFED_LINUX-5.8-3.0.7.0-LTS, and thus BeeGFS 7.4.1, fails on kernels 5.15.0-83-generic and 5.15.0-84-generic, and possibly on later ones.

Issue:
The BeeGFS client kernel module refuses to be inserted (insmod, modprobe).

Syslog:
The BeeGFS client fails to load, and syslog shows:

Sep 21 12:51:57 n017 kernel: beegfs: disagrees about version of symbol rdma_resolve_addr
Sep 21 12:51:57 n017 kernel: beegfs: Unknown symbol rdma_resolve_addr (err -22)
Sep 21 12:51:57 n017 kernel: beegfs: disagrees about version of symbol rdma_set_service_type
Sep 21 12:51:57 n017 kernel: beegfs: Unknown symbol rdma_set_service_type (err -22)
Sep 21 12:51:57 n017 kernel: beegfs: disagrees about version of symbol rdma_reject
Sep 21 12:51:57 n017 kernel: beegfs: Unknown symbol rdma_reject (err -22)
Sep 21 12:51:57 n017 kernel: beegfs: disagrees about version of symbol rdma_disconnect
Sep 21 12:51:57 n017 kernel: beegfs: Unknown symbol rdma_disconnect (err -22)
Sep 21 12:51:57 n017 kernel: beegfs: disagrees about version of symbol __rdma_create_kernel_id
Sep 21 12:51:57 n017 kernel: beegfs: Unknown symbol __rdma_create_kernel_id (err -22)
Sep 21 12:51:57 n017 kernel: beegfs: disagrees about version of symbol rdma_resolve_route
Sep 21 12:51:57 n017 kernel: beegfs: Unknown symbol rdma_resolve_route (err -22)
Sep 21 12:51:57 n017 kernel: beegfs: disagrees about version of symbol rdma_bind_addr
Sep 21 12:51:57 n017 kernel: beegfs: Unknown symbol rdma_bind_addr (err -22)
Sep 21 12:51:57 n017 kernel: beegfs: disagrees about version of symbol rdma_create_qp
Sep 21 12:51:57 n017 kernel: beegfs: Unknown symbol rdma_create_qp (err -22)
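
To summarise which symbols are affected without reading the log by eye, the excerpt above could be parsed along these lines. This is only a minimal sketch: the regular expression is modelled on the lines quoted in this report, and the function name `mismatched_symbols` is made up for illustration.

```python
import re

# Pattern modelled on the syslog lines quoted in this report, e.g.
#   "... kernel: beegfs: disagrees about version of symbol rdma_reject"
MISMATCH_RE = re.compile(r"kernel: (\S+): disagrees about version of symbol (\S+)")


def mismatched_symbols(log_lines):
    """Return {module name: sorted list of symbols with a version mismatch}."""
    found = {}
    for line in log_lines:
        match = MISMATCH_RE.search(line)
        if match:
            module, symbol = match.groups()
            found.setdefault(module, set()).add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}


# Example call (the log path is an assumption, adjust for the system):
#   with open("/var/log/syslog") as log:
#       print(mismatched_symbols(log))
```

Applied to the eight message pairs above, this should list the eight rdma_* symbols under the beegfs module.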

Workaround:
Kernels up to and including 5.15.0-82-generic work fine.

Changes:
Looking through the diffs between config-5.15.0-82-generic and the configs of the later kernels, I can't see why this fails.
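
For anyone who wants to repeat the config comparison, a filtered diff is easier to read than a raw one. The following is a minimal sketch, not the exact procedure used for this report: the helper names and the keyword filter (INFINIBAND, RDMA, MODVERSIONS) are assumptions chosen for illustration, as is reading the files from /boot.

```python
def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
    return options


def config_changes(old_text, new_text,
                   keywords=("INFINIBAND", "RDMA", "MODVERSIONS")):
    """Return {option: (old value, new value)} for changed, keyword-matching options."""
    old, new = parse_kconfig(old_text), parse_kconfig(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name) and any(k in name for k in keywords):
            changes[name] = (old.get(name), new.get(name))
    return changes


# Example call (paths are assumptions based on the usual /boot layout):
#   old = open("/boot/config-5.15.0-82-generic").read()
#   new = open("/boot/config-5.15.0-83-generic").read()
#   print(config_changes(old, new))
```

An empty result would suggest that the relevant build options did not change between the two kernels and that the cause lies elsewhere.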

Third party bug report:
Reported as ThinkParQ RT #12388: "BeeGFS stopped working after kernel upgrade". Also confirmed by ThinkParQ.

https://doc.beegfs.io/latest/release_notes.html

Has anything changed in the kernel's InfiniBand (OFED) stack that affects third-party MOFED modules?
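
One way to approach this question: "disagrees about version of symbol" is the message the module loader prints when the symbol CRC recorded in a module does not match the CRC of the exported symbol it is being linked against. Comparing the CRCs of the affected rdma_* symbols between two builds may therefore help narrow things down. The sketch below assumes Module.symvers files with the CRC in the first column and the symbol name in the second; the helper names and the example paths are hypothetical, and it does not by itself tell whether the in-kernel or the MOFED rdma_cm module was loaded.

```python
def parse_symvers(text):
    """Map symbol -> CRC, assuming 'CRC symbol module ...' on each line."""
    crcs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            crcs[fields[1]] = fields[0]
    return crcs


def changed_crcs(old_text, new_text, symbols):
    """Return {symbol: (old CRC, new CRC)} for the symbols whose CRC differs."""
    old, new = parse_symvers(old_text), parse_symvers(new_text)
    return {s: (old.get(s), new.get(s))
            for s in symbols if old.get(s) != new.get(s)}


# Symbols taken from the syslog excerpt in this report.
SYMBOLS = [
    "rdma_resolve_addr", "rdma_set_service_type", "rdma_reject",
    "rdma_disconnect", "__rdma_create_kernel_id", "rdma_resolve_route",
    "rdma_bind_addr", "rdma_create_qp",
]

# Example call (paths are assumptions, e.g. from the linux-headers packages):
#   old = open("/usr/src/linux-headers-5.15.0-82-generic/Module.symvers").read()
#   new = open("/usr/src/linux-headers-5.15.0-83-generic/Module.symvers").read()
#   print(changed_crcs(old, new, SYMBOLS))
```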

--Tore
