Ubuntu 20.04, mlx5 driver - CQE with err creating geneve tunnel in a VF

Bug #1922472 reported by Amir Tzin
This bug affects 1 person
Affects          Status       Importance  Assigned to  Milestone
linux (Ubuntu)   Incomplete   Undecided   Unassigned
  Focal          In Progress  Undecided   Tim Gardner

Bug Description

[Impact]

On mlx5 devices, creating a geneve tunnel over a Virtual Function (VF) and bringing it up causes errors to be logged in the kernel log buffer.

[Test Case]

using two setups connected back to back
on both sides create the VFs
$ echo 1 > /sys/class/net/ens5f0/device/sriov_numvfs
add IP addresses to the VF interfaces on both sides (13.194.5.1/16, 13.194.6.1/16)
$ ip a add 13.194.5.1/16 dev ens5f0v0
set interfaces on both sides up
$ ip l set dev ens5f0v0 up
check connectivity
$ ping 13.194.6.1 -I 13.194.5.1 -c 6
on both sides define a geneve tunnel with the same id over the VFs
$ ip link add name gen_vf type geneve id 300 remote 13.194.6.1
add ip addresses to geneve interfaces on both sides (14.194.5.1/16, 14.194.6.1/16)
$ ip a add 14.194.5.1/16 dev gen_vf
set geneve interfaces up
$ ip l set dev gen_vf up
check log
$ dmesg
[ 1221.501048] mlx5_core 0000:24:00.2 ens5f0v0: Error cqe on cqn 0x48a, ci 0x9, sqn 0x116, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
[ 1221.501179] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1221.501183] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1221.501185] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1221.501188] 00000030: 00 00 00 00 6a 10 68 02 0a 00 01 16 00 09 20 d2
[ 1221.501240] mlx5_core 0000:24:00.2 ens5f0v0: ERR CQE on SQ: 0x116
[ 1222.930608] mlx5_core 0000:24:00.2 ens5f0v0: Error cqe on cqn 0x48e, ci 0x5, sqn 0x11b, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
[ 1222.930733] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1222.930736] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1222.930739] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1222.930741] 00000030: 00 00 00 00 6a 10 68 02 0a 00 01 1b 00 05 2d d2
[ 1222.930791] mlx5_core 0000:24:00.2 ens5f0v0: ERR CQE on SQ: 0x11b
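
The steps above can be sketched as a small helper for one side, together with a filter for the log check. This is only a sketch: the function names (geneve_vf_cmds, parse_error_cqe) and the argument order are illustrative and not part of the original report, and the interface names and addresses are the ones used in this test case.

```shell
#!/bin/sh
# Print the per-side commands from the test case (nothing is executed here).
# Arguments (illustrative): PF name, VF name, local VF address,
# remote VF address, local geneve address.
geneve_vf_cmds() {
    pf=$1 vf=$2 vf_ip=$3 remote_ip=$4 gen_ip=$5
    cat <<EOF
echo 1 > /sys/class/net/$pf/device/sriov_numvfs
ip a add $vf_ip/16 dev $vf
ip l set dev $vf up
ping $remote_ip -I $vf_ip -c 6
ip link add name gen_vf type geneve id 300 remote $remote_ip
ip a add $gen_ip/16 dev gen_vf
ip l set dev gen_vf up
EOF
}

# Summarize mlx5 "Error cqe" lines read from stdin, e.g. dmesg | parse_error_cqe
# The pattern matches the message format shown in the log above.
parse_error_cqe() {
    sed -n -E 's/.* ([^ ]+): Error cqe on cqn (0x[0-9a-f]+), ci (0x[0-9a-f]+), sqn (0x[0-9a-f]+), opcode (0x[0-9a-f]+), syndrome (0x[0-9a-f]+), vendor syndrome (0x[0-9a-f]+).*/dev=\1 cqn=\2 sqn=\4 syndrome=\6 vendor_syndrome=\7/p'
}
```

For the first side this would be called as `geneve_vf_cmds ens5f0 ens5f0v0 13.194.5.1 13.194.6.1 14.194.5.1`, and for the second side with the addresses swapped. The printed commands need root to run, and the ping step only succeeds once the peer VF has been configured.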

[Fix]

The issue was fixed upstream in v5.12-rc1 by commit
e1c3940c6003 net/mlx5e: Check tunnel offload is required before setting SWP
The attached patch is a modification of e1c3940c6003 for the focal kernel.
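
One possible way to check whether a given kernel source tree already carries this change is to search its history for the commit subject. The sketch below assumes a local git checkout of the kernel and assumes that a backport keeps the upstream subject line, which may not hold for every tree; the function name has_mlx5_swp_fix is illustrative.

```shell
#!/bin/sh
# Return success if the git tree at $1 has a commit whose message contains
# the upstream subject of e1c3940c6003.
# Assumption: a backported commit keeps the same subject line.
has_mlx5_swp_fix() {
    subject='net/mlx5e: Check tunnel offload is required before setting SWP'
    [ -n "$(git -C "$1" log --oneline --fixed-strings --grep="$subject" -1 2>/dev/null)" ]
}
```

For example, `has_mlx5_swp_fix /path/to/kernel-tree && echo "fix present"`, where the path is a placeholder for a local checkout.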

[Regression Potential]
Regression risk is low, as it is a very small fix that was also tested thoroughly on upstream setups.

Tags: focal
Amir Tzin (amirtz) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1922472

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
Amir Tzin (amirtz)
description: updated
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Focal):
status: New → In Progress
assignee: nobody → Tim Gardner (timg-tpi)