Netdev watchdog closes link on PowerEdge R6515 Broadcom BCM5720

Bug #1914022 reported by Mladen Marinović
This bug affects 1 person
Affects        | Status    | Importance | Assigned to | Milestone
linux (Ubuntu) | Confirmed | Undecided  | Unassigned  |

Bug Description

A few months ago we got 3 Dell PowerEdge™ R6515 machines and installed Ubuntu 20.04 LTS. After a few days, problems started appearing on the network interface for the internal network: the link drops and comes back a few seconds later. We have updated all the server firmware and checked the cables and the switch, but the problem persists. Digging deeper into the logs, we found this in dmesg (full log in the attached file):

[Sat Jan 30 09:57:07 2021] NETDEV WATCHDOG: eno2 (tg3): transmit queue 0 timed out
[Sat Jan 30 09:57:07 2021] tg3 0000:c1:00.1 eno2: transmit timed out, resetting
[Sat Jan 30 09:57:08 2021] tg3 0000:c1:00.1 eno2: Link is down
[Sat Jan 30 09:57:12 2021] tg3 0000:c1:00.1 eno2: Link is up at 1000 Mbps, full duplex

The watchdog reports a transmit timeout on queue 0, the tg3 driver resets the adapter, and the link comes back up a few seconds later. This happens randomly on all 3 servers and does not seem to correlate with the current traffic on that interface. The problem seems very similar to Bug #1331513, but it happens on an up-to-date Ubuntu 20.04 (last updated in the middle of last week).
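To check whether the flaps really are independent of traffic, it helps to pull every occurrence out of the full log first. The following is only a sketch: it assumes the log uses the human-readable timestamp format shown above, and the function names (find_link_flaps, count_watchdog_timeouts) are made up for illustration, not part of any existing tool.

```python
import re
from datetime import datetime

# Matches human-readable dmesg timestamps such as "[Sat Jan 30 09:57:07 2021]".
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")
TS_FORMAT = "%a %b %d %H:%M:%S %Y"

# The four lines quoted in this report, used as sample input.
SAMPLE = """\
[Sat Jan 30 09:57:07 2021] NETDEV WATCHDOG: eno2 (tg3): transmit queue 0 timed out
[Sat Jan 30 09:57:07 2021] tg3 0000:c1:00.1 eno2: transmit timed out, resetting
[Sat Jan 30 09:57:08 2021] tg3 0000:c1:00.1 eno2: Link is down
[Sat Jan 30 09:57:12 2021] tg3 0000:c1:00.1 eno2: Link is up at 1000 Mbps, full duplex
""".splitlines()


def find_link_flaps(lines, iface="eno2"):
    """Return (down_time, up_time, seconds_down) for each down/up pair on iface."""
    flaps = []
    down_at = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines without a bracketed timestamp
        msg = m.group("msg")
        if f"{iface}: Link is down" in msg:
            down_at = datetime.strptime(m.group("ts"), TS_FORMAT)
        elif f"{iface}: Link is up" in msg and down_at is not None:
            up_at = datetime.strptime(m.group("ts"), TS_FORMAT)
            flaps.append((down_at, up_at, (up_at - down_at).total_seconds()))
            down_at = None
    return flaps


def count_watchdog_timeouts(lines, iface="eno2"):
    """Count NETDEV WATCHDOG messages that name the given interface."""
    needle = f"NETDEV WATCHDOG: {iface} "
    return sum(1 for line in lines if needle in line)


if __name__ == "__main__":
    for down, up, secs in find_link_flaps(SAMPLE):
        print(f"link down {down} -> up {up} ({secs:.0f} s)")
    print("watchdog timeouts:", count_watchdog_timeouts(SAMPLE))
```

Run against the complete log instead of SAMPLE, the resulting timestamps can be compared with traffic graphs for the same interface.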

If you need any additional info, feel free to ask.

Additional info:
--------------------------------------

linux-firmware/now 1.187.8 all
linux-image-5.4.0-64-generic/focal-updates,focal-updates,now 5.4.0-64.72 amd64
linux-image-generic/now 5.4.0.64.67 amd64

--------------------------------------

lspci -nnk | grep -iA2 net
c1:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe [14e4:165f]
        DeviceName: NIC1
        Subsystem: Dell PowerEdge R6515/R7515 LOM [1028:08fd]
--
c1:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe [14e4:165f]
        DeviceName: NIC2
        Subsystem: Dell PowerEdge R6515/R7515 LOM [1028:08fd]

--------------------------------------

modinfo tg3 | grep ^version
version: 3.137

--------------------------------------

lshw -C network
  *-network:0
       description: Ethernet interface
       product: NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
       vendor: Broadcom Inc. and subsidiaries
       physical id: 0
       bus info: pci@0000:c1:00.0
       logical name: eno1
       version: 00
       serial: 34:48:ed:ef:07:0e
       size: 1Gbit/s
       capacity: 1Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=tg3 driverversion=3.137 duplex=full firmware=FFV21.60.16 bc 5720-v1.39 ip=213.133.114.208 latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s
       resources: irq:160 memory:c7030000-c703ffff memory:c7040000-c704ffff memory:c7050000-c705ffff memory:c7100000-c713ffff
  *-network:1
       description: Ethernet interface
       product: NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
       vendor: Broadcom Inc. and subsidiaries
       physical id: 0.1
       bus info: pci@0000:c1:00.1
       logical name: eno2
       version: 00
       serial: 34:48:ed:ef:07:0f
       size: 1Gbit/s
       capacity: 1Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=tg3 driverversion=3.137 duplex=full firmware=FFV21.60.16 bc 5720-v1.39 ip=10.0.1.6 latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s
       resources: irq:163 memory:c7000000-c700ffff memory:c7010000-c701ffff memory:c7020000-c702ffff memory:c7140000-c717ffff

--------------------------------

ethtool --show-offload eno2
Features for eno2:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on [fixed]
tx-vlan-offload: on [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
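
The offload list above is long, and most entries are marked [fixed], meaning they cannot be changed on this adapter. As a rough aid for reading it, the sketch below separates the features that are both enabled and changeable. It assumes the "name: on|off [fixed]" layout shown above; parse_offloads and changeable_enabled are hypothetical helper names, and nothing in this report establishes that any of these features is related to the timeouts.

```python
# A few lines copied from the ethtool output in this report, used as sample input.
SAMPLE_OFFLOADS = """\
Features for eno2:
rx-checksumming: on
        tx-checksum-ip-generic: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-mangleid-segmentation: off
rx-vlan-offload: on [fixed]
"""


def parse_offloads(text):
    """Map feature name -> (enabled, fixed) from `ethtool --show-offload` output."""
    features = {}
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        name, _, value = line.partition(":")
        parts = value.split()
        # The "Features for eno2:" header has no on/off value and is skipped here.
        if not parts or parts[0] not in ("on", "off"):
            continue
        features[name.strip()] = (parts[0] == "on", "[fixed]" in parts)
    return features


def changeable_enabled(features):
    """Return the sorted names of features that are on and not marked [fixed]."""
    return sorted(
        name for name, (enabled, fixed) in features.items() if enabled and not fixed
    )


if __name__ == "__main__":
    for name in changeable_enabled(parse_offloads(SAMPLE_OFFLOADS)):
        print(name)
```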
---
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jan 26 10:58 seq
 crw-rw---- 1 root audio 116, 33 Jan 26 10:58 timer
AplayDevices: aplay: device_list:274: no soundcards found...
ApportVersion: 2.20.11-0ubuntu27.16
Architecture: amd64
ArecordDevices: arecord: device_list:274: no soundcards found...
AudioDevicesInUse: Error: [Errno 2] No such file or directory: 'fuser'
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: Dell Inc. PowerEdge R6515
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-64-generic root=UUID=b16f8f4d-982b-45bf-a75f-e45c11f1961a ro nomodeset consoleblank=0
ProcVersionSignature: Ubuntu 5.4.0-64.72-generic 5.4.78
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-64-generic N/A
 linux-backports-modules-5.4.0-64-generic N/A
 linux-firmware 1.187.8
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: focal
Uname: Linux 5.4.0-64-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 08/10/2020
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.5.3 [Hetzner 1.0.0]
dmi.board.name: 0R4CNN
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.5.3[Hetzner1.0.0]:bd08/10/2020:svnDellInc.:pnPowerEdgeR6515:pvr:rvnDellInc.:rn0R4CNN:rvrA00:cvnDellInc.:ct23:cvr:
dmi.product.family: PowerEdge
dmi.product.name: PowerEdge R6515
dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R6515
dmi.sys.vendor: Dell Inc.

Mladen Marinović (marin-smartivo) wrote :

Jeff Lane (bladernr) wrote :

Hi.

I think you've filed this bug in the wrong place. Please file a public kernel bug here:

https://bugs.launchpad.net/ubuntu/+source/linux/+filebug

Thank you,

Jeff

information type: Proprietary → Public
affects: dell-poweredge → linux (Ubuntu)
Jeff Lane (bladernr) wrote :

On second thought, never mind, I was able to do it myself. There does not appear to be any identifying information in this summary, so I have marked it public and moved it to the appropriate project.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1914022

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
Mladen Marinović (marin-smartivo) wrote : apport information

Attachments:
CRDA.txt
CurrentDmesg.txt
Lspci.txt
Lspci-vt.txt
Lsusb.txt
Lsusb-t.txt
Lsusb-v.txt
ProcCpuinfo.txt
ProcCpuinfoMinimal.txt
ProcInterrupts.txt
ProcModules.txt
UdevDb.txt
WifiSyslog.txt
acpidump.txt

tags: added: apport-collected
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed