bad graphing values for device e1000

Bug #55989 reported by Olivier Cortès
Affects                        Status         Importance   Assigned to   Milestone
Linux                          Fix Released   Low
linux-source-2.6.15 (Ubuntu)   Invalid        Undecided    Unassigned
netspeed (Ubuntu)              Invalid        Undecided    Unassigned

Bug Description

During a large file transfer between two up-to-date Ubuntu Dapper machines, I keep the netspeed graph displayed on both machines for various reasons. Both machines do nothing but the file transfer (no other network activity that could "pollute" the graph values), so I think the device is being polled incorrectly. It could be something else, but the graph is misleading: it often goes above 100Mbps, while the physical link is only 100Mbps.

One machine has an rtl8139 100Mbps card.
The machine which displays the "faulty" graph has an e1000, clamped to 100Mbps because the switch (Netgear FS116) is a 100Mbps one. The network cables are new, and I also tested with other cables in case they were the problem.
Both machines run the latest -686 kernel.

Note: doing the same file transfer between my machine and another e1000 machine (a server without X) produces the same sort of graph ("bad" peaks).
I attach two images: one from the rtl8139 machine, which I would call a "good" graph, and one from my laptop (e1000) with the bad peaks.
My guess is that the average of the bad graph is correct: averaging the peaks and the troughs should give the real average network traffic, because the bottom values are lower than those on the rtl8139 machine.

Olivier Cortès (olive) wrote :

good graph on the rtl8139 machine during a file transfer

Olivier Cortès (olive) wrote :

bad graph with faulty peaks on the e1000 machine during the same file transfer

Olivier Cortès (olive) wrote :

I should add that on both machines the netspeed update interval is set to 500 ms.

Olivier Cortès (olive) wrote :

This bug comes from the kernel e1000 driver, which updates its counter values at most once every 2 seconds.

see http://lkml.org/lkml/2003/12/20/30

Olivier Cortès (olive) wrote :

Temporarily removing the kernel task to re-add it with the upstream bug number.

Olivier Cortès (olive) wrote :

Info: this problem is still present in kernel 2.6.18-rc4.

Patch which resolves the problem: http://lkml.org/lkml/diff/2003/12/20/30/1

The code:

--- linux-2.4.20/drivers/net/e1000/e1000_main.c~ 2003-08-03 00:40:21.000000000 +0200
+++ linux-2.4.20/drivers/net/e1000/e1000_main.c 2003-08-08 13:20:06.000000000 +0200
@@ -1390,7 +1390,7 @@
 		netif_stop_queue(netdev);

 	/* Reset the timer */
-	mod_timer(&adapter->watchdog_timer, jiffies + 2 * HZ);
+	mod_timer(&adapter->watchdog_timer, jiffies + HZ);
 }

 #define E1000_TX_FLAGS_CSUM		0x00000001

Ben Collins (ben-collins) wrote :

I'm going to have to wait for comments on this patch. From the archived lkml thread, I saw that one Intel person mentioned possible problems with changing this timer, specifically with half-duplex and one particular chipset.

Changed in linux-source-2.6.15:
status: Unconfirmed → Needs Info
Ben Collins (ben-collins) wrote :

Bug is in kernel.

Changed in netspeed:
status: Unconfirmed → Rejected
Changed in linux:
status: Unknown → Confirmed
Changed in linux:
status: Confirmed → In Progress
Changed in linux:
status: In Progress → Confirmed
Changed in linux:
status: Confirmed → In Progress
Launchpad Janitor (janitor) wrote :

[Expired for linux-source-2.6.15 (Ubuntu) because there has been no activity for 60 days.]

Changed in linux:
status: In Progress → Fix Released
Changed in linux:
importance: Unknown → Low