Fast Models [7.1.42 (May 25 2012)] Does not open userNetPorts while socket is in TIME_WAIT state
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
LAVA Dispatcher | Won't Fix | High | Unassigned |
Linaro Fast Models | New | Undecided | Unassigned |
Bug Description
I use -C motherboard.
However, more often than not, that doesn't work and the Fast Model never brings up port 5555.
We rely on this feature in LAVA and see a lot of flakiness in several places; this could be causing a large part of it.
dmart wondered on IRC:
11:13 < dmart> Can the port be customised? Is this a TIME_WAIT problem?
11:14 < asac> dmart: so i use
11:14 < asac> -C motherboard.
11:14 < asac> and so on
11:14 < asac> dmart: now if i start the fast model, if things go well i see a LISTEN 5555 on the host machine right away
11:15 < asac> dmart: but in 8 out of 10 runs the port is not opened at all
11:15 < asac> dmart: in cases where it works it seems to be not coupled with the actual target system opening a port
11:15 < asac> it just opens the port right away (which makes sense)
11:15 < asac> so yeah... short: hostbridge.
Note that the telnet ports for serial are always opened properly.
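The "LISTEN 5555" check described above could be scripted rather than done by eye. The following is a minimal sketch, assuming the forwarded port is reachable on 127.0.0.1; the helper name `is_listening` is made up for illustration and is not part of LAVA or Fast Models. Note that it opens a real TCP connection, which the model may notice.

```python
import socket


def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening, or on timeout.
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    conn.close()
    return True
```

For example, `is_listening(5555)` shortly after starting the model would show whether the user port came up on that run.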
Peter Maydell pointed out:
11:28 < pm215> it would be good to be able to make the model use the socket option that allows rebinding
We also observed some odd behaviour where network traffic through port 5555 stalls from time to time, and telnetting into one of the serial sockets makes the traffic resume. There are no further details on this observation yet; it might just indicate a wider issue on the userport traffic side.
Related branches
- Michael Hudson-Doyle (community): Approve
- Diff: 31 lines (+7/-0), 1 file modified: lava_dispatcher/client/fastmodel.py (+7/-0)
description: updated
summary:
- Fast Models [7.1.42 (May 25 2012)] UserPort Networking is sometimes flaky
+ Fast Models [7.1.42 (May 25 2012)] UserPort Networking feels flaky
Changed in lava-dispatcher:
status: Triaged → Won't Fix
OK, from my experiments I can say that TIME_WAIT might indeed have an influence on this: I couldn't reproduce the missing socket on 5555 if I waited for the TIME_WAIT socket to go away.
The Fast Model should probably use SO_REUSEADDR to fix this.
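For reference, this is what setting that option looks like at the socket level. It is only an illustration in Python of the standard sockets API, not Fast Models code (the model's own implementation is not shown in this report), and `open_listener` is a hypothetical helper name. On Linux, SO_REUSEADDR lets bind() succeed even while old connections on the same port are still in TIME_WAIT.

```python
import socket


def open_listener(port, host="127.0.0.1", reuse=True):
    """Open a listening TCP socket, optionally with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse:
            # The option has to be set before bind() to have any effect.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock
```

Without the option, the bind() call is the step that would fail with "Address already in use" while a TIME_WAIT socket for the port lingers.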
We can probably work around this in LAVA by waiting for the TIME_WAIT socket to die.
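A rough sketch of what such a workaround could look like, assuming a Linux host where /proc/net/tcp and /proc/net/tcp6 are readable. This is not the code from the related branch; the function names, the timeout and the polling interval are all made up for illustration. It relies on the Linux convention that state `06` in those tables means TIME_WAIT and that the port is the hex number after the colon in the local address column.

```python
import time

TIME_WAIT = "06"  # TCP state code for TIME_WAIT in /proc/net/tcp on Linux


def ports_in_time_wait(table_text):
    """Return the set of local ports with a TIME_WAIT socket in one table."""
    ports = set()
    for line in table_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == TIME_WAIT:
            # local_address looks like "0100007F:15B3" (hex address:hex port)
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return ports


def read_proc_net_tables():
    """Read the IPv4 and IPv6 TCP tables, skipping any that are missing."""
    tables = []
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as handle:
                tables.append(handle.read())
        except OSError:
            pass
    return tables


def wait_for_time_wait_to_clear(port, timeout=120.0, interval=1.0,
                                read_tables=read_proc_net_tables):
    """Poll until no TIME_WAIT socket uses `port`; False if timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if not any(port in ports_in_time_wait(t) for t in read_tables()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The dispatcher would call something like `wait_for_time_wait_to_clear(5555)` before launching the model, and only start it once that returns True.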