callbackRequestDelayed not waiting for 1/60 sec on vxWorks
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
EPICS Base | Triaged | Low | Unassigned |
Bug Description
The problem is documented in this tech-talk thread: https:/
In 7.0.3.1, but not in 7.0.3, callbackRequestDelayed does not wait for the requested 1/60 second delay.
This happens on vxWorks 6.9.4.1 and was first seen with the scaler record with multiple types of scaler hardware and device support.
Ralph Lange (ralph-lange) wrote : | #1 |
Looking at release notes and commits
https://github.com/epics-base/epics-base/compare/R7.0.3...R7.0.3.1
it looks like this could be a side effect of switching internal timers to monotonic time.
Ralph Lange (ralph-lange) wrote : | #2 |
Wasn't 1/60 the default tick time on VxWorks?
The difference between delaying one tick and delaying zero ticks could be callback/reschedule happening every tick or as fast as the CPU can.
rivers (rivers) wrote : | #3 |
> Looking at release notes and commits
> https://github.com/epics-base/epics-base/compare/R7.0.3...R7.0.3.1
> it looks like this could be a side effect of switching internal timers to monotonic time.
> Wasn't 1/60 the default tick time on VxWorks?
Yes, I discussed both of these in the tech-talk thread.
The Application Developers Guide does not document a minimum value for the delay time in callbackRequestDelayed.
A reasonable expectation for the minimum value of the delay, given that it is not documented, is that it should be the same as the minimum value of epicsThreadSleep. That is documented as follows:
"Sleep for the specified period of time, i.e. sleep without using the cpu.
If delay is >0 then the thread will sleep at least until the next clock tick.
The exact time is determined by the underlying architecture. If delay is
<= 0 then a delay of 0 is requested of the underlying architecture. What
happens is architecture dependent but often it allows other threads of the
same priority to run."
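For reference, the call pattern at issue looks roughly like this (a sketch against the callback.h API; the callback body, priority, and re-scheduling are hypothetical, not the scaler record's actual code):

    #include "callback.h"

    static epicsCallback cb;    /* named CALLBACK in older headers */

    /* Runs on a callback-queue thread once the requested delay expires. */
    static void periodic(epicsCallback *pcb) {
        /* ... do the periodic work, then re-arm ... */
        callbackRequestDelayed(pcb, 1.0 / 60.0);  /* expectation: wait >= 1/60 s */
    }

    static void start(void) {
        callbackSetCallback(periodic, &cb);
        callbackSetPriority(priorityMedium, &cb);
        callbackRequestDelayed(&cb, 1.0 / 60.0);
    }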
Ralph Lange (ralph-lange) wrote : | #4 |
Sure. This has been working for a long time, including the 7.0.3 release.
The interesting question is why the behavior has changed with the switch to monotonic time.
BTW. Can you verify that it was exactly that commit changing the behavior?
https:/
That would make sure that we're talking about the real thing here.
rivers (rivers) wrote : | #5 |
Perhaps my hypothesis about callbackRequestDelayed is wrong.
I looked for another place where callbackRequestDelayed is used: the bo record uses it to implement the HIGH field. This is what I see with HIGH=.01 when monitoring the record from a Linux client:
corvette:
13LAB:userTranE... (29 camonitor update lines showing the 1/0 transitions; timestamps and values truncated)
So the transition from 1 (Enable) to 0 (Disable) seems to take a random amount of time between 0 and 0.01666 seconds. This is exactly what I expect if the delay does wait for the next clock tick, just like epicsThreadSleep. The tick rate in my IOC is 60.
However, if I change HIGH from .01 to .005 the behavior changes, and it is only high for 20-40 microseconds (measuring time on a Linux client).
corvette:
13LAB:userTranE... (camonitor update lines; timestamps and values truncated)
...
rivers (rivers) wrote : | #6 |
As Ralph suggested, with 7.0.3.1 checked out I reverted commit https:/
That fixed the problem with the scaler record, I can now run at 60 Hz display rate with no problems.
It did not change the behavior of the bo record HIGH field. I see the same as above, with .01 sec it waits till the next clock tick to go low, with .005 sec it goes low immediately. I don't understand this behavior.
Andrew Johnson (anj) wrote : | #7 |
I don't have any definitive answers (or access to a vxWorks system to experiment with down here at about 18 Degrees North), but I can provide more data.
callbackRequestDelayed uses the timer queue, whose expiry times are now computed from the monotonic clock (getMonotonic).
On vxWorks-ppc the getMonotonic() time source is the CPU's time-base register, which ticks at some multiple of the CPU's bus clock frequency. Different boards have different bus rates and multipliers, and if the BSP doesn't provide the optional routine sysTimeBaseFreq() to return the nominal tick rate it gets measured in osdMonotonicInit() which prints "osdMonotonicInit: Measuring CPU time-base frequency ... %llu ticks/sec." – I think that's called from a C++ static initializer so it should appear when loading the munch file, but I might be mistaken about that part.
A workaround would be to increase your system clock tick rate by calling sysClkRateSet(100) or higher, which I would put towards the top of your startup script; you can call it later and our code should adjust (thanks to Dirk IIRC), but before loading the munch file is probably safest. The OS routines and any taskDelay() calls will still be using this tick rate for delay granularity, and that includes the semTake() that gets used when the timer queue needs to sleep. In epicsEventWaitWithTimeout() the number of ticks to wait is effectively
max(1, (int) (delay * sysClkRateGet()))
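A small demo of how that conversion truncates sub-tick remainders (a hypothetical sketch mimicking the expression above, assuming sysClkRateGet() == 60; not the actual vxWorks source):

    #include <stdio.h>

    /* Mimics max(1, (int)(delay * rate)). */
    static int toTicks(double delay, int rate) {
        int ticks = (int)(delay * rate);   /* C cast truncates toward zero */
        return ticks < 1 ? 1 : ticks;
    }

    int main(void) {
        printf("%d\n", toTicks(0.005, 60)); /* 0.3 -> 0, clamped to 1 tick */
        printf("%d\n", toTicks(0.025, 60)); /* 1.5 -> 1 tick */
        printf("%d\n", toTicks(0.100, 60)); /* 6.0 -> 6 ticks */
        return 0;
    }

Since the OS counts tick boundaries, a 1-tick wait can return anywhere from just after the call to one full tick period later, which matches the HIGH=.005 observations above.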
I haven't had any revelations over what might be causing the problem yet though.
rivers (rivers) wrote : | #8 |
I do see the message that Andrew referred to when the IOC boots:
osdMonotonicInit: Measuring CPU time-base frequency ... 25000172 ticks/sec.
I just modified the scaler record to print the current value of the first counter in the callback routine. I print with errlogPrintf(), rather than printf(), so it does not slow down the operation very much.
My scaler has a 10 MHz input on scaler[0], so the ticks are 100 ns.
This is what I see if the RATE=59. The expected delta tick is 1/59.*1e7=169492. The average delta for the first 6 readings is 163981, which is pretty close.
updateCounts: posting scaler values counts[0]=74
updateCounts: posting scaler values counts[0]=150723
updateCounts: posting scaler values counts[0]=317296
updateCounts: posting scaler values counts[0]=483968
updateCounts: posting scaler values counts[0]=651202
updateCounts: posting scaler values counts[0]=817458
updateCounts: posting scaler values counts[0]=983962
This is what I see if RATE=60
updateCounts: posting scaler values counts[0]=65
updateCounts: posting scaler values counts[0]=186497
updateCounts: posting scaler values counts[0]=188068
updateCounts: posting scaler values counts[0]=188639
updateCounts: posting scaler values counts[0]=189067
updateCounts: posting scaler values counts[0]=189495
updateCounts: posting scaler values counts[0]=189924
updateCounts: posting scaler values counts[0]=190351
So now the average delta from the last 7 readings is 642 ticks, which means the callback is running every 64 microseconds, or 15.6 kHz.
This confirms that the problem is indeed that the callback is being called far too often when RATE=60.
If the problem is that callbackRequestDelayed rounds a delay of less than one clock tick down to zero ticks, that would explain this behavior.
Ralph Lange (ralph-lange) wrote : | #9 |
Well, in both cases (HIGH delays and callbackRequestDelayed) the tick timer granularity is too coarse for the requested delay, with a response switching between a 1 tick delay and a 0 tick delay.
Even with your statement
> ...seems to take a random amount of time between 0 and 0.01666 seconds.
> This is exactly what I expect if the delay does wait for the next
> clock tick, just like epicsThreadSleep.
I have to say that I do not like that behavior. Agreed that it may only play out in a few real-life use cases, but a delay API that cannot promise the delay seems awkward and wrong. Note that for any delay request, a system with a clock tick larger than the requested delay will respond with an arbitrary delay between zero and one clock tick, which is unrelated to the requested value.
Much like the fundamental difference between a "snooze" timer (don't wake me up before...) and an alarm setting (wake me up no later than...) the use cases for a delay and a timer are different and the API promises should reflect that.
IMHO: A requested delay > 0 shall never call back before the requested delay has passed. If waiting for the next system tick comes back too early, delay shall wait for another system tick.
Immediate call back is acceptable (and somewhat expected) for a requested delay = 0.
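That promise can be written down as a testable property (a sketch using libCom calls; this hypothetical check is not part of epicsEventTest):

    #include "epicsEvent.h"
    #include "epicsTime.h"   /* epicsMonotonicGet(): monotonic nanoseconds */

    /* Non-zero if a timed wait on a never-signaled event took at least
     * the requested delay d (seconds). */
    static int neverEarly(double d) {
        epicsEventId ev = epicsEventCreate(epicsEventEmpty);
        epicsUInt64 t0 = epicsMonotonicGet();
        epicsEventWaitWithTimeout(ev, d);
        epicsUInt64 t1 = epicsMonotonicGet();
        epicsEventDestroy(ev);
        return (t1 - t0) >= (epicsUInt64)(d * 1e9);
    }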
rivers (rivers) wrote : | #10 |
> Well, in both cases (HIGH delays and callbackRequestDelayed) the tick timer granularity is too coarse
> for the requested delay with a response switching between a 1 tick delay and a 0 tick delay.
But HIGH delays use callbackRequestDelayed too, and reverting the commit fixed the scaler record while leaving the HIGH behavior unchanged, so I don't see how both can have the same explanation.
mdavidsaver (mdavidsaver) wrote : | #11 |
- Remove use of time quantum in timer delay calc (583 bytes, text/plain)
The attached patch is a bit of a shot in the dark, but maybe worth testing.
rivers (rivers) wrote : | #12 |
I tried Michael's patch. It does prevent the very fast callbacks at RATE=60, but it slows down the callbacks at all rates.
This is what I now see with RATE=59 when I count for 0.2 seconds.
updateCounts: posting scaler values counts[0]=72
updateCounts: posting scaler values counts[0]=210254
updateCounts: posting scaler values counts[0]=543797
updateCounts: posting scaler values counts[0]=876856
updateCounts: posting scaler values counts[0]=1210206
updateCounts: posting scaler values counts[0]=1543730
updateCounts: posting scaler values counts[0]=1876830
updateCounts: posting scaler values counts[0]=2000000
The average delta of the first 7 readings is (1876830-72)/6 = 312793 ticks = 31.3 ms, while at 59 Hz it should be 16.9 ms.
At RATE=60 I see
updateCounts: posting scaler values counts[0]=73
updateCounts: posting scaler values counts[0]=255761
updateCounts: posting scaler values counts[0]=588997
updateCounts: posting scaler values counts[0]=922397
updateCounts: posting scaler values counts[0]=1255750
updateCounts: posting scaler values counts[0]=1589057
updateCounts: posting scaler values counts[0]=1922543
updateCounts: posting scaler values counts[0]=2000000
So the average callback interval is (1922543-73)/6. = 32.04 ms, while it should be 16.67 ms.
mdavidsaver (mdavidsaver) wrote : | #13 |
Can you test with a second RATE (say 30Hz) to see how this error in delay scales? This may give a hint as to where to focus (looking for more quantum games, or ticksPerSec).
Also, what do epicsThreadSleepQuantum() and epicsMonotonicResolution() report?
rivers (rivers) wrote : | #14 |
This is 30 Hz.
updateCounts: posting scaler values counts[0]=343492
updateCounts: posting scaler values counts[0]=842531
updateCounts: posting scaler values counts[0]=1342611
updateCounts: posting scaler values counts[0]=1842512
updateCounts: posting scaler values counts[0]=2342578
updateCounts: posting scaler values counts[0]=2842581
updateCounts: posting scaler values counts[0]=3342583
updateCounts: posting scaler values counts[0]=3842498
updateCounts: posting scaler values counts[0]=4342557
updateCounts: posting scaler values counts[0]=4842722
So the average delta is (4842722-343492)/9 = 499914 ticks = 50.0 ms, while it should be 1/30 = 33.3 ms.
epicsThreadSleepQuantum() reports …
epicsMonotonicResolution() reports …
rivers (rivers) wrote : | #15 |
Reverting Michael's patch I see this with RATE=30Hz.
updateCounts: posting scaler values counts[0]=340710
updateCounts: posting scaler values counts[0]=673958
updateCounts: posting scaler values counts[0]=1007853
updateCounts: posting scaler values counts[0]=1340645
updateCounts: posting scaler values counts[0]=1673964
updateCounts: posting scaler values counts[0]=2007397
updateCounts: posting scaler values counts[0]=2340634
updateCounts: posting scaler values counts[0]=2673957
updateCounts: posting scaler values counts[0]=3008932
updateCounts: posting scaler values counts[0]=3341156
So the average delta is (3341156-340710)/9 = 333383 ticks = 33.34 ms, which is the expected 1/30 sec.
mdavidsaver (mdavidsaver) wrote : | #16 |
The difference between expected and actual seems ~= epicsThreadSleepQuantum() (one system clock tick).
The code in epicsEventWaitWithTimeout() on vxWorks is:
> if (timeOut <= 0.0) {
> ticks = 0;
> } else if (timeOut >= (double) INT_MAX / rate) {
> ticks = WAIT_FOREVER;
> } else {
> ticks = timeOut * rate; // MD - rate = sysClkRateGet()
> if (ticks <= 0)
> ticks = 1;
> }
Previously, subtracting the time quantum could result in timeOut <= 0. Now this is never the case, so the minimum wait is 1 tick.
mdavidsaver (mdavidsaver) wrote : | #17 |
Maybe relevant https:/
Andrew Johnson (anj) wrote : | #18 |
libCom's epicsEventTest measures epicsEventWaitWithTimeout() delay errors; with the default clock rate I got these results:
# epicsEventWaitWithTimeout(...) … (21 measurement lines; values truncated)
I then set the clock to 1KHz and got these:
# epicsEventWaitWithTimeout(...) … (further measurement lines; values truncated)
Changed in epics-base:
status: New → Triaged
importance: Undecided → Critical
milestone: none → 7.0.3.2
Andrew Johnson (anj) wrote : | #19 |
I propose to turn the individual delay error checks in epicsEventTest.cpp into tests that fail if the error is -ve (using my changes described above). Currently we only fail a test if the mean of the absolute delay errors is 50ms or more.
mdavidsaver (mdavidsaver) wrote : | #20 |
I happened upon a difference between the float to int calculations of epicsThreadSleep() vs epicsEventWaitWithTimeout() on Windows.
epicsThreadSleep() has
> seconds *= mSecPerSec;
> seconds += 0.99999999; /* 8 9s here is optimal */
Perhaps Andrew can comment on why 8 9s is optimal.
(cf. 32f556637de8c0b…)
epicsEventWaitWithTimeout() on Windows has
> tmo = ( DWORD ) ( ( timeOut * mSecPerSec ) + 0.5 );
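The practical difference between the two roundings, in a sketch (hypothetical request value; the +0.99999999 form never sleeps short of a whole millisecond, the +0.5 form can):

    #include <stdio.h>

    int main(void) {
        double seconds = 0.0101;                        /* 10.1 ms requested */
        int up   = (int)(seconds * 1000 + 0.99999999);  /* 11 ms: rounds up */
        int near = (int)(seconds * 1000 + 0.5);         /* 10 ms: can be short */
        printf("round-up %d ms, round-to-nearest %d ms\n", up, near);
        return 0;
    }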
Andrew Johnson (anj) wrote : | #21 |
You'd have to ask me that question back in July 2012 for my thinking back then; 0.99999999 is probably the largest double that doesn't get rounded up to 1.0 by the compiler.
I tried changing the RTEMS version of epicsEventWaitWithTimeout() the same way. I still get occasional failures like:
not ok 14 - epicsEventWaitWithTimeout(...) … (seen in three runs; details truncated)
The result is better though, previously I was getting a lot more:
ok 12 - epicsEventWaitWithTimeout(...) …
not ok 13 - epicsEventWaitWithTimeout(...) …
not ok 14 - epicsEventWaitWithTimeout(...) …
not ok 15 - epicsEventWaitWithTimeout(...) …
not ok 16 - epicsEventWaitWithTimeout(...) …
not ok 17 - epicsEventWaitWithTimeout(...) …
not ok 18 - epicsEventWaitWithTimeout(...) …
ok 19 - epicsEventWaitWithTimeout(...) …
The mean delay error is higher with that change as expected, by about 1 tick:
ok 33 - Mean delay error was 0.014291 sec
ok 33 - Mean delay error was 0.014293 sec
ok 33 - Mean delay error was 0.014296 sec
instead of
ok 33 - Mean delay error was 0.013231 sec
ok 33 - Mean delay error was 0.013224 sec
Requests of less than 1 tick generally insert a delay very close to 2 ticks (~0.019 seconds).
Note though that this test probably starts most of the delays shortly after the previous tick occurred; I wasn't using the busy-wait here that I described in
https:/
mdavidsaver (mdavidsaver) wrote : | #22 |
I'll try to summarize some thoughts and discussions of the past 2 days or so.
Some of this is (I think) reiterating what Andrew has written. I'm repeating
it to hopefully demonstrate that I understand it.
1. epicsEventWaitWithTimeout() can return early.
With RTEMS and vxWorks I think this can be understood if the integer delay N given to the underlying primitives (rtems_semaphore_obtain() and semTake()) means "return once N tick boundaries have passed".
So a timeout of N may result in times between T*(N-1) and T*N where T is the tick timer interval.
This functions like a truncation, but is separate, and cumulative to the truncation/rounding done in epicsEventWaitWithTimeout() when converting seconds to ticks.
So the minimal change to ensure epicsEventWaitWithTimeout() waits at least as long as specified (timeout after >=delay) involves a float to int conversion like:
> double delay = ...;
> unsigned ticks = tickRate * delay;
> ticks += 2;
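Why += 2 is enough under the tick-counting model above: the earliest possible wake-up is A + T*(ticks-1) seconds after the call, where A in (0, T] is the time to the next tick. A hypothetical brute-force check:

    #include <assert.h>
    #include <stdio.h>

    /* Earliest completion of a wait of 'ticks' ticks, per the model in
     * this thread: the first tick arrives after A, each later one after T. */
    static double earliestWake(double A, double T, unsigned ticks) {
        return A + T * (ticks - 1);
    }

    int main(void) {
        const double T = 1.0 / 60.0;
        for (double D = 0.001; D < 0.1; D += 0.0007) {    /* requested delays */
            for (double A = 1e-9; A <= T; A += T / 10) {  /* phase in the tick */
                unsigned ticks = (unsigned)(D / T) + 2;   /* floor + 2 */
                assert(earliestWake(A, T, ticks) >= D);   /* never early */
            }
        }
        puts("floor(D/T) + 2 never wakes early under this model");
        return 0;
    }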
2. There may be more than one issue involved.
epicsEventWaitWithTimeout() returning early is one.
The effect of this commit on the timer queue which Mark reports is (I think) in changing the delay values passed to epicsEventWaitWithTimeout().
I think this question could be answered using an RT linux or RTEMS5 system with a software DAC and an oscilloscope.
3. Windows is peculiar
The Windows impl. uses WaitForSingleObject(), which is not documented as quantizing timeouts,
but seems to behave as if it were internally doing a similar truncation.
The delays I've observed seem to be rounded down to multiples of epicsThreadSleepQuantum().
4. Quantization of monotonic time affecting epicsEventTest
epicsEventTest on Windows is sometimes reporting measured delays of exactly zero.
I think this is due to effective delays being less than the monotonic time quantum,
which is reported as 1ms for the appveyor builds. epicsEventTest tries to test down
to the 10's of microseconds.
mdavidsaver (mdavidsaver) wrote : | #23 |
To make a long story short.
I've pushed up a change to the epicsEventWaitWithTimeout() tick conversion along the lines described above:
https:/
I'd like to see the result of running epicsEventTest with this change on vxWorks and/or RTEMS 4.x before spending further time on this issue.
I'm also uncertain if the windows version can be made to meet spec.
Ben Franksen (bfrk) wrote : Re: [Bug 1861612] Re: callbackRequestDelayed not waiting for 1/60 sec on vxWorks | #24 |
> So the minimal change to ensure epicsEventWaitWithTimeout() waits at least as long as
> specified (timeout after >=delay) involves a float to int conversion
> like:
>
>> double delay = ...;
>> unsigned ticks = tickRate * delay;
>> ticks += 2;
I'd say it should be
unsigned ticks = ceil(tickRate * delay) + 1
In this way, a delay of 0.0 seconds results in an actual delay between 0.0 and one tick period (1/tickRate), which is probably the desired outcome in this case.
mdavidsaver (mdavidsaver) wrote : | #25 |
- Timing diagram (22.3 KiB, image/png)
I'm attaching an image with a timing diagram which might help to motivate this conversation.
mdavidsaver (mdavidsaver) wrote : | #26 |
@Ben, The short answer is that both Andrew and I have tried +=1 and it doesn't work.
I think I can explain the reason by referring to the Timing diagram (delay.png) I've attached to this issue.
Consider a case where D (requested timeout) is slightly less than T (tick period), and A (time from call to next tick) is almost zero.
In this case the float to int conversion floor(D/T)==0. So +=1 gives a delay of 1. This will wait until the next tick (time A) which is almost zero.
The primitives we're working with on RTEMS and vxWorks seem to respond to a requested non-zero timeout of 'd' ticks by waiting A+T*(d-1) seconds.
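The failing case in numbers (a hypothetical sketch of the model just described, with D slightly under T and the call landing just before a tick):

    #include <stdio.h>

    int main(void) {
        const double T = 1.0 / 60.0;  /* tick period */
        const double D = 0.0166;      /* requested delay, slightly under T */
        const double A = 1e-4;        /* time until the next tick */
        unsigned d = (unsigned)(D / T) + 1;   /* trunc gives 0, += 1 gives 1 */
        printf("requested %.4f s, earliest actual %.4f s\n", D, A + T * (d - 1));
        /* prints: requested 0.0166 s, earliest actual 0.0001 s -- far too early */
        return 0;
    }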
Ben Franksen (bfrk) wrote : | #27 |
I didn't mean to dispute your analysis. I think it is correct and I
understood it exactly in the way suggested by your diagram.
> The short answer is that both Andrew and I have tried +=1 and it
doesn't work.
It cannot work if you convert using trunc, which is the default "cast"
conversion between float and int.
> In this case the float to int conversion floor(D/T)==0. So +=1 gives a
delay of 1. This will wait until the next tick (time A) which is almost
zero.
I was proposing to use ceil. This will result in 1, except if the input
is exactly 0.0.
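The same scenario with the ceil() conversion proposed above (sketch; link with -lm):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        const double T = 1.0 / 60.0, D = 0.0166, A = 1e-4;
        unsigned d = (unsigned)ceil(D / T) + 1;   /* ceil(0.996) = 1, + 1 = 2 ticks */
        printf("earliest actual %.4f s >= requested %.4f s\n", A + T * (d - 1), D);
        return 0;
    }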
mdavidsaver (mdavidsaver) wrote : | #28 |
> I didn't mean to dispute your analysis.
I wasn't reading your comment this way. I hadn't clearly stated an analysis to dispute.
> In this way, a delay of 0.0 seconds
Since delay==0 has a special meaning (wait forever), let's talk about D==T. In this case floor(D/T)+2==3 and ceil(D/T)+1==2.
I chose floor()+2 (aka. truncation) in part because I don't know if vxWorks provides ceil(double), which isn't required by C89. Also, if floor()+2 doesn't pass Andrew's test, then ceil()+1 won't either. So the longer delay seems to me the place to test.
Ben Franksen (bfrk) wrote : | #29 |
Okay, fine. I just wanted to make sure you considered this as a possibility.
(And sorry for the multiple comments, I got errors when I hit the "Post Comment" button and then sent a modified version via email.)
mdavidsaver (mdavidsaver) wrote : | #30 |
@Andrew, the failures in epicsEventTest are confusing all of my appveyor build results for unrelated changes. Would you object to my marking these to be TODO when #ifdef _WIN32?
mdavidsaver (mdavidsaver) wrote : | #31 |
After some private discussions and testing, I've given up trying to fix this issue and reverted most of the changes made by 4f2228fb1d7527f
Freddie Akeroyd (freddie-akeroyd) wrote : | #32 |
If you need somebody to run tests and try a few things on a Windows system, let me know
mdavidsaver (mdavidsaver) wrote : | #33 |
Thanks for the offer Freddie. In this case the test systems I had in mind are principally RTEMS and vxWorks.
Changed in epics-base:
importance: Critical → Low
Changed in epics-base:
milestone: 7.0.3.2 → none
Andrew Johnson (anj) wrote : | #34 |
Towards the end of my tl;dr comment #18 above I wrote:
The solution to this probably has to involve reading the monotonic clock and calling semTake(id, 1) again if the OS returned before the hi-res delay has expired (we might have to do that twice).
So far I don't believe anybody has tried actually coding that fairly obvious solution for any of the OSs where this problem exists.
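For the record, a rough sketch of that obvious solution for the vxWorks implementation (hypothetical code: the epicsEventId-to-SEM_ID mapping, error handling, and the delay <= 0 path are all glossed over):

    #include <semLib.h>     /* semTake(), SEM_ID, OK */
    #include <sysLib.h>     /* sysClkRateGet() */
    #include "epicsEvent.h"
    #include "epicsTime.h"  /* epicsMonotonicGet(): nanoseconds */

    /* Wait at least 'delay' seconds: re-arm one-tick waits until the
     * monotonic clock confirms the hi-res deadline really has expired. */
    epicsEventStatus epicsEventWaitWithTimeout(epicsEventId id, double delay) {
        SEM_ID sem = (SEM_ID)id;    /* assumed mapping, see osdEvent.c */
        epicsUInt64 deadline = epicsMonotonicGet() + (epicsUInt64)(delay * 1e9);
        int ticks = (int)(delay * sysClkRateGet());
        if (ticks < 1) ticks = 1;
        for (;;) {
            if (semTake(sem, ticks) == OK)
                return epicsEventOK;            /* event was signaled */
            if (epicsMonotonicGet() >= deadline)
                return epicsEventWaitTimeout;   /* delay really expired */
            ticks = 1;                          /* woke early: one more tick */
        }
    }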