invalid TCP circuit channel instal
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
EPICS Base | Fix Released | High | Jeff Hill | | |
Bug Description
In the process of trying continuous runs of connect-
At least two threads, in this example #8 and #5, are waiting for a mutex held by the CAC-UDP thread (#11).
Meanwhile, the udpRecvThread is stuck waiting for a lock related to
tcpiiu:
This time around, I've instrumented libCom/
I added dummy vars to the start and end of epicsEventOSD:
typedef struct epicsEventOSD {
int dummy1;
pthread_
pthread_cond_t cond;
int isFull;
int dummy2;
}epicsEventOSD;
... which I initialize here:
epicsEventId epicsEventCreat
epicsEventOSD *pevent;
int status;
pevent = callocMustSucce
...
pevent->dummy1 = 1;
pevent->dummy2 = 2;
return(
}
Then I clobber all with 0xFF here:
void epicsEventDestr
/* For debugging */
memset(pevent, 0xFF, sizeof(*pevent));
free(pevent);
}
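For illustration, the canary-plus-poison instrumentation described above can be sketched as a standalone program. This is a minimal sketch and not the libCom source: `demoEvent`, `demoEventCreate`, `demoEventLooksAlive`, `demoEventPoison` and `demoEventDestroy` are hypothetical names, and the struct layout only mirrors the fields visible in this report.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for epicsEventOSD, with canaries at both ends. */
typedef struct demoEvent {
    int dummy1;              /* leading canary, set to 1 on create */
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int isFull;
    int dummy2;              /* trailing canary, set to 2 on create */
} demoEvent;

/* Allocate and initialize an event; returns NULL on failure. */
demoEvent *demoEventCreate(void)
{
    demoEvent *pevent = calloc(1, sizeof(*pevent));
    if (pevent == NULL) return NULL;
    if (pthread_mutex_init(&pevent->mutex, NULL) != 0) {
        free(pevent);
        return NULL;
    }
    if (pthread_cond_init(&pevent->cond, NULL) != 0) {
        pthread_mutex_destroy(&pevent->mutex);
        free(pevent);
        return NULL;
    }
    pevent->dummy1 = 1;
    pevent->dummy2 = 2;
    return pevent;
}

/* Returns 1 if both canaries still hold their creation values. */
int demoEventLooksAlive(const demoEvent *pevent)
{
    return pevent->dummy1 == 1 && pevent->dummy2 == 2;
}

/* Tear down the pthread objects, then overwrite the struct with 0xFF
 * so that any stale pointer sees obvious garbage. Does not free. */
void demoEventPoison(demoEvent *pevent)
{
    assert(pevent != NULL);
    pthread_cond_destroy(&pevent->cond);
    pthread_mutex_destroy(&pevent->mutex);
    memset(pevent, 0xFF, sizeof(*pevent));
}

/* Poison, then release the memory. */
void demoEventDestroy(demoEvent *pevent)
{
    demoEventPoison(pevent);
    free(pevent);
}
```

Poisoning is split from freeing only so the effect can be observed without touching freed memory. In a debugger, a poisoned struct shows every int field as -1.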
In cpiiu::
There's a lot of stuff in there that I have not tried to understand.
But the epicsEvent sendThreadFlush
(gdb) fra 9
#9 0x0000002a965a8c9d in tcpiiu:
countIn=0)
at ../tcpiiu.cpp:1872
1872 this->sendThrea
(gdb) print *this->
$9 = {dummy1 = -1, mutex = {__m_reserved = 2, __m_count = -1, __m_owner = 0xffffffffffffffff, __m_kind = -1, __m_lock = {__status = -1, __spinlock = -1}}, cond = {__c_lock = {__status = -1, __spinlock = -1}, __c_waiting = 0xffffffffffffffff, __padding = 'ø' <repeats 16 times>, __align = -1}, isFull = -1, dummy2 = -1}
***
That's all 0xFF!
So this looks like the tcpiiu sendThreadFlush
Of course this would be something that purify might find.
With valgrind, I cannot run the same test; it's just too slow to ever get anywhere.
Ernest, this time I think it would be good to have purify.
But of course we'd need it for at least 32- and 64-bit Linux, so I don't know if their licensing even allows that without paying twice.
Jeff, any obvious idea where to look for the destroyer of the sendThreadFlush
Of course I can't preclude that the engine is somehow causing this, but I don't know where to look, other than slowly adding "dummy" spacers and memset(0xff) to all destructors.
-Kay
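If the "memset(0xff) in every destructor" route were taken, the repeated part could be factored into small helpers. The following is only a sketch under that assumption: `poisonBytes`, `isPoisoned` and `poisonAndFree` are hypothetical names, not part of EPICS Base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POISON_BYTE 0xFF

/* Fill n bytes at p with the poison pattern (does not free). */
void poisonBytes(void *p, size_t n)
{
    assert(p != NULL);
    memset(p, POISON_BYTE, n);
}

/* Returns 1 if all n bytes at p equal the poison byte, else 0.
 * A zero-length region trivially counts as poisoned. */
int isPoisoned(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    size_t i;
    for (i = 0; i < n; i++) {
        if (bytes[i] != POISON_BYTE) return 0;
    }
    return 1;
}

/* Poison an allocation, then free it; NULL is ignored like free(). */
void poisonAndFree(void *p, size_t n)
{
    if (p == NULL) return;
    poisonBytes(p, n);
    free(p);
}
```

A check such as `isPoisoned(pevent, sizeof(*pevent))` at the top of a signal or wait routine would only be a debugging aid. Reading freed memory is undefined behaviour, the allocator may reuse the block before the stale pointer is dereferenced, and an object whose bytes are legitimately all 0xFF would be a false positive.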
Thread 13 (Thread 1074010464 (LWP 18272)):
#0 0x00000032343088da in pthread_
#1 0x0000002a9671a23d in condWait (condId=0x56c6b0,
mutexId=0x56c688) at ../../.
#2 0x0000002a9671a591 in epicsEventWait (pevent=0x56c680) at ../../.
#3 0x0000002a967130f5 in epicsEvent::wait (this=0x56c458) at ../../.
#4 0x0000002a9670ff52 in ipAddrToAsciiEn
(this=0x56c010) at ../../.
ipAddrToAsciiAs
#5 0x0000002a967113b9 in epicsThreadCall
Thread 12 (Thread 1074542944 (LWP 18274)):
#0 0x000000323430ad1b in __lll_mutex_
#1 0x000000000056b6d8 in ?? ()
#2 0x0000000000000000 in ?? ()
Thread 11 (Thread 1074809184 (LWP 18287)):
#0 0x000000323430ad1b in __lll_mutex_
#1 0x0000000000705d20 in ?? ()
#2 0x0000000000705d20 in ?? ()
#3 0x0000003234307b04 in pthread_mutex_lock () from /lib64/tls/libpthread.so.0
#4 0x000000000056b790 in ?? ()
#5 0x0000000040103a70 in ?? ()
#6 0x0000002a96712bb4 in epicsMutex::lock (this=0x6bd348) at ../../.
#7 0x0000002a9671a441 in epicsEventSignal (pevent=0x6bd340) at ../../.
#8 0x0000002a967130da in epicsEvent::signal (this=0x6b5ff8) at ../../.
#9 0x0000002a965a8c9d in tcpiiu:
countIn=0) at ../tcpiiu.cpp:1872
#10 0x0000002a9658a3e2 in cac::transferCh
#11 0x0000002a965a125f in udpiiu:
udpiiu.cpp:665
#12 0x0000002a965a16b1 in udpiiu::postMsg (this=0x725350, net_addr=
currentTime=
#13 0x0000002a965a0cb2 in udpRecvThread::run (this=0x735780) at ../udpiiu.cpp:380
#14 0x0000002a967113b9 in epicsThreadCall
Thread 10 (Thread 1079310688 (LWP 18317)):
#0 0x00000032343088da in pthread_
#1 0x0000002a9671a23d in condWait (condId=0x5bcab0,
mutexId=0x5bca88) at ../../.
#2 0x0000002a9671a591 in epicsEventWait (pevent=0x5bca80) at ../../.
#3 0x0000002a96702cd9 in errlogThread () at ../../.
error/errlog.c:468
Thread 9 (Thread 1087519072 (LWP 18355)):
#0 0x00000032338bebe6 in __select_nocancel () from /lib64/
#1 0x000000000041b9d2 in HTTPServer::run (this=0x2a96a36440) at ../HTTPServer.cpp:154
#2 0x0000002a967113b9 in epicsThreadCall
(pPvt=0x2a96a36448) at ../../.
*** Waiting for mutex held by CAC-UDP (thread 11)
Thread 8 (Thread 1075865952 (LWP 14383)):
#0 0x000000323430ad1b in __lll_mutex_
#1 0x000000000056b6d8 in ?? ()
#2 0x0000000000000010 in ?? ()
#3 0x0000002a9671a0c1 in epicsMutexOsdLock (pmutex=0x56b750) at ../../.
#4 0x0000002a967128f5 in epicsMutexLock (pmutexNode=
#5 0x0000002a96712bb4 in epicsMutex::lock (this=0x56b6d8) at ../../.
#6 0x0000002a9621ca3d in epicsGuard<
#7 0x0000002a96593c55 in ca_create_channel (name_str=0x776e00 "CCL_Diag:
puser=0x776b90, priority=20, chanptr=0x40205b80) at ../access.cpp:315
#8 0x0000000000424731 in ProcessVariable
guard=@0x40205bd0) at ../ProcessVaria
#9 0x000000000041f5d1 in SampleMechanism
guard=@0x40205c50) at ../SampleMechan
guard=@0x40205cb0) at ../ArchiveChann
#11 0x000000000040bcbc in Engine::start (this=0x56afe0,
engine_
#12 0x00000000004109d4 in restart (connection=
#13 0x000000000041d304 in HTTPClientConne
(this=0x113e220) at ../HTTPServer.
#14 0x000000000041cebb in HTTPClientConne
(this=0x113e220) at ../HTTPServer.
#15 0x000000000041c9e6 in HTTPClientConne
#16 0x0000002a967113b9 in epicsThreadCall
Thread 7 (Thread 1078778208 (LWP 14385)):
#0 0x000000323430ad1b in __lll_mutex_
#1 0x000000000056b6d8 in ?? ()
#2 0x0000000000000000 in ?? ()
Thread 6 (Thread 1076926816 (LWP 14387)):
#0 0x00000032343088da in pthread_
#1 0x0000002a9671a23d in condWait (condId=0xe3a8f0,
mutexId=0xe3a8c8) at ../../.
#2 0x0000002a9671a591 in epicsEventWait (pevent=0xe3a8c0) at ../../.
#3 0x0000002a967130f5 in epicsEvent::wait (this=0x6b4e00) at ../../.
#4 0x0000002a965a2c29 in tcpSendThread::run (this=0x6b4be0) at ../tcpiiu.cpp:85
#5 0x0000002a967113b9 in epicsThreadCall
*** Also waiting for mutex held by CAC-UDP (thread 11)
Thread 5 (Thread 1082222944 (LWP 14389)):
#0 0x000000323430ad1b in __lll_mutex_
#1 0x000000000056b6d8 in ?? ()
#2 0x0000000000000063 in ?? ()
#3 0x0000002a9671a0c1 in epicsMutexOsdLock (pmutex=0x56b750) at ../../.
#4 0x0000002a967128f5 in epicsMutexLock (pmutexNode=
#5 0x0000002a96712bb4 in epicsMutex::lock (this=0x56b6d8) at ../../.
#6 0x0000002a9621ca3d in epicsGuard<
#7 0x0000002a965b4302 in ca_field_type (pChan=0x6eba50) at ../
oldChannelNotif
#8 0x0000000000425919 in ProcessVariable
#9 0x0000002a965b296b in oldChannelNotif
#11 0x0000002a9658bcd7 in cac::createChan
cac.cpp:1062
#12 0x0000002a9658c06e in cac::executeRes
#13 0x0000002a965a6d3f in tcpiiu:
#14 0x0000002a965a3dff in tcpRecvThread::run (this=0x6b3990) at ../tcpiiu.cpp:530
#15 0x0000002a967113b9 in epicsThreadCall
Thread 4 (Thread 1090697568 (LWP 14391)):
#0 0x00000032343088da in pthread_
#1 0x0000002a9671a23d in condWait (condId=0xe36760,
mutexId=0xe36738) at ../../.
#2 0x0000002a9671a591 in epicsEventWait (pevent=0xe36730) at ../../.
#3 0x0000002a967130f5 in epicsEvent::wait (this=0x6b3c08) at ../../.
#4 0x0000002a965a2c29 in tcpSendThread::run (this=0x6b39e8) at ../tcpiiu.cpp:85
#5 0x0000002a967113b9 in epicsThreadCall
Thread 3 (Thread 1092553056 (LWP 13156)):
#0 0x000000323430ad1b in __lll_mutex_
#1 0x000000000056e9d8 in ?? ()
#2 0x0000003233a2f620 in __malloc_
libc.so.6
#3 0x0000003234307b1f in pthread_mutex_lock () from /lib64/tls/libpthread.so.0
#4 0x000000000056e9b0 in ?? ()
#5 0x00000000411efba0 in ?? ()
#6 0x0000000000443d06 in ~epicsMutexGuard (this=0x56b1b8) at ../Guard.h:30
#7 0x0000002a9671a0c1 in epicsMutexOsdLock (pmutex=0x56b1b0) at ../../.
#8 0x0000002a967128f5 in epicsMutexLock (pmutexNode=
#9 0x00000000004439b3 in OrderedMutex::lock (this=0x56aff0,
file=0x449593 "../EngineServe
260
#10 0x00000000004416bc in Guard::lock (this=0x411efd30, file=0x449593 "../EngineServe
#11 0x00000000004099b5 in Guard (this=0x411efd30, file=0x449593 "../EngineServer.cpp", line=593, guardable=
include/Guard.h:67
#12 0x0000000000410990 in restart (connection=
#13 0x000000000041d304 in HTTPClientConne
(this=0x11709d0) at ../HTTPServer.
#14 0x000000000041cebb in HTTPClientConne
(this=0x11709d0) at ../HTTPServer.
#15 0x000000000041c9e6 in HTTPClientConne
#16 0x0000002a967113b9 in epicsThreadCall
Thread 2 (Thread 1088575840 (LWP 12394)):
#0 0x000000323430ad1b in __lll_mutex_
#1 0x000000000056e9d8 in ?? ()
#2 0x0000003233a2f620 in __malloc_
libc.so.6
#3 0x0000003234307b1f in pthread_mutex_lock () from /lib64/tls/libpthread.so.0
#4 0x000000000056e9b0 in ?? ()
#5 0x0000000040e24ba0 in ?? ()
#6 0x0000000000443d06 in ~epicsMutexGuard (this=0x56b1b8) at ../Guard.h:30
#7 0x0000002a9671a0c1 in epicsMutexOsdLock (pmutex=0x56b1b0) at ../../.
#8 0x0000002a967128f5 in epicsMutexLock (pmutexNode=
#9 0x00000000004439b3 in OrderedMutex::lock (this=0x56aff0,
file=0x449593 "../EngineServe
260
#10 0x00000000004416bc in Guard::lock (this=0x40e24d30, file=0x449593 "../EngineServe
#11 0x00000000004099b5 in Guard (this=0x40e24d30, file=0x449593 "../EngineServer.cpp", line=593, guardable=
include/Guard.h:67
#12 0x0000000000410990 in restart (connection=
#13 0x000000000041d304 in HTTPClientConne
(this=0x1171270) at ../HTTPServer.
#14 0x000000000041cebb in HTTPClientConne
(this=0x1171270) at ../HTTPServer.
#15 0x000000000041c9e6 in HTTPClientConne
#16 0x0000002a967113b9 in epicsThreadCall
Thread 1 (Thread 182914034624 (LWP 18263)):
#0 0x000000323430ad1b in __lll_mutex_
#1 0x000000000056e9d8 in ?? ()
#2 0x0000000000000000 in ?? ()
Original Mantis Bug: mantis-258
http://
possibly related to #257