DTM UnknownTransaction exception messages flood the system and greatly impact the Trafodion installation

Bug #1439387 reported by Joanie Cooper
This bug affects 1 person
Affects: Trafodion
Status: Fix Committed
Importance: High
Assigned to: Joanie Cooper

Bug Description

Hundreds of thousands of DTM UnknownTransaction exception messages were generated, flooding the system and greatly degrading performance on the Trafodion instance.

Additional information is being gathered to determine whether a corner case in the DTM enters a retry loop that produces the repeated messages.

Joanie Cooper (joanie-cooper) wrote :

This problem occurred on the bronto[05-08] systems, running a CDH 5.3 Hadoop installation and the Trafodion R1.1 032915 build.

Changed in trafodion:
status: New → In Progress
Joanie Cooper (joanie-cooper) wrote :

Analysis of the trafodion.dtm.log on bronto05 shows that there is a window in which an UnknownTransactionException can be received for abort and commit TransactionManager requests.

449039 2015-04-01 16:54:30,308 ERROR transactional.TransactionManager: Abort HasException true: java.io.IOException: UnknownTransactionException
449040 2015-04-01 16:54:30,309 ERROR transactional.TransactionManager: Abort HasException true: java.io.IOException: UnknownTransactionException
449041 2015-04-01 16:54:30,310 ERROR transactional.TransactionManager: Abort HasException true: java.io.IOException: UnknownTransactionException
449042 2015-04-01 16:54:30,314 ERROR transactional.TransactionManager: Abort HasException true: java.io.IOException: UnknownTransactionException
449043 2015-04-01 17:00:18,959 INFO dtm.HBaseTxClient: useForgotten is true
449044 2015-04-01 17:00:18,959 INFO dtm.HBaseTxClient: forceForgotten is false
449045 2015-04-01 17:00:18,984 INFO dtm.TmAuditTlog: forceControlPoint is false
449046 2015-04-01 17:00:18,984 INFO dtm.TmAuditTlog: useAutoFlush is false
449047 2015-04-01 17:00:18,984 INFO dtm.TmAuditTlog: ageCommitted is false
449048 2015-04-01 17:00:18,984 INFO dtm.TmAuditTlog: disableBlockCache is false
449049 2015-04-01 17:00:19,046 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
449050 2015-04-01 17:00:21,002 INFO dtm.HBaseAuditControlPoint: disableBlockCache is false
449051 2015-04-01 17:00:21,004 INFO dtm.HBaseAuditControlPoint: useAutoFlush is false
449052 2015-04-01 18:25:58,248 ERROR transactional.TransactionManager: doAbortX, received incorrect result size: 0
449053 2015-04-01 18:25:58,250 ERROR transactional.TransactionManager: doAbortX, received incorrect result size: 0
449054 2015-04-01 18:25:59,064 ERROR transactional.TransactionManager: doAbortX, received incorrect result size: 0
449055 2015-04-01 18:25:59,065 ERROR transactional.TransactionManager: doAbortX, received incorrect result size: 0
449056 2015-04-01 18:25:59,880 ERROR transactional.TransactionManager: doAbortX, received incorrect result size: 0
449057 2015-04-01 18:25:59,880 ERROR transactional.TransactionManager: doAbortX, received incorrect result size: 0
449058 2015-04-01 19:22:21,209 INFO dtm.HBaseTxClient: useForgotten is true
449059 2015-04-01 19:22:21,210 INFO dtm.HBaseTxClient: forceForgotten is false
449060 2015-04-01 19:22:21,240 INFO dtm.TmAuditTlog: forceControlPoint is false
449061 2015-04-01 19:22:21,240 INFO dtm.TmAuditTlog: useAutoFlush is false
449062 2015-04-01 19:22:21,240 INFO dtm.TmAuditTlog: ageCommitted is false
449063 2015-04-01 19:22:21,240 INFO dtm.TmAuditTlog: disableBlockCache is false
449064 2015-04-01 19:22:21,327 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
449065 2015-04-01 19:22:23,543 INFO dtm.HBaseAuditControlPoint: disableBlockCache is false
449066 2015-04-01 19:22:23,546 INFO dtm.HBaseAuditControlPoint: useAutoFlush is false
449067 2015-04-02 17:18:14,651 ERROR transactional.TransactionManager: doCommitX, received incorrect result size: 0
449068 2015-04-02 17:18:14,691 ERROR dtm.TmAuditTlog: deleteAgedEntr...

Joanie Cooper (joanie-cooper) wrote :

A change will be made to increment the retry counter in both cases, whether the refresh variable is true or false, in the TransactionManager methods.
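
For illustration only, here is a minimal sketch of the kind of bounded retry loop this change describes. It is not the actual TransactionManager code: the class name, the RegionCall interface, the MAX_RETRIES value and the failedAttempts method are all hypothetical, and the meaning of the refresh flag (whether to refresh region information before retrying) is an assumption. The point it shows is that the counter is incremented on every failed attempt, so the loop terminates regardless of the flag's value.

```java
// Hypothetical sketch; none of these names come from the Trafodion source.
class BoundedRetrySketch {
    // Assumed retry limit, chosen only for the example.
    static final int MAX_RETRIES = 3;

    // Stand-in for an abort or commit request that may fail with an IOException.
    interface RegionCall {
        void invoke() throws java.io.IOException;
    }

    // Runs the call, retrying on IOException, and returns the number of failed attempts.
    static int failedAttempts(RegionCall call, boolean refresh) {
        int retryCount = 0;
        boolean retry;
        do {
            retry = false;
            try {
                call.invoke();
            } catch (java.io.IOException e) {
                if (refresh) {
                    // Placeholder: a real implementation might refresh
                    // cached region information here before retrying.
                }
                // Incremented whether refresh is true or false,
                // so the loop is always bounded by MAX_RETRIES.
                retryCount++;
                retry = retryCount < MAX_RETRIES;
            }
        } while (retry);
        return retryCount;
    }

    public static void main(String[] args) {
        RegionCall alwaysFails = () -> {
            throw new java.io.IOException("UnknownTransactionException");
        };
        System.out.println(failedAttempts(alwaysFails, true));
        System.out.println(failedAttempts(alwaysFails, false));
    }
}
```

In this sketch a request that always fails is abandoned after MAX_RETRIES attempts for either value of refresh. If the increment sat only inside the refresh branch, the other case would retry indefinitely and log an error on every pass.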

Changed in trafodion:
status: In Progress → Fix Committed