Memory leak in TRX causes heap to get used up in the hbase regionserver.
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Trafodion | Fix Committed | Critical | Joanie Cooper | r1.0
Bug Description
Running longevity tests on a system revealed that we have a memory leak in the TRX code that causes the hbase regionserver heap to accumulate structures.
The Java Flight Recorder heap section only shows classes that take up more than 0.5% of the total heap; since our heap is so large (32 GB), it takes a while for our classes to show up.
The first object count capture I did was at 36 mins into the run.
Class | Instances | Size (bytes) | Percentage of Heap (%)
---|---|---|---
byte[] | 7,056,680 | 10,569,632,596 | 93.782
java.util.TreeMap | 1,465,205 | 70,329,828 | 0.624

(the remaining org.apache.* and java.util.* rows in this capture were truncated)
-------
At 127 mins into the run:
Class | Instances | Size (bytes) | Percentage of Heap (%)
---|---|---|---
byte[] | 19,458,266 | 12,997,350,664 | 88.254
java.util.TreeMap | 4,293,451 | 206,085,636 | 1.399
java.lang.Object[] | 2,693,893 | 102,438,228 | 0.696

(the remaining org.apache.* and java.util.* rows in this capture were truncated)
-------
At 203 mins into the run (notice that explicit transaction-related classes now appear):
Class | Instances | Size (bytes) | Percentage of Heap (%)
---|---|---|---
byte[] | 28,810,276 | 13,622,665,676 | 80.878
java.util.TreeMap | 6,455,446 | 309,861,408 | 1.84
java.lang.Object[] | 4,043,772 | 151,503,184 | 0.899
java.lang.Long | 4,375,168 | 105,004,044 | 0.623

(the remaining org.apache.* and java.util.* rows in this capture were truncated)
-------
At 335 mins into the run:
Class | Instances | Size (bytes) | Percentage of Heap (%)
---|---|---|---
byte[] | 46,075,938 | 14,785,540,816 | 73.006
java.util.TreeMap | 10,275,938 | 493,245,048 | 2.435
java.lang.Object[] | 6,453,036 | 243,666,964 | 1.203
java.lang.Long | 6,884,497 | 165,227,928 | 0.816

(the remaining org.apache.* and java.util.* rows in this capture were truncated)
NOTE: In the table above, many classes are in the 3.3 million instance range. Are those all related to each other? And some are in the 6 million instance range, which could be viewed as two instances for every one of the 3.3 million objects. Again, are those related?
-------
And the last object count I took, at 463 mins into the run:
Class | Instances | Size (bytes) | Percentage of Heap (%)
---|---|---|---
byte[] | 60,097,314 | 15,249,837,392 | 65.914
java.util.TreeMap | 13,563,104 | 651,028,968 | 2.814
java.lang.Object[] | 8,428,334 | 315,143,268 | 1.362
java.lang.Long | 9,104,852 | 218,516,460 | 0.944
java.util.TreeSet | 9,001,684 | 144,026,952 | 0.623

(the remaining org.apache.* and java.util.* rows in this capture were truncated)
Changed in trafodion:
milestone: none → r1.0
assignee: nobody → Joanie Cooper (joanie-cooper)
status: New → In Progress
I’ve been researching two hbase-trx coprocessor memory leaks: one in the TrxTransactionState object and one in the TransactionalRegionScannerHolder object.
For the TrxTransactionState memory leak, we were a little too conservative in cleaning up dependent objects. I have added additional code to clean up the TrxTransactionState objects when the transaction state is ABORT or READ_ONLY. This fix was submitted with git job 955.
Git job 955 is having problems passing seabase gate tests due to an intermittent
failure in seabase/TEST010.
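As a rough illustration of the cleanup path described above (a sketch only, with stand-in class and member names such as DemoTransactionState, writeSet, and dependents rather than the actual TrxTransactionState fields):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for TrxTransactionState. The real class holds far
// more state (write sets, scan ranges, dependent transactions, etc.).
class DemoTransactionState {

    enum Status { PENDING, COMMITTED, ABORTED, READ_ONLY }

    private Status status = Status.PENDING;
    // Hypothetical dependent structures that pin byte[]/TreeMap data on the heap.
    private List<byte[]> writeSet = new ArrayList<>();
    private List<DemoTransactionState> dependents = new ArrayList<>();

    void setStatus(Status newStatus) {
        this.status = newStatus;
        // The idea behind the fix: once a transaction ends up aborted or
        // read-only it can no longer conflict with anything, so its dependent
        // structures can be released right away instead of waiting for the
        // more conservative cleanup path.
        if (newStatus == Status.ABORTED || newStatus == Status.READ_ONLY) {
            clearDependentState();
        }
    }

    private void clearDependentState() {
        // Dropping these references lets the regionserver GC reclaim the
        // structures this transaction was holding onto.
        writeSet = new ArrayList<>();
        dependents = new ArrayList<>();
    }

    Status getStatus() { return status; }
}
```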
The second memory leak is in the TransactionalRegionScannerHolder object. This object is created in the TrxRegionEndpoint coprocessor when an "openScanner" call is made and is placed on a per-region scanners map. It stays alive and is used by "performScan". The object is released and removed from the scanners map when a "closeScanner" call is made.
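A minimal sketch of that open/perform/close bookkeeping follows; the names (ScannerRegistrySketch, ScannerHolder) are hypothetical and only approximate the hbase-trx structures:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of the scanner-holder bookkeeping. These are not the
// actual hbase-trx signatures, just the shape of the lifecycle.
class ScannerRegistrySketch {

    static class ScannerHolder {
        final long scannerId;
        final long transactionId;
        final long createTimeMillis = System.currentTimeMillis();

        ScannerHolder(long scannerId, long transactionId) {
            this.scannerId = scannerId;
            this.transactionId = transactionId;
        }
    }

    // One map per region in the real coprocessor; a single map here.
    private final Map<Long, ScannerHolder> scanners = new ConcurrentHashMap<>();
    private final AtomicLong nextScannerId = new AtomicLong();

    // "openScanner": create a holder and register it on the map.
    long openScanner(long transactionId) {
        long id = nextScannerId.incrementAndGet();
        scanners.put(id, new ScannerHolder(id, transactionId));
        return id;
    }

    // "performScan": look the holder up and use it; it stays registered.
    ScannerHolder performScan(long scannerId) {
        return scanners.get(scannerId);
    }

    // "closeScanner": the only place the holder is released. If this call
    // never arrives, the holder stays in the map forever -- the leak
    // described below.
    void closeScanner(long scannerId) {
        scanners.remove(scannerId);
    }

    int openScannerCount() {
        return scanners.size();
    }
}
```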
I added some extra line tracing and can see a possible leak.
It looks like for some transactions we perform the full "openScanner", "performScan", and "closeScanner" series of calls. This clears the TransactionalRegionScannerHolder object out of the scanners map.
For others, we only perform an "openScanner". We never seem to get a "performScan" or "closeScanner" call.
This leaves the TransactionalRegionScannerHolder scanners map constantly growing, holding objects that can never be cleaned up. This would definitely have performance implications.
We are looking at the hbase-trx endpoint coprocessor client/server interaction for a change in behavior.
However, scanners are normally managed and cleaned using leases. We do not use leases with our scanners. We allow the underlying hbase scanner to be managed by the client scanner lease timeout. If we do have TransactionalRegionScannerHolder scanners that are left stranded, we have no mechanism to clean them out.
I've added a cleaner method to the hbase-trx coprocessor endpoint chore thread. This tests for stale
hbase-trx scanners and cleans them at regular intervals.
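A rough sketch of what such a chore-style sweep can look like, using a plain ScheduledExecutorService and an assumed 10-minute staleness threshold in place of the actual coprocessor chore thread and its configuration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of a periodic cleaner for stranded scanner holders. The real fix
// hangs this off the TrxRegionEndpoint chore thread; the threshold and the
// data structure here are illustrative only.
class StaleScannerCleanerSketch {

    // scannerId -> last-touched timestamp (millis); stands in for the
    // per-region scanners map of holder objects.
    private final Map<Long, Long> scannerLastTouched = new ConcurrentHashMap<>();

    // Scanners idle longer than this are treated as stranded (assumed value).
    private static final long STALE_THRESHOLD_MS = TimeUnit.MINUTES.toMillis(10);

    private final ScheduledExecutorService chore =
            Executors.newSingleThreadScheduledExecutor();

    void start() {
        // Run the sweep at a regular interval, like the coprocessor chore thread.
        chore.scheduleAtFixedRate(this::sweepStaleScanners, 1, 1, TimeUnit.MINUTES);
    }

    // Record activity for a scanner (open or performScan).
    void touch(long scannerId) {
        scannerLastTouched.put(scannerId, System.currentTimeMillis());
    }

    // Remove any scanner that has not been touched within the threshold,
    // so holders whose closeScanner call never arrives cannot accumulate.
    void sweepStaleScanners() {
        long now = System.currentTimeMillis();
        scannerLastTouched.entrySet()
                .removeIf(e -> now - e.getValue() > STALE_THRESHOLD_MS);
    }

    void stop() {
        chore.shutdownNow();
    }
}
```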
This could be directly correlated with the recently observed increase in hbase scanners experiencing client scanner timeouts.
These changes are being tested with the performance team on zircon4.