Activity log for bug #545943

Date Who What changed Old value New value Message
2010-03-24 12:01:17 wweeks bug added bug
2010-03-24 12:01:45 wweeks infinidb: importance Undecided High
2011-02-16 22:48:41 Robert Adams infinidb: status New In Progress
2012-06-13 20:29:19 Rajkumar description (old and new values carry the same text, shown once here)

A memory leak appears with the following query, which joins 300 million rows by 374 million rows. The small side consumes 35.4 bytes per row, 10.6GB in total, so with peak memory on the first run at ~11GB the overall memory footprint is understandable. However, on subsequent runs the memory increases and eventually either 1) causes an ExeMgr restart for a UM join, or 2) leaves the system unresponsive due to swapping on a PM join.

<PmMaxMemorySmallSide>1G</PmMaxMemorySmallSide>
<TotalUmMaxMemorySmallSide>14G</TotalUmMaxMemorySmallSide>
<MaxMemoryPerUnion>4G</MaxMemoryPerUnion>
<TotalUnionMemory>8G</TotalUnionMemory>
<MaxMemory>10G</MaxMemory> <!-- Map and aggregate result RowGroups -->

mysql> select count(*) from customer;
+-----------+
| count(*)  |
+-----------+
| 300000000 |
+-----------+
1 row in set (3.42 sec)

mysql> select count(*) from lineorder where lo_orderdate <= 19920115;
+-----------+
| count(*)  |
+-----------+
| 374065744 |
+-----------+
1 row in set (7.18 sec)

mysql> select calsetparms('ummaxmemorysmallside', '11g');
+--------------------------------------------+
| calsetparms('ummaxmemorysmallside', '11g') |
+--------------------------------------------+
| Updated ummaxmemorysmallside 11g           |
+--------------------------------------------+
1 row in set (0.02 sec)

mysql> select count(*) from customer, lineorder where lo_custkey = c_custkey and lo_orderdate <= 19920115;
+-----------+
| count(*)  |
+-----------+
| 374065744 |
+-----------+
1 row in set (1 min 11.76 sec)

mysql> select count(*) from customer, lineorder where lo_custkey = c_custkey and lo_orderdate <= 19920115;
+-----------+
| count(*)  |
+-----------+
| 374065744 |
+-----------+
1 row in set (1 min 8.73 sec)

mysql> select count(*) from customer, lineorder where lo_custkey = c_custkey and lo_orderdate <= 19920115;
ERROR 122 (HY000): There was an internal error encountered in the Calpont Engine while processing this query. The query was cancelled. You may resubmit it if you like. The error is lost Connection to ExeMgr. If the problem continues, please contact your system administrator.

# top | grep ExeMgr
22386 root 18 -1 845m 265m 40m S 54 1.7 0:04.41 ExeMgr
22386 root 18 -1 1914m 1.2g 40m S 161 7.6 0:09.26 ExeMgr
22386 root 18 -1 2585m 1.9g 40m S 145 11.9 0:13.62 ExeMgr
22386 root 18 -1 3226m 2.6g 40m S 126 16.3 0:17.41 ExeMgr
22386 root 18 -1 4368m 3.5g 40m S 136 22.1 0:21.51 ExeMgr
22386 root 18 -1 5836m 5.0g 40m S 116 31.7 0:25.01 ExeMgr
22386 root 18 -1 5591m 4.8g 40m S 113 30.6 0:28.41 ExeMgr
22386 root 18 -1 6359m 5.6g 40m S 121 35.6 0:32.06 ExeMgr
22386 root 18 -1 7063m 6.3g 40m S 122 40.2 0:35.72 ExeMgr
22386 root 18 -1 9713m 8.4g 40m S 109 53.4 0:39.01 ExeMgr
22386 root 18 -1 9713m 8.9g 40m S 100 56.8 0:42.02 ExeMgr
22386 root 18 -1 9141m 8.3g 40m S 115 52.9 0:45.49 ExeMgr
22386 root 18 -1 9.8g 9.0g 40m S 111 57.3 0:48.83 ExeMgr
22386 root 18 -1 10.5g 9.6g 40m S 103 61.3 0:51.92 ExeMgr
22386 root 18 -1 11.3g 10g 40m S 102 65.3 0:54.98 ExeMgr
22386 root 18 -1 12.0g 10g 40m S 102 69.0 0:58.05 ExeMgr
22386 root 18 -1 12.9g 11g 40m S 106 72.8 1:01.24 ExeMgr
22386 root 18 -1 13.1g 11g 40m S 799 73.6 1:25.25 ExeMgr
22386 root 18 -1 13.1g 11g 40m S 797 73.9 1:49.23 ExeMgr
22386 root 18 -1 13.1g 11g 40m S 798 74.2 2:13.23 ExeMgr
22386 root 18 -1 13.1g 11g 40m S 796 74.4 2:37.17 ExeMgr
22386 root 18 -1 13.1g 11g 40m S 798 74.7 3:01.16 ExeMgr
22386 root 18 -1 13.2g 11g 40m S 797 74.7 3:25.13 ExeMgr
22386 root 18 -1 13.2g 11g 40m S 797 75.1 3:49.10 ExeMgr
22386 root 18 -1 12.9g 11g 7044 S 476 73.7 4:03.41 ExeMgr
22386 root 18 -1 12.9g 11g 7044 S 100 73.7 4:06.42 ExeMgr
22386 root 18 -1 12.9g 11g 7044 S 100 73.7 4:09.42 ExeMgr
22386 root 18 -1 12.9g 11g 7044 S 100 73.7 4:12.43 ExeMgr
22386 root 18 -1 9.9g 8.5g 7044 S 39 54.5 4:13.59 ExeMgr
-- WAITED UNTIL MEMORY STABILIZED BEFORE ISSUING NEXT QUERY
22386 root 18 -1 10.0g 8.6g 40m S 79 54.7 4:15.95 ExeMgr
22386 root 18 -1 10.3g 8.8g 40m S 133 56.4 4:19.95 ExeMgr
22386 root 18 -1 10.9g 9.6g 40m S 140 61.2 4:24.17 ExeMgr
22386 root 18 -1 11.0g 9.6g 40m S 119 61.4 4:27.74 ExeMgr
22386 root 18 -1 12.1g 10g 40m S 122 66.9 4:31.42 ExeMgr
22386 root 18 -1 11.6g 10g 40m S 105 65.1 4:34.59 ExeMgr
22386 root 18 -1 11.6g 10g 40m S 122 65.4 4:38.27 ExeMgr
22386 root 18 -1 11.8g 10g 40m S 119 66.4 4:41.86 ExeMgr
22386 root 18 -1 14.4g 12g 40m S 110 79.2 4:45.16 ExeMgr
22386 root 18 -1 14.4g 13g 40m S 100 83.0 4:48.16 ExeMgr
22386 root 18 -1 13.5g 12g 40m S 115 76.7 4:51.62 ExeMgr
22386 root 18 -1 13.5g 12g 40m S 106 77.0 4:54.80 ExeMgr
22386 root 18 -1 13.9g 12g 40m S 103 80.0 4:57.91 ExeMgr
22386 root 18 -1 14.5g 13g 40m S 101 83.3 5:00.96 ExeMgr
22386 root 18 -1 15.0g 13g 40m S 102 87.0 5:04.03 ExeMgr
22386 root 18 -1 15.5g 14g 40m S 277 89.9 5:12.36 ExeMgr
22386 root 18 -1 15.7g 14g 40m S 799 90.9 5:36.38 ExeMgr
22386 root 18 -1 15.8g 14g 40m S 799 91.6 6:00.42 ExeMgr
22386 root 18 -1 15.8g 14g 40m S 799 91.7 6:24.46 ExeMgr
22386 root 18 -1 15.8g 14g 40m S 799 91.6 6:48.48 ExeMgr
22386 root 18 -1 15.8g 14g 40m S 799 92.0 7:12.51 ExeMgr
22386 root 18 -1 15.8g 14g 40m S 799 92.2 7:36.54 ExeMgr
22386 root 18 -1 15.8g 14g 40m S 799 92.1 8:00.57 ExeMgr
22386 root 18 -1 15.4g 14g 7044 S 378 91.2 8:11.93 ExeMgr
22386 root 18 -1 15.4g 14g 7044 S 100 91.2 8:14.93 ExeMgr
22386 root 18 -1 15.4g 14g 7044 S 100 91.2 8:17.94 ExeMgr
22386 root 18 -1 15.4g 14g 7044 S 100 91.2 8:20.94 ExeMgr
22386 root 18 -1 12.6g 11g 7044 S 37 73.5 8:22.06 ExeMgr
-- WAITED UNTIL MEMORY STABILIZED BEFORE ISSUING NEXT QUERY
22386 root 18 -1 13.0g 11g 40m S 121 75.1 8:25.70 ExeMgr
22386 root 18 -1 13.2g 11g 40m S 120 76.5 8:29.32 ExeMgr
22386 root 18 -1 13.6g 12g 40m S 155 79.2 8:33.97 ExeMgr
22386 root 18 -1 13.3g 12g 40m S 121 77.5 8:37.62 ExeMgr
22386 root 18 -1 14.5g 13g 40m S 118 85.2 8:41.18 ExeMgr
22386 root 18 -1 14.0g 12g 40m S 109 82.0 8:44.47 ExeMgr
22386 root 18 -1 14.4g 13g 40m S 121 84.9 8:48.11 ExeMgr
22386 root 18 -1 14.9g 13g 40m S 119 88.3 8:51.69 ExeMgr
23493 root 20 -1 331m 7080 5276 S 0 0.0 0:00.01 ExeMgr
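
The arithmetic behind the description can be checked with a short script. What follows is only a sketch: the helper names (`parse_top_size`, `res_gib`) are hypothetical, and it assumes the usual `top` convention that a memory field with no suffix is in KiB, `m` is MiB and `g` is GiB. It recomputes the expected small-side footprint (300 million rows at 35.4 bytes each) and compares the resident size of ExeMgr at the two samples taken just before each "WAITED UNTIL MEMORY STABILIZED" marker. Because `top` rounds large values (for example `11g`), the result is coarse.

```python
def parse_top_size(field: str) -> float:
    """Convert a top memory field to GiB.

    Assumption: no suffix means KiB, 'm' means MiB, 'g' means GiB.
    """
    factors = {"m": 1.0 / 1024, "g": 1.0}
    suffix = field[-1].lower()
    if suffix in factors:
        return float(field[:-1]) * factors[suffix]
    return float(field) / (1024 * 1024)


def res_gib(top_line: str) -> float:
    """Return the RES column of one `top` line in GiB.

    Column order assumed from the log:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    return parse_top_size(top_line.split()[5])


# Expected small-side footprint, from the figures in the description.
ROWS = 300_000_000
BYTES_PER_ROW = 35.4
expected_gb = ROWS * BYTES_PER_ROW / 1e9  # decimal gigabytes

# Last sample of run 1 and run 2, copied from the top output in the log.
settled_samples = [
    "22386 root 18 -1 9.9g 8.5g 7044 S 39 54.5 4:13.59 ExeMgr",
    "22386 root 18 -1 12.6g 11g 7044 S 37 73.5 8:22.06 ExeMgr",
]
baselines = [res_gib(line) for line in settled_samples]
growth = baselines[1] - baselines[0]

print(f"expected small side: {expected_gb:.2f} GB")
print(f"settled RES after each run (GiB): {baselines}")
print(f"memory retained between runs (GiB): {growth}")
```

If the memory were fully released after each query, the two settled values would be roughly equal; a steadily rising settled value is what the report describes as the leak.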