An ORDER BY query returns wrong result
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Trafodion |
Fix Released
|
Critical
|
Hans Zeller |
Bug Description
The following order by query returns wrong result in the v1.1.0rc0 (v0410) build. This query has been returning the correct result ever since the Trafodion early days all the way until the v0407 build. It is a regression introduced between the v0407 build and the r1.1.0rc0 (v0410) build.
Bellow you can find the execution results in v0407 (correct) and in r1.1.0rc0 (incorrect). Notice that this is a query ordered by 2, 3, 4, 1:
set schema trafodion.
prepare xx from
select [first 200]
i_item_id,
ca_country,
ca_state,
ca_county,
......
order by 2,3,4,1;
since 2 (ca_country) is the same value ‘United Stats’ for the first 200 rows, the query essentially should see the rows in the order of 3 (ca_state). In the v0407 execution output, the rows are ordered correctly with ‘IN’ followed by ‘CO’. But in the v1.1.0rc0 execution output, ‘CO’ and ‘IN’ started to mixed together after certain rows.
-------
Here is the script to reproduce it. (It requires the QA g_tpch1x tables.)
set schema trafodion.
prepare xx from
select [first 200]
i_item_id,
ca_country,
ca_state,
ca_county,
agg1,
agg2,
agg3,
agg4,
agg5,
agg6,
agg7
from
(
select i_item_id,
ca_country,
ca_state,
ca_county,
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.
from catalog_sales, customer_
customer_
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','
group by 1,2,3,4
union all
select i_item_id,
cast(null as char),
ca_state,
ca_county,
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.
from catalog_sales, customer_
customer_
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','
group by 1,2,3,4
union all
select i_item_id,
cast(null as char),
cast(null as char),
ca_county,
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.
from catalog_sales, customer_
customer_
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','
group by 1,2,3,4
union all
select i_item_id,
cast(null as char),
cast(null as char),
cast(null as char),
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.
from catalog_sales, customer_
customer_
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','
group by 1,2,3,4
union all
select cast(null as char),
cast(null as char),
cast(null as char),
cast(null as char),
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.
from catalog_sales, customer_
customer_
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','
group by 1,2,3,4
) v1
order by 2,3,4,1;
explain options 'f' xx;
execute xx;
-------
Launchpad prevents adding long text in the description here. See attached file output.txt for the incorrect result from the r1.1.0rc0 (v0410) build and the correct result from the v0407 build.
Changed in trafodion: | |
assignee: | nobody → James Capps (james-capps) |
Changed in trafodion: | |
status: | New → In Progress |
assignee: | James Capps (james-capps) → Justin Du (justin-du-2) |
This problem could be another symptom of LP bugs 1442932 and 1442966. I tried the fix provided by Hans for those two on my workstation and it seems fixed this problem too.