Tuesday, March 26, 2013

DB2 Subquery Removal Test 1


Does DB2 do subquery removal? Let's do a test and find out:

We will use a simplified TPC-H query, and faking 100000 rows for both tables:

Original Statement:
------------------
SELECT
  SUM(l_extendedprice) / 7.0 AS avg_yearly
FROM
  lineitem,
  part
WHERE
  p_partkey = l_partkey AND
  p_brand = '[BRAND]' AND
  p_container = '[CONTAINER]' AND
  l_quantity <
  (SELECT
     0.2 * AVG(l_quantity)
   FROM
     lineitem
   WHERE
     l_partkey = p_partkey
  )



The outcome? See the optimized statement:

Optimized Statement:
-------------------
SELECT
  (Q7.$C0 / 7.0) AS "AVG_YEARLY"
FROM
  (SELECT
     SUM(DECFLOAT(Q6.L_EXTENDEDPRICE, 34))
   FROM
     (SELECT
        Q5.L_EXTENDEDPRICE
      FROM
        (SELECT
           SUM(DECFLOAT(Q2.L_QUANTITY, 34)),
           COUNT(DECFLOAT(Q2.L_QUANTITY, 34))
         FROM
           (SELECT
              Q1.L_QUANTITY
            FROM
             LINEITEM AS Q1
            WHERE
              (Q1.L_PARTKEY = Q4.P_PARTKEY)
           ) AS Q2
        ) AS Q3,
        PART AS Q4,
        LINEITEM AS Q5
      WHERE
        (Q5.L_QUANTITY < (0.2 * (Q3.$C0 / Q3.$C1))) AND
        (Q4.P_CONTAINER = '[CONTAINER]') AND
        (Q4.P_BRAND = '[BRAND]') AND
        (Q4.P_PARTKEY = Q5.L_PARTKEY)
     ) AS Q6
  ) AS Q7




Notice that the subquery has vanished, but LINEITEM Q1 is still being used as a lateral view.
Can it do better than this? Can it reduce the access to LINEITEM from 2 to 1?

More on this later ....




Wednesday, March 6, 2013

"<> ALL" Multi-column Problem

For single column, everything is fine, but when you try multi-column set comparison, 'it' gets confused.

Let's start with a single column comparison:


db2 => select * from no_null where c1 <> all ( select c1 from some_null) ;

C1                       C2                    
------------------------ ------------------------

  0 record(s) selected.


So far so good.

Now, let's try two columns, c1 and c2:



db2 => select * from no_null where (c1, c2) <>ALL ( select c1, c2 from some_null) ;

SQL0104N  An unexpected token "(" was found following "".  Expected tokens may
include:  "".  SQLSTATE=42601


Fun eh?  ;)