Thursday, November 14, 2013

Explain Tables, Data Studio, And Column Organize

There is a bit of a nasty surprise when a database is created with the default table organization set to 'COLUMN'. If EXPLAIN.DDL is executed while the default organization is COLUMN, the explain phase in Data Studio will fail with an error.

The workaround: change the default to ROW, then (re)create the explain tables.
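
A hedged sketch of the workaround (the database name is made up; EXPLAIN.DDL ships under sqllib/misc):

db2 UPDATE DB CFG FOR mydb USING DFT_TABLE_ORG ROW
db2 CONNECT TO mydb
db2 -tf ~/sqllib/misc/EXPLAIN.DDL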


Hierarchical Query Processing and Access Method


In its simplest form, DB2 processes a hierarchical query using a recursive CTE. The way to spot it is to look for the UNION operator in the execution plan. The one shown here is a typical query plan.
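
For reference, the general shape of such a query (a minimal sketch against a hypothetical EMP(ID, MGR_ID, NAME) table, not the exact statement behind the plan below):

-- The UNION ALL between the anchor and the recursive step is what
-- surfaces as the UNION operator in the plan.
WITH hier (id, name, lvl) AS (
  SELECT id, name, 1
  FROM emp
  WHERE mgr_id IS NULL               -- anchor member: the root
  UNION ALL
  SELECT e.id, e.name, h.lvl + 1     -- recursive member: next level down
  FROM hier h, emp e
  WHERE e.mgr_id = h.id
)
SELECT id, name, lvl FROM hier;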

[screenshot: the execution plan of the recursive query, with the recursion at step (7)]

Note the step labelled (7): this is where the main recursion happens. Once you see it doing an NLJOIN from a TEMP (the step (5) result set, fed from elsewhere), you know the optimizer has lost sight of the data volume involved. But that is not the important bit.

Rows from this TEMP are used to probe for matching rows in the 'main' table on the right-hand side of the operator. If there is an efficient access method (e.g., IXSCAN), this can be a very fast operation.

What if there is no such access method? What if there is no index?

You may ask, what are the odds of that happening? Let's see ... what if the query runs not against a single table, but against the result set of a join of several tables? In that case, there is no longer any precise access path into the rows of that result set. So, instead of an NLJOIN probing rows via IXSCAN, you have an NLJOIN probing rows via TBSCAN. Yes, for each row coming from the TEMP, it scans the entire table on the other side of the operator. We all know by now that this is a performance killer.

For comparison: using a table containing 200,000 rows with a proper index, the query completes in less than 1 second. Using the same table *without* an index, it completes in 10 minutes.

Below is how it looks when there is no index:

[screenshot: the execution plan without an index, with TBSCAN on the inner side of the NLJOIN]

So, the next time you write a hierarchical query, keep this in mind.
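
When the inner side is a base table you control, the hedged fix is the obvious one (table, column, and schema names are made up):

-- Give the inner side of the NLJOIN something precise to probe:
CREATE INDEX ix_emp_mgr ON emp (mgr_id);
RUNSTATS ON TABLE admin.emp AND INDEXES ALL;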




Tuesday, November 12, 2013

db2caem And Optimization Guidelines Are Not Friends

Any attempt to run db2caem on a query with an optimization guideline will be greeted with this message:

DBT7028E  "db2caem": The db2caem command failed because the DB2 database manager encountered an error while processing the specified input parameters and the specified SQL statement. Handle type: "3". Return code: "-1:2162".
           CLIInfo: SQLstate: "42601"
                    Return code: "-104"
                    Diagnostic message: "[IBM][CLI Driver][DB2/LINUXX8664] SQL0104N  An unexpected token "'SELECT xxxxxxxx" was found following "WHERE STMT_TEXT LIKE".  Expected tokens may include:  "LIKE <pattern>".  SQLSTATE=42601
"

Too bad. 
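
For the record, the failing shape is just the normal invocation with the guideline embedded in the statement (a hypothetical sketch; judging from the SQL0104N text, the quoting seems to break the STMT_TEXT LIKE predicate that db2caem builds internally):

db2caem -d mydb -st "SELECT * FROM t1 /*<OPTGUIDELINES><IXSCAN TABLE='T1'/></OPTGUIDELINES>*/"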

Saturday, November 9, 2013

db2top 'explain plan' - useful but unreliable

db2top is a handy tool for monitoring current database activity. One of my favourite features is the ability to 'explain' a statement shown in the Dynamic query monitor screen. There is a big catch though: it merely shows the 'explain plan' generated at the time you choose to view it, which may or may *not* be the actual plan used during the actual execution of the query.

Yup, it is as good as manually running 'explain plan for <query>' without running the query.
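
That is, something like this (a sketch; the database name is made up):

db2 "EXPLAIN PLAN FOR SELECT * FROM t1 WHERE c1 = 10"
db2exfmt -d mydb -1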

Beware!



Monday, October 28, 2013

BLU Acceleration and Oracle Customers

Oct 2013 - current fix pack = FP2. As IBM is trying to win over Oracle's customers with DB2, it is unfortunate that BLU Acceleration is not *yet* available under this condition:

===> Turning on DB2_COMPATIBILITY_VECTOR=ORA would *DISABLE* the ORGANIZE BY COLUMN feature.
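
A sketch of the conflict (the instance recycle is my assumption for the registry change to take effect):

db2set DB2_COMPATIBILITY_VECTOR=ORA
db2stop
db2start
-- From here on, CREATE TABLE ... ORGANIZE BY COLUMN is disabled.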

Good luck to those trying to sell to Oracle customers with BLU Acceleration powerpoint slides :)

Monday, October 14, 2013

Organize by Column - Index Scan in v10.5 FixPack2

DB2 v10.5 Fix Pack 2 has just been released. One bit of the changes described in the fix pack summary got my attention:

"The performance of a select, update, or delete operation that affects only one row in a column-organized table can be improved if the table has unique indexes, because the query optimizer can now use an index scan instead of a full table scan. "

As I wrote in the "Organize By Column" posts about 2 months ago, I thought it was unfortunate that DB2 was unable to use the primary key index to locate a single row. Naturally, what this fix pack promises to deliver is encouraging.

Unfortunately, after applying this fix pack and redoing the test, I am still not able to get an index scan for a ridiculously simple query like "SELECT C1 FROM T1 WHERE C1 = 10", where C1 is the primary key and T1 holds 1 million rows.
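
For reference, the retest (a sketch; the same table as in the earlier posts):

CREATE TABLE t1 (c1 BIGINT NOT NULL PRIMARY KEY) ORGANIZE BY COLUMN;
-- ... load ~1 million rows ...
SELECT c1 FROM t1 WHERE c1 = 10;
-- The plan still shows TBSCAN on CO-TABLE: T1, not an index scan.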

More bugs? Installation problem? Blah ...

Monday, September 9, 2013

ORGANIZE BY COLUMN - Part 2

In an earlier post, I demonstrated that DB2 would do a TABLE SCAN for a simple equality predicate on a primary key column. While that looks bad ( ya, looks REALLY BAD), something else is taking place under the hood. In most cases, the TABLE SCAN is not what we think it is.

While a TABLE SCAN is understood to scan the entire table, that is not really necessary if certain conditions are met on an ORGANIZE BY COLUMN table. If the data type is datetime/boolean/numeric, the storage engine is able to perform DATA SKIPPING, which avoids scanning data ranges that *cannot qualify* for the predicate. Pages whose value ranges cannot possibly satisfy the predicate are skipped entirely; the pages that are picked up may contain both qualifying and non-qualifying data, depending on the distribution. As a result, the TABLE SCAN picks up far fewer pages than a full scan would, and the entire table is not processed.

Note that as of this writing (9-Sep-2013), varchar/char columns do not allow DATA SKIPPING, unless the column is an (enforced) primary key or unique key.

Non-enforced uniqueness does not allow DATA SKIPPING either.
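
For the curious: data skipping is driven by the synopsis table that BLU maintains behind the scenes. A hedged way to see that one exists for your table:

SELECT tabname
FROM syscat.tables
WHERE tabschema = 'SYSIBM' AND tabname LIKE 'SYN%';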

There you go.



Thursday, August 22, 2013

REPEATABLE READ: DB2 vs SQL Server

Just a short post to highlight the difference in implementation of a similar-sounding isolation level, namely REPEATABLE READ (RR), between DB2 and SQL Server.

In SQL Server, RR allows phantom reads, whereas DB2's RR does not.
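
A hedged sketch of the DB2 side (table and predicate are made up):

SET CURRENT ISOLATION = RR;
SELECT COUNT(*) FROM t1 WHERE c1 > 10;
-- Another session trying to insert a row with c1 = 99 will block;
-- re-running the query in the same unit of work returns the same count.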

Just because they have the same term, it does not *guarantee* they are the same ;)

As the saying goes:
*Assumption is the mother of all screw-ups*

Saturday, July 6, 2013

DB2 10.5 : Is ORGANIZE BY COLUMN a magic cure for all?

Unless your system is 100% pure OLAP, the quick answer to this is a big 'NO'.

This short demo tells it all:

CREATE TABLE T1 ( C1 BIGINT NOT NULL PRIMARY KEY) ORGANIZE BY COLUMN;

So, we have a single-column table built with the COLUMN-organized method. Since C1 is a primary key, an index exists on that column. So, do we get the 'best of both worlds'? Let's take a look.
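
In case you want to follow along, a quick way to generate the rows (a sketch; the catalog-view cross join is just a convenient row source):

INSERT INTO t1 (c1)
  SELECT ROW_NUMBER() OVER ()
  FROM syscat.columns a, syscat.columns b
  FETCH FIRST 100000 ROWS ONLY;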

With about 100 thousand rows in place, issue this simple query:

SELECT * FROM T1 WHERE C1 = 3;

We would 'expect' the engine to use the primary key index to locate this value. Let's see the execution plan:

Access Plan:
-----------
Total Cost: 440.126
Query Degree: 2

        Rows
       RETURN
       (   1)
        Cost
         I/O
         |
          1
       CTQ  
       (   2)
       440.126
         480
         |
          1
       TBSCAN
       (   3)
       440.125
         480
         |
       195040
 CO-TABLE: ADMIN  
         T1
         Q1



One thing that stands out is that the cost is over 400 for a single-row search. Not good.

So, in a system where we anticipate a mixed workload, a COLUMN-organized table could potentially flood your buffer cache when there are plenty of simple singleton queries. Of course, if you have PERFECT foresight and know 100% for sure that a certain table will always see an OLTP access pattern and another always OLAP, then choosing between ROW- and COLUMN-organized tables is a breeze.
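
For those who do have that foresight, the choice is made per table (a sketch; the tables and columns are made up):

CREATE TABLE orders_hot (id BIGINT NOT NULL PRIMARY KEY, qty INT) ORGANIZE BY ROW;
CREATE TABLE sales_fact (sale_date DATE, amt DECIMAL(12,2)) ORGANIZE BY COLUMN;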

I'm just curious, who has that ability? ;)

Having said that, I do see this feature lends itself naturally to a pure data warehouse system. For the rest of us, stay alert!









Saturday, May 25, 2013

Killer Disjunct

As of this writing (LUW V10.1 or older), there is a class of query that does not play nice with the DB2 optimizer. It takes the form of a disjunctive join predicate such as the one below:

SELECT 1 FROM
T1  JOIN T2
ON ( C1 = D1 OR C2 = D2)

Currently, the optimizer can only come up with a plan that uses an NLJOIN, *regardless of the data volume*. Imagine a scenario where you have a couple of million rows in T1 and T2 and no suitable covering index: you will have plenty of free time on your hands to do other things, such as taking a shower and going for a swim, because the query will take a very long time to complete.

Is there a workaround? Fortunately, yes. Just take a look at how Oracle handles this sort of query.
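
Oracle expands the OR into separate branches (OR-expansion); the manual equivalent in DB2 is a sketch like this, with the second branch excluding rows the first already produced (the extra IS NULL terms keep the semantics honest when C1/D1 are nullable):

SELECT 1 FROM t1 JOIN t2 ON c1 = d1
UNION ALL
SELECT 1 FROM t1 JOIN t2 ON c2 = d2
                        AND (c1 <> d1 OR c1 IS NULL OR d1 IS NULL);

Each branch is a plain conjunctive join, so the optimizer is free to pick a hash join or an indexed NLJOIN for it.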

I shall demo a fuller example of this in a future post.



The Evil of Prefetch

In many implementations, much emphasis is placed on some kind of read-ahead to boost performance. But used blindly, it can have an adverse effect.

Cache maintenance is not free, and cache space is definitely not unlimited. If the database is allowed to prefetch too aggressively, you might end up flooding the cache with useless data pages that will be flushed out anyway.
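
Prefetching is tunable per table space (a sketch):

ALTER TABLESPACE userspace1 PREFETCHSIZE 32;        -- explicit and modest
ALTER TABLESPACE userspace1 PREFETCHSIZE AUTOMATIC; -- or let DB2 decide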

Remember that the key to good performance is to let the system do as little work as possible.

Extent and RAID Stripes Alignment

As usual, the interplay of RAID strip size/RAID stripe size/filesystem block size/extent size/tablespace page size can be a complex matter. There is a tiny titbit on the RAID/extent relationship in the official documentation that largely goes unnoticed. I will highlight the interesting bit here:

Quote:"Do not set the DB2_USE_PAGE_CONTAINER_TAG registry variable. As described earlier, you should create a table space with an extent size that is equal to, or a multiple of, the RAID stripe size. However, when you set DB2_USE_PAGE_CONTAINER_TAG to ON, a one-page container tag is used, and the extents do not line up with the RAID stripes. "

Link to the documentation:
http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.dbobj.doc%2Fdoc%2Fc0052067.html
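
To make the rule concrete, a hedged sketch: with a 256 KB RAID stripe and a 16 KB page, an extent of 16 pages lines up exactly with one stripe (a buffer pool BP16K with a 16K page size is assumed to exist; paths and sizes are made up):

CREATE TABLESPACE ts_aligned
  PAGESIZE 16K
  MANAGED BY DATABASE USING (FILE '/data/ts_aligned_01' 10G)
  EXTENTSIZE 16
  BUFFERPOOL bp16k;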



File System Caching Info

Other than using the OS mount option, file system caching can be disabled/enabled at tablespace creation time. As important as this option is, it is not shown in the standard 'list tablespaces' output.

To get this info, you will have to resort to this:

GET SNAPSHOT FOR TABLESPACES ON <db name>

A sample output is shown below. Note the line with 'File system caching'; for this example, it is 'No' for USERSPACE1:


Tablespace name                            = USERSPACE1
  Tablespace ID                            = 2
  Tablespace Type                          = Database managed space
  Tablespace Content Type                  = All permanent data. Large table space.
  Tablespace Page size (bytes)             = 4096
  Tablespace Extent size (pages)           = 32
  Automatic Prefetch size enabled          = Yes
  Buffer pool ID currently in use          = 1
  Buffer pool ID next startup              = 1
  Using automatic storage                  = Yes
  Auto-resize enabled                      = Yes
  File system caching                      = No
  Tablespace State                         = 0x'00000000'
   Detailed explanation:
     Normal
  Tablespace Prefetch size (pages)         = 32


For tablespaces created in v9.7 and later, NO FILE SYSTEM CACHING is the default. Before deciding whether caching is good for your system or not, make sure you understand the characteristics of the database (the deciding factor would involve the presence of inlined/non-inlined LOBs).
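
On 9.7 and later, a lighter-weight alternative to the snapshot is the MON_GET_TABLESPACE table function, which exposes the same flag (a sketch):

SELECT VARCHAR(tbsp_name, 20) AS tbsp_name, fs_caching
FROM TABLE(MON_GET_TABLESPACE(NULL, -2)) AS t;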





Wednesday, May 22, 2013

A 'Better' Execution Plan Graph

One crucial number missing from the standard plan generated by 'EXPLAIN PLAN' is the actual number of rows flowing through each node of the graph. Without it, it is close to impossible to judge the efficiency of the plan: how do you decide whether the optimizer has underestimated or overestimated a row source at any point of the execution?

A pretty neat feature, which first appeared in v9.7 and is unfortunately still largely unknown in v10.1, is 'section actuals'. It can be enabled at the database level via the SECTION_ACTUALS configuration parameter, or for an individual application using WLM_SET_CONN_ENV.

You will also need to capture the stats using an activity event monitor, e.g. ALTER WORKLOAD WL1 COLLECT ACTIVITY DATA WITH DETAILS,SECTION, and so on ...
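
Roughly, the full-blown path looks like this (a sketch; monitor and workload names are per your setup):

UPDATE DB CFG USING SECTION_ACTUALS BASE;
CREATE EVENT MONITOR actmon FOR ACTIVITIES WRITE TO TABLE;
SET EVENT MONITOR actmon STATE 1;
ALTER WORKLOAD wl1 COLLECT ACTIVITY DATA WITH DETAILS,SECTION;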

But assuming all you want is a simple test of a certain query, there is an easier way. Hidden somewhere in the sea of binaries is a utility called 'db2caem'. It's really easy to use.

Let's illustrate this with a simple example. Assuming you want to test the query 'select count(*) from t1', all you need to do is issue this:

db2caem -d db2test -st "select count(*) from t1"


      ____________________________________________________________________

                      _____     D B 2 C A E M     _____

                DB2 Capture Activity Event Monitor data tool
                              I      B      M


          The DB2CAEM Tool is a utility for capturing the activity event
         monitor data with details, section and values, as well as actuals.
          which could be used when analyze the SQL statement performance.
      ____________________________________________________________________


________________________________________________________________________________

DBT7041I  "db2caem": The db2caem utility is connecting to the following database: "db2test".
DBT7038I  "db2caem": The db2caem utility successfully connected to the following database: "db2test".
DBT7042I  "db2caem": The SQL statement "select count(*) from t1" is being issued.
DBT7043I  "db2caem": The db2caem utility is disconnecting from the following database: "db2test".
DBT7039I  "db2caem": The db2caem utility successfully disconnected from the following database: "db2test".
DBT7000I  db2caem completed. Output path: "C:\\DB2CAEM_2013-05-22-16.58.09.762000".



That's pretty much it. The execution plan you want is in the directory shown on the last line: "C:\\DB2CAEM_2013-05-22-16.58.09.762000".


Look for the file named 'db2caem.exfmt' inside, and you will see something like below:


-------------------------------
< some stuff removed for clarity>


Original Statement:
------------------
select
  count(*)
from
  t1


Optimized Statement:
-------------------
SELECT
  Q3.$C0
FROM
  (SELECT
     COUNT(*)
   FROM
     (SELECT
        $RID$
      FROM
        XXXXXX.T1 AS Q1
     ) AS Q2
  ) AS Q3

Explain level:    Explain from section

Access Plan:
-----------
Total Cost: 78.1959
Query Degree: 1

      Rows
   Rows Actual
     RETURN
     (   1)
      Cost
       I/O
       |
        1
        1
     GRPBY
     (   2)
     78.1957
       NA
       |
      10001
      10001
     IXSCAN
     (   3)
     77.8301
       NA
       |
       -1
       NA
 INDEX: XXXXXXX
      T1X2
       Q1

<more stuff removed> ...


Note that there are now *two* numbers per operator: the upper one is the standard estimated row count, and the one immediately below it is the *actual* row count.

Easy eh? :)

While this is all good, the interpretation of the numbers can unfortunately be very confusing for certain execution plans. I shall leave that for another post ...

Thursday, April 11, 2013

Optimization Guidelines - The Easy way

To use this, there is one very important step, which is:

 ===> set the registry DB2_OPTPROFILE to YES.

Interestingly, this registry variable is not visible in the 'db2set -lr' output.
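
Setting it is the usual db2set dance (a sketch; the instance recycle is my own habit with registry variables):

db2set DB2_OPTPROFILE=YES
db2stop
db2start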

Once this is done, you are good to go. All the available 'guidelines' are described in a section of the online documentation called 'Current Optimization Schema', which is in XML format. For the example here, I'm going to use a guideline the 'easy way'.

Let's say we have tables M1 and M2, both with columns C1 and C2. We want to join them on M1.C1 = M2.C1 using an NL join. Let's jump straight into the usage:


SELECT <something>
from
M2 M2 JOIN M1 M1
ON ( M1.C1 = M2.C1)
/*<OPTGUIDELINES>
<NLJOIN>
<IXSCAN TABLE="M1"/>
<TBSCAN TABLE="M2"/>
</NLJOIN>
</OPTGUIDELINES>*/


Note that you start the guideline at the *end* of your statement with /*<OPTGUIDELINES>, and end it with </OPTGUIDELINES>*/. Anything between these tags is your guideline to the database. In this example, I instructed the database to use an NL join between tables M1 and M2, with an index scan (IXSCAN) on M1 and a table scan (TBSCAN) on M2.

Personally, I use it to explore WHAT-IF scenarios: what happens if the execution plan changes in a certain manner? This shortcut way of specifying guidelines makes such testing very straightforward.
















Tuesday, April 9, 2013

Datatype Space Usage: SMALLINT, BIGINT, etc

When a datatype (such as BIGINT) is chosen for a column, is space allocated based on the 'value', or is it a fixed allocation regardless of the value?

This is what I mean:

If a value of '5' is inserted into the table, does it consume the same amount of space as the value '9876543'? Or does a '5' occupy less space than '9876543'?

Let's find out. First, we create a single-column table of type BIGINT. Then we insert roughly 150000 rows of value '1' into it.
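
In CLP terms, something like this (a sketch; schema and table names are made up, and the catalog cross join is just a convenient row source):

CREATE TABLE bigint_t (c1 BIGINT);
INSERT INTO bigint_t (c1)
  SELECT 1 FROM syscat.columns a, syscat.columns b
  FETCH FIRST 150000 ROWS ONLY;
RUNSTATS ON TABLE myschema.bigint_t;
SELECT SUBSTR(tabname, 1, 15) AS tab, npages
FROM syscat.tables WHERE tabname = 'BIGINT_T';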

This is the space used:

TAB             NPAGES
--------------- --------------------
BIGINT                           771


Let's repeat this test using value '99999999' instead. This is the space used:


TAB             NPAGES
--------------- --------------------
BIGINT                           771


That suggests there is no difference. 

That also means that if you know the range of values for the column (for number types, that is), you should pick the smallest possible datatype that fits the range.

Below is a table showing the range for each numerical type:


Limit                                               Value
Smallest SMALLINT value                             -32 768
Largest SMALLINT value                              +32 767
Smallest INTEGER value                              -2 147 483 648
Largest INTEGER value                               +2 147 483 647
Smallest BIGINT value                               -9 223 372 036 854 775 808
Largest BIGINT value                                +9 223 372 036 854 775 807
Largest decimal precision                           31
Maximum exponent (Emax) for REAL values             38
Smallest REAL value                                 -3.402E+38
Largest REAL value                                  +3.402E+38
Minimum exponent (Emin) for REAL values             -37
Smallest positive REAL value                        +1.175E-37
Largest negative REAL value                         -1.175E-37
Maximum exponent (Emax) for DOUBLE values           308
Smallest DOUBLE value                               -1.79769E+308
Largest DOUBLE value                                +1.79769E+308
Minimum exponent (Emin) for DOUBLE values           -307
Smallest positive DOUBLE value                      +2.225E-307
Largest negative DOUBLE value                       -2.225E-307
Maximum exponent (Emax) for DECFLOAT(16) values     384
Smallest DECFLOAT(16) value                         -9.999999999999999E+384
Largest DECFLOAT(16) value                          9.999999999999999E+384
Minimum exponent (Emin) for DECFLOAT(16) values     -383
Smallest positive DECFLOAT(16) value                1.000000000000000E-383
Largest negative DECFLOAT(16) value                 -1.000000000000000E-383
Maximum exponent (Emax) for DECFLOAT(34) values     6144
Smallest DECFLOAT(34) value                         -9.999999999999999999999999999999999E+6144
Largest DECFLOAT(34) value                          9.999999999999999999999999999999999E+6144
Minimum exponent (Emin) for DECFLOAT(34) values     -6143
Smallest positive DECFLOAT(34) value                1.000000000000000000000000000000000E-6143
Largest negative DECFLOAT(34) value                 -1.000000000000000000000000000000000E-6143

Tuesday, March 26, 2013

DB2 Subquery Removal Test 1


Does DB2 do subquery removal? Let's do a test and find out:

We will use a simplified TPC-H query, faking statistics of 100,000 rows for both tables.
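
One way to fake the row counts is to update the SYSSTAT views directly (a sketch; the schema name is assumed):

UPDATE sysstat.tables SET card = 100000
WHERE tabschema = 'ADMIN' AND tabname IN ('LINEITEM', 'PART');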

Original Statement:
------------------
SELECT
  SUM(l_extendedprice) / 7.0 AS avg_yearly
FROM
  lineitem,
  part
WHERE
  p_partkey = l_partkey AND
  p_brand = '[BRAND]' AND
  p_container = '[CONTAINER]' AND
  l_quantity <
  (SELECT
     0.2 * AVG(l_quantity)
   FROM
     lineitem
   WHERE
     l_partkey = p_partkey
  )



The outcome? See the optimized statement:

Optimized Statement:
-------------------
SELECT
  (Q7.$C0 / 7.0) AS "AVG_YEARLY"
FROM
  (SELECT
     SUM(DECFLOAT(Q6.L_EXTENDEDPRICE, 34))
   FROM
     (SELECT
        Q5.L_EXTENDEDPRICE
      FROM
        (SELECT
           SUM(DECFLOAT(Q2.L_QUANTITY, 34)),
           COUNT(DECFLOAT(Q2.L_QUANTITY, 34))
         FROM
           (SELECT
              Q1.L_QUANTITY
            FROM
             LINEITEM AS Q1
            WHERE
              (Q1.L_PARTKEY = Q4.P_PARTKEY)
           ) AS Q2
        ) AS Q3,
        PART AS Q4,
        LINEITEM AS Q5
      WHERE
        (Q5.L_QUANTITY < (0.2 * (Q3.$C0 / Q3.$C1))) AND
        (Q4.P_CONTAINER = '[CONTAINER]') AND
        (Q4.P_BRAND = '[BRAND]') AND
        (Q4.P_PARTKEY = Q5.L_PARTKEY)
     ) AS Q6
  ) AS Q7




Notice that the subquery has vanished, but LINEITEM (Q1) is still accessed as a lateral view.
Can it do better than this? Can it reduce the number of accesses to LINEITEM from 2 to 1?

More on this later ....




Wednesday, March 6, 2013

"<> ALL" Multi-column Problem

For a single column, everything is fine, but when you try a multi-column set comparison, 'it' gets confused.

Let's start with a single column comparison:


db2 => select * from no_null where c1 <> all ( select c1 from some_null) ;

C1                       C2                    
------------------------ ------------------------

  0 record(s) selected.


So far so good.

Now, let's try two columns, c1 and c2:



db2 => select * from no_null where (c1, c2) <>ALL ( select c1, c2 from some_null) ;

SQL0104N  An unexpected token "(" was found following "".  Expected tokens may
include:  "".  SQLSTATE=42601


Fun eh?  ;)
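
A hedged workaround that preserves the NULL semantics of <> ALL: the outer row qualifies only if no inner row *could* match on both columns.

SELECT *
FROM no_null n
WHERE NOT EXISTS (
  SELECT 1 FROM some_null s
  WHERE (n.c1 = s.c1 OR n.c1 IS NULL OR s.c1 IS NULL)
    AND (n.c2 = s.c2 OR n.c2 IS NULL OR s.c2 IS NULL)
);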




Wednesday, February 13, 2013

VARCHAR, Trailing Spaces, and Unique Constraint


Another look at VARCHAR. This time, the implications of trailing space(s) for unique constraints.

Let's start from a table with 3 rows:

db2 => select c1, length(c1) from smallt ;

C1   2
---- -----------
B              1
B              2
B              3

  3 record(s) selected.


As you can see, these three rows are 'unique' in the sense that they have different lengths. In Oracle, they would be three distinct rows with all the 'expected' properties.

Can we create a unique index on it?

Let's try:


db2 => create unique index uqi on smallt (c1 );
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL0603N  A unique index cannot be created because the table contains data
that would result in duplicate index entries.  SQLSTATE=23515


NOPE!

Another example of the 'uselessness' of trailing spaces in VARCHAR.

VARCHAR and Equality Comparison

This is a follow-up on the previous post.

Now that we know trailing space(s) are 'not that important' as far as DB2 is concerned, what would you expect from a simple '=' comparison?

Let's do a little test:



db2 => create table smallt (c1 varchar(4)) ;
DB20000I  The SQL command completed successfully.

db2 => insert into smallt values 'B', 'B ', 'B  ';
DB20000I  The SQL command completed successfully.

db2 => select length(c1), c1 from smallt ;

1           C1
----------- ----
          1 B
          2 B
          3 B

  3 record(s) selected.


We have three rows with different 'lengths' due to the trailing space(s) we inserted. If we compare this column to itself, i.e., c1 = c1 in a self-join, how many rows would you expect?

Three?

Let's see ...



db2 => select * from smallt t1, smallt t2 where t1.c1 = t2.c1 ;

C1   C1
---- ----
B    B
B    B
B    B
B    B
B    B
B    B
B    B
B    B
B    B

  9 record(s) selected.


Nine. For equality comparison, DB2 blank-pads the shorter value, so all three values compare equal to one another: 3 x 3 = 9 rows. Need I say more? ;)


Varchar: When a limit is unlimited

How much do you know about 'varchar' type in DB2?

I thought I knew, until I ran a little test and got a nasty surprise.

Here's a simple test case to illustrate:


db2 => create table smallt ( c1 varchar(1));
DB20000I  The SQL command completed successfully.

db2 => insert into smallt values('B             ') ;
DB20000I  The SQL command completed successfully.


db2 => select * from smallt ;

C1
--
B

  1 record(s) selected.



See the 'problem'? The inserted value is 14 characters long and the column is VARCHAR(1), yet the INSERT succeeds: the trailing blanks that do not fit are silently truncated rather than raising an error.


This behaviour can be very disturbing for those used to Oracle. Another potential 'hidden' bug for Oracle developers doing DB2.

Wednesday, January 16, 2013

BLEVEL: How High Can You Go?

While investigating index splits, I encountered a fairly 'mind-numbing' number. But first, here's the basic setup, on a database with a default page size of 4K:

CREATE TABLE LONGT (V1 CHAR(254)) ;
CREATE INDEX IDX_LT ON LONGT ( V1);

And populate this table with 50,000 rows of ascending 'numbers', starting with 1.

Guess what BLEVEL you get?

Drumroll please ... it is ... 5!

I'm beginning to wonder if this BLEVEL has the same meaning as the BLEVEL I have been exposed to ... ;)
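
For the record, DB2's catalog name for this is NLEVELS, and a hedged point of difference: NLEVELS counts all index levels including the leaf level, while Oracle's BLEVEL counts branch levels only, which may account for part of the surprise. A sketch to check it (schema name assumed):

RUNSTATS ON TABLE myschema.longt AND INDEXES ALL;
SELECT indname, nlevels FROM syscat.indexes WHERE tabname = 'LONGT';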