Thursday, November 14, 2013

Explain Tables, Data Studio, And Column Organize

There is a bit of nasty surprise when a database is created with default table organization as 'COLUMN'. If the EXPLAIN.DDL are executed with the default organization set to COLUMN, it will cause error during the explain phase in Data Studio.

The workaround: change the default to ROW, then (re)create the explain tables.


Hierarchical Query Processing and Access Method


In its simplest form, DB2 performs hierarchical query using a recursive CTE. The way to spot it is look for the UNION operator in the execution plan. The one shown here is a typical query plan.






Note the step labelled (7): This is where the main recursion happens. Once you know it is doing a NLJOIN from a TEMP ( step 5 resultset from elsewhere), you know that it has lost sight of the volume involved. But that is not the important bit.

Rows from this TEMP is used to probe the matching rows from the 'main' table on the right hand side of the operator. If there is an efficient access method ( eg, IXSCAN), this can be a very fast operation.

What if there is no such access method? What if there is no index?

You may ask what is the odd of that happening. Let's see ...what if the query is running not on a single table, but on a resultset from some tables joins? If that is the case, the resultset would have no precise access to its rows anymore. So, instead of having a NLJOIN into some rows using IXSCAN, you have an NLJOIN into rows using TBSCAN. Yes, for each row, coming from TEMP, it probes the entire table on the other side of the operator. We all know by now, that this is a performance killer.

For comparison: using a table containing 200,000 rows with proper index, the query completes in less than 1 second. using the same table *without* an index, it completes in 10 minutes.

Below is how it looks like when there is no index:



So, the next time you write a hierarchical query, keep this in mind.




Tuesday, November 12, 2013

db2caem And Optimization Guidelines Are Not Friends

Any attempt to run db2caem on a query with optimization guideline will be greeted with this message:

DBT7028E  "db2caem": The db2caem command failed because the DB2 database manager encountered an error while processing the specified input parameters and the specified SQL statement. Handle type: "3". Return code: "-1:2162".
           CLIInfo: SQLstate: "42601"
                    Return code: "-104"
                    Diagnostic message: "[IBM][CLI Driver][DB2/LINUXX8664] SQL0104N  An unexpected token "'SELECT xxxxxxxx" was found following "WHERE STMT_TEXT LIKE".  Expected tokens may include:  "LIKE <pattern>".  SQLSTATE=42601
"

Too bad. 

Saturday, November 9, 2013

db2top 'explain plan' - useful but unreliable

db2top is a handy tool to monitor the current database activity. One of my favourite features is the ability to 'explain' the statement present in the Dynamic query monitor screen. There is a big catcha though: it is merely showing the 'explain plan' at the time you choose to see the plan, it may or may *not* be the actual plan being used during the actual execution of the query.

Yup, it is as good as you manually running 'explain plan for <query>' without running the query.

Beware!