Thursday, November 14, 2013

Hierarchical Query Processing and Access Method


In its simplest form, DB2 performs hierarchical query using a recursive CTE. The way to spot it is look for the UNION operator in the execution plan. The one shown here is a typical query plan.






Note the step labelled (7): This is where the main recursion happens. Once you know it is doing a NLJOIN from a TEMP ( step 5 resultset from elsewhere), you know that it has lost sight of the volume involved. But that is not the important bit.

Rows from this TEMP is used to probe the matching rows from the 'main' table on the right hand side of the operator. If there is an efficient access method ( eg, IXSCAN), this can be a very fast operation.

What if there is no such access method? What if there is no index?

You may ask what is the odd of that happening. Let's see ...what if the query is running not on a single table, but on a resultset from some tables joins? If that is the case, the resultset would have no precise access to its rows anymore. So, instead of having a NLJOIN into some rows using IXSCAN, you have an NLJOIN into rows using TBSCAN. Yes, for each row, coming from TEMP, it probes the entire table on the other side of the operator. We all know by now, that this is a performance killer.

For comparison: using a table containing 200,000 rows with proper index, the query completes in less than 1 second. using the same table *without* an index, it completes in 10 minutes.

Below is how it looks like when there is no index:



So, the next time you write a hierarchical query, keep this in mind.




No comments:

Post a Comment