One crucial number that is missing from the standard plan generated by 'EXPLAIN PLAN' is the actual rows flowing in each node of the graph. Without it, it is close to impossible to judge the efficiency of the plan. How do you decide if the 'optimizer has underestimated/overestimated the row source' at any point of the execution?
A pretty neat feature which first appeared in v9.7 and unfortunately still largely unknown in v10.1, is the 'section actuals', which can be set at the database level as SECTION_ACTUALS, or at the individual application using WLM_SET_CONN_ENV.
You will also need to capture the stats using activity event monitor, eg: ALTER WORKLOAD WL1 COLLECT ACTIVITY DATA WITH DETAILS,SECTION, etc etc ....
But assuming that all you want to do is to do simple test on certain query, there is an easier way. Hidden somewhere in the sea of binaries is a utility called 'db2caem'. It's really easy to use.
Let's illustrate this using a simple example. Assuming that you want to test your query 'select count(*) from t1', all you need to do is issue this:
db2caem -d db2test -st "select count(*) from t1"
____________________________________________________________________
_____ D B 2 C A E M _____
DB2 Capture Activity Event Monitor data tool
I B M
The DB2CAEM Tool is a utility for capturing the activity event
monitor data with details, section and values, as well as actuals.
which could be used when analyze the SQL statement performance.
____________________________________________________________________
________________________________________________________________________________
DBT7041I "db2caem": The db2caem utility is connecting to the following database: "db2test".
DBT7038I "db2caem": The db2caem utility successfully connected to the following database: "db2test".
DBT7042I "db2caem": The SQL statement "select count(*) from t1" is being issued.
DBT7043I "db2caem": The db2caem utility is disconnecting from the following database: "db2test".
DBT7039I "db2caem": The db2caem utility successfully disconnected from the following database: "db2test".
DBT7000I db2caem completed. Output path: "C:\\DB2CAEM_2013-05-22-16.58.09.762000".
That's pretty much to it. The execution plan you want is in the directory ( as shown on the last line) "C:\\DB2CAEM_2013-05-22-16.58.09.762000"
Look for the file named 'db2caem.exfmt' inside, and you can see something like below:
-------------------------------
< some stuff removed for clarity>
Original Statement:
------------------
select
count(*)
from
t1
Optimized Statement:
-------------------
SELECT
Q3.$C0
FROM
(SELECT
COUNT(*)
FROM
(SELECT
$RID$
FROM
XXXXXX.T1 AS Q1
) AS Q2
) AS Q3
Explain level: Explain from section
Access Plan:
-----------
Total Cost: 78.1959
Query Degree: 1
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
1
1
GRPBY
( 2)
78.1957
NA
|
10001
10001
IXSCAN
( 3)
77.8301
NA
|
-1
NA
INDEX: XXXXXXX
T1X2
Q1
<more stuff removed> ...
Note that now there are *two* numbers, one is the standard estimated rows, and the one immediately below it is the *actual* rows.
Easy eh? :)
While this is all good, unfortunately the interpretation of the numbers can be very confusing for certain execution plan. I shall leave that for another post ...
A pretty neat feature which first appeared in v9.7 and unfortunately still largely unknown in v10.1, is the 'section actuals', which can be set at the database level as SECTION_ACTUALS, or at the individual application using WLM_SET_CONN_ENV.
You will also need to capture the stats using activity event monitor, eg: ALTER WORKLOAD WL1 COLLECT ACTIVITY DATA WITH DETAILS,SECTION, etc etc ....
But assuming that all you want to do is to do simple test on certain query, there is an easier way. Hidden somewhere in the sea of binaries is a utility called 'db2caem'. It's really easy to use.
Let's illustrate this using a simple example. Assuming that you want to test your query 'select count(*) from t1', all you need to do is issue this:
db2caem -d db2test -st "select count(*) from t1"
____________________________________________________________________
_____ D B 2 C A E M _____
DB2 Capture Activity Event Monitor data tool
I B M
The DB2CAEM Tool is a utility for capturing the activity event
monitor data with details, section and values, as well as actuals.
which could be used when analyze the SQL statement performance.
____________________________________________________________________
________________________________________________________________________________
DBT7041I "db2caem": The db2caem utility is connecting to the following database: "db2test".
DBT7038I "db2caem": The db2caem utility successfully connected to the following database: "db2test".
DBT7042I "db2caem": The SQL statement "select count(*) from t1" is being issued.
DBT7043I "db2caem": The db2caem utility is disconnecting from the following database: "db2test".
DBT7039I "db2caem": The db2caem utility successfully disconnected from the following database: "db2test".
DBT7000I db2caem completed. Output path: "C:\\DB2CAEM_2013-05-22-16.58.09.762000".
That's pretty much to it. The execution plan you want is in the directory ( as shown on the last line) "C:\\DB2CAEM_2013-05-22-16.58.09.762000"
Look for the file named 'db2caem.exfmt' inside, and you can see something like below:
-------------------------------
< some stuff removed for clarity>
Original Statement:
------------------
select
count(*)
from
t1
Optimized Statement:
-------------------
SELECT
Q3.$C0
FROM
(SELECT
COUNT(*)
FROM
(SELECT
$RID$
FROM
XXXXXX.T1 AS Q1
) AS Q2
) AS Q3
Explain level: Explain from section
Access Plan:
-----------
Total Cost: 78.1959
Query Degree: 1
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
1
1
GRPBY
( 2)
78.1957
NA
|
10001
10001
IXSCAN
( 3)
77.8301
NA
|
-1
NA
INDEX: XXXXXXX
T1X2
Q1
<more stuff removed> ...
Note that now there are *two* numbers, one is the standard estimated rows, and the one immediately below it is the *actual* rows.
Easy eh? :)
While this is all good, unfortunately the interpretation of the numbers can be very confusing for certain execution plan. I shall leave that for another post ...
No comments:
Post a Comment