Something About DB2: 2012

Tuesday, December 4, 2012

Optimizer: Fake it like you mean it

There is another useful utility that lets you fool the optimizer into thinking it has more resource than there is in the hardware. The utility is db2fopt.

This is useful if you are testing the optimizer in a test environment that is considerably less powerful than the production environment. eg, the sortheap in the production system might have 80000 pages, but your tiny test machine might max out at 2000. What do you do then? You can use db2fopt to fake it:

db2fopt <database> update opt_sortheap 80000

Easy.

It is a bit unfortunate that only these 4 are available to be 'faked' as of this writing ( for DB2 LUW v10)

opt_buffpage
opt_sortheap
opt_locklist
opt_maxlocks

Sunday, December 2, 2012

Replicating a database's 'statistics'

db2look has a useful option '-m' that allows you to 'replicate' a database's statistics, which is not unlike Oracle's DBMS_STATS.EXPORT_XXX_STATS. One advantage it has over the export is that it gives you an insight into the stats components required, in the form of individual UPDATE statements. The other advantage is that you don't have to prepare a table to be 'exported' or 'imported' into.

eg:

db2look -d sample -t t1 -m -o test.sql

This will 'look' into the database 'sample', pull stats only for table 't1', and redirects the output to file 'test.sql'.

One word: Convenient!

Tuesday, October 2, 2012

ColumnStore Index in SQLServer (2012)

This is obviously neither Oracle nor DB2 feature, but I like it so much that I have put it in my wishlist. For traditional row-based RDBMS, OLAP kind of activities can really hurt performance. On the flip side, newer column-based databases excel in OLAP, but suffer in OLTP type of activities. Forget about throwing in SSD and terrabytes of RAM to 'solve' the issues ( eg: Oracle's Exadata, IBM's pureSparta ), this new kind of index offers partial relief to traditional RBDMS in OLAP system. 'Best' of both worlds at a fairly low cost.

Ellison, are you listening?

Wednesday, September 26, 2012

CLUSTERING_FACTOR = CLUSTERFACTOR ?

Running, in Oracle compatible mode, you get a set of system catalog tables that 'resemble' what you see in Oracle. Column names similarity aside, there are plenty of gotchas that you need to be aware of.

eg: You get 'USER_INDEXES' in DB2 if you create the DB in Oracle compatible mode. However, the column values could mean something totally different.

If you see the column CLUSTERING_FACTOR in USER_INDEXES, it is definitely related to index 'clustering' , but the value is no way what you expect from a 'real' Oracle DB. This CLUSTERING_FACTOR is actually the CLUSTERFACTOR in DB2's original 'syscat.indexes'. If you have a table with , say, 100000 pages, you might have a CLUSTERING_FACTOR of just 1, which can be very puzzling to those coming straight from the Oracle world.

To confuse matter, there is another value called 'CLUSTERRATIO' in syscat.indexes, which is also an indicator of the data clustering ...

I will just leave a 'WARNING' sign here ....

Wednesday, September 19, 2012

Query Results Depends on Processing SPEED?

Let's say you can go back in time and run a query over and over again. Let's also assume that there are many other concurrent DML on the tables *while* the query is executing. Do you want your query result to be
1) the same , everytime
2) anything, depends on what is happening to the rows.

??

I prefer option#1.

It is important to note at this point that none of the isolation level in DB2 , (RR, RS, CS, UR) can give you option#1. RR, RS, and CS would only prevent modifications *after* those matching rows have been fetched. So, it is still possible to modify some rows that are supposed to match the criteria of the query *before* they are fetched, and that may mean the query result can differ from run to run.

Sounds fun? ;)

Monday, September 17, 2012

RUNSTATS ahoy!

I have to say, this has got to be the biggest nasty surprise of all! Again, this is the standard blocking issue we all frown upon.

Test case:

Disable auto-commit. Perform runstats on a table, say, T1:

RUNSTATS ON TABLE T1

Start up another CLP session, issue this:

SELECT * FROM T1

What do you get? Nothing! Well, it is more like 'waiting for something to happen', but it is as good as nothing.

This SELECT session is waiting on a lock type CatCacheLock, presumably the RUNSTATS command has an exclusive lock on this, and this SELECT session needs it in EXCLUSIVE mode too ....why exclusive mode? Hmmm ...another Alfred Hitchcock moment ... :)

What is the damage then, you may ask. Imagine this scenario: you have a huge table with billions of rows, and you are running RUNSTATS on it, and at the same time, there are some users wanting to read this table ... you will get an ultimate test of patience ;)

Update: This behaviour seems to be timing-sensitive. So it seems that the other read session only requests for an exclusive 'loading lock' after runstats ends and the read begins.

Referential Integrity and Locks

When it comes to enforcing parent-child constraint, DB2 is not as aggressive compared to Oracle. This can be demonstrated easily:

DB2:

Create a table P1 ( with a single column C1), and another table C1 which references P1 as foreign key owner. Populate Table P1 so that value '1' is in there.
(ok, I admit the choice of table name C1 is confusing at best, given that there is a column name called 'c1' in P1 ... )

Remember to commit.

Now, start up another CLP and issue this:

LOCK TABLE C1 IN EXCLUSIVE MODE

At this point, this CLP session would have exclusive lock on table C1.

Now, start up another CLP session, and issue this:

UPDATE P1 SET C1 = 2 WHERE C1 = 1

You will have no problem with that. The update will succeed. There is only one IX lock on table P1 which is owned by this session.

For Oracle, the situation is different. When you issue UPDATE P1 SET C1 = 2 WHERE C1 = 1, this session will attempt to obtain an SX lock (mode 3) on table C1, and it will block because of the exclusive lock by another session ( with the 'LOCK TABLE xxxx' command).

It does raise the question: which one is 'safer'? What if a single call is needed to get a consistent view of , say, join between P1 and C1? The answer will be revealed in future post ...

Saturday, September 15, 2012

Constraint Violation and Locks

Came across this nasty surprise while doing some casual testing. It is a surprise because this is something that seems unnecessary.

Here's a quick look at the problem. First, make sure auto-commit is off.

a) create a simple table T1 with a unique column:

CREATE TABLE T1 ( C1 NUMBER NOT NULL PRIMARY KEY)

b) insert a single row into T1

INSERT INTO T1 VALUES (1)

COMMIT

c) Now, let's try to do the same insert:

INSERT INTO T1 VALUES (1)

DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0803N One or more values in the INSERT statement, UPDATE statement, or
foreign key update caused by a DELETE statement are not valid because the
primary key, unique constraint or unique index identified by "1" constrains
table "DB2INST1.T1" from having duplicate values for the index key.
SQLSTATE=23505

**********************

So far so good. We have triggered a constraint violation.

d) Now, start up another CLP session and issue this:

UPDATE T1 SET C1 = 2 WHERE C1 = 1

You will notice that this session 'stalls'. Why? Because it is blocked. The question is: blocked by what?

==> It is blocked by the previous INSERT session that caused the constraint violation.

Comment: In my opinion, it is very strange that DB2 keeps the locks for an INSERT that has failed. This behaviour is hindering concurrency. It should just simply release those locks associated with the failed constraint and be done with it.

Wednesday, July 25, 2012

Data Change Table Reference (DCTR)

Here's a 'weird' feature that can be put to good use. Let's delete a few rows from a table *AND* retrieve it within a single statement:

db2 => select * from t1 ;

V1
------------------------
2
3
4
5
6
7
8
100
10

9 record(s) selected.

db2 => select * from old table ( delete from t1 where v1 > 8);

V1
------------------------
100
10

2 record(s) selected.

You can see how the 'deleted' rows are being retrieved. The other two options are 'NEW TABLE' and 'FINAL TABLE', which work for INSERT/UPDATE. The advantage of using this? -> Less locks! Locks are scarce commodity in DB2. The less yo need it, the more scalable the DB can be.

Tuesday, July 24, 2012

I Wish ...

I wish one day, one RDBMS would implement something similar to the *combination* of Oracle's in-statement hints, and DB2's XML optimization profile. I call it 'best of both worlds' ;)

Reasons:

Oracle's hints make it easy to do testing if you have full source control, but a real pain or close to impossible to fix embedded SQL statements that you have no access to.

DB2's XML profile allows you to fix pretty much everything, but a real pain to use, especially when all you need is to whip out some simple test cases.

ps: Oracle's stored outline( which is, sadly, not supported in 11g anymore) has great potential, but the lack of proper/easy way to manipulate it makes it even more pain in the neck to use than DB2's XML profile. Why do these designers like to make something useful so painfully difficult to use, and waste so much time making almost useless things (eg, Enterprise Manager) so easy to use?

Monday, July 23, 2012

MOD and MOD

Take note when you use MOD in either of these environment, especially if you need the fractional part of the final arithmetic.

Here's the behaviour of the MOD function in Oracle:

SQL> select mod(10.1,3) from dual ;

MOD(10.1,3)
-----------
1.1

SQL>

And here is how DB2 does it:

db2 => select mod(10.1,3) from sysibm.sysdummy1;

1
-----------
1

1 record(s) selected.

I shall emphasize that *BOTH* implementations are 'correct'.

Sunday, July 15, 2012

Philosophy

The next time you encounter a problem ( in anything), instead of asking "why doesn't it work the way I want it?" , you should ask "what don't I know to make it work?"

More often than not, technology/tool is not the problem; the problem is YOU! ;)

Wednesday, July 11, 2012

LEVEL trick that fails

This simple trick works fine in Oracle, but fails in DB2 in Oracle compatible mode:

db2 => select rownum from dual connect by level <= 10;

ROWNUM
--------------------
SQL20451N Cycle detected in a hierarchical query. SQLSTATE=560CO

So, at least for now ( July 2012), Oracle folks would have lost a handy way to generate data in DB2 under compatible mode.

Tuesday, July 10, 2012

ROWNUM 'error'

Here's a trivial and almost pointless example of 'incompatible' implementation of Oracle's commonly used 'rownum':

Oracle:

SQL> SELECT rownum FROM T1 WHERE ROWNUM = 2;

< no rows returned >

DB2 v10:

db2 => select rownum from T1 where rownum = 2;

ROWNUM
--------------------
1

1 record(s) selected.

See the 'strange-ness'? :)

We have a match for rownum = 2 , but DB2 returns '1' as the resultset.

No, this is not a bug. This is a consequence of the 'workaround' of Oracle's 'rownum' in DB2. This, in practice, is inconsequential, because no Oracle programmer would write something like 'rownum = 2' ;) Ok, maybe some do. Those who do that deserve to get some bugs in their code anyway ;)

LOCK Horror

This post and the next few are intended to address the murky area of Oracle-compatibility in DB2 v10. The biggest change, in my opinion, is the CURRENTLY COMMITTED semantic, which mimics the way Oracle performs non-blocking read.

There are a few gotchas though. Here's the first one:

In Oracle, you can do this:

Session#1:

SQL> LOCK TABLE T1 IN EXCLUSIVE MODE;

Table(s) Locked.

Session#2:

SQL> SELECT * FROM T1;

C1
----------
1
2

Notice how Oracle reads 'around' the locks in this extreme case.

However, in DB2v10,

Session#1:

SQL> LOCK TABLE T1 IN EXCLUSIVE MODE;

Table(s) Locked.

Session#2:

SQL> SELECT * FROM T1;

< this session will wait without any output >

This is one area where DB2 can't read 'around' the locks. This is probably working as designed, but for Oracle folks who are told that it is easy to code in DB2 using their original skills, this can come as a nasty surprise, not least a hidden bug in hiding.