NULL is not zero!

Some time ago I wrote a post about how COUNT(*) and COUNT(column) are semantically different things (link). Such queries may return different results if the counted column has NULLs in it. The difference comes from the fact that NULL is not a value; rather, it is a state that says “value unknown” or “no value entered”.
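As a quick illustration of that difference, here is a minimal sketch in Python. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Oracle, since both follow the standard rule that COUNT(column) skips NULLs; the table t and column a are just illustrative names.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table t(a int)")
conn.executemany("insert into t values (?)", [(1,), (None,)])  # None becomes NULL

# COUNT(*) counts rows; COUNT(a) counts rows where a is not NULL.
count_star, count_a = conn.execute("select count(*), count(a) from t").fetchone()
print("count(*) =", count_star, " count(a) =", count_a)

# NULL is a state, not a value: "a = null" is never true, "a is null" is the test.
eq_null = conn.execute("select count(*) from t where a = null").fetchone()[0]
is_null = conn.execute("select count(*) from t where a is null").fetchone()[0]
print("rows where a = null:", eq_null, " rows where a is null:", is_null)
```

The two counts differ by exactly the number of NULL rows, and the equality comparison against NULL matches nothing.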

So, you’d better understand how NULLs interact with your SQL constructs if you call yourself a DBA or a database developer ;-)

Here’s another example about how misunderstanding NULLs may cause your application to return different results than what was intended.

I will create a little table with TWO rows in it:

SQL> create table t(a int);

Table created.

SQL> insert into t values(1);

1 row created.

SQL> insert into t values(null);

1 row created.

SQL> select avg(a) from t;

 AVG(A)
----------
         1

When I take the average over these two rows, I get 1. AVG skips the NULL row, so it is averaging a single value: 1 / 1 = 1.

Now let’s update the NULL (no value) to 0 (an actual value of zero).

SQL> update t set a=0 where a is null;

1 row updated.

SQL> select avg(a) from t;

 AVG(A)
----------
        .5

As you can see, now that the other row holds an actual value (as opposed to “no value”), the AVG function takes that zero into account: (1 + 0) / 2 = 0.5.

Hopefully this illustrates once more that NULL does not mean zero or any other value; it means NO value. If you run aggregate functions (COUNT, AVG) over a column with NULLs, you must understand that Oracle treats NULLs as no value and leaves them out of the calculation, so your queries may behave differently from what your intuition says. And yes, it’s always good to read the documentation about what exactly a given SQL construct or function does in the given database engine instead of relying on “common sense”.
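The arithmetic behind the two AVG results can be sketched as follows. This is an illustration under assumptions, not Oracle output: it uses SQLite via Python's sqlite3 module as a stand-in, and COALESCE(a, 0) (also available in Oracle, alongside NVL) to show what you would write if you really did want NULLs counted as zero.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table t(a int)")
conn.executemany("insert into t values (?)", [(1,), (None,)])  # None becomes NULL

row = conn.execute(
    "select sum(a), count(a), count(*), avg(a), avg(coalesce(a, 0)) from t"
).fetchone()
total, non_null_rows, all_rows, avg_skipping_nulls, avg_nulls_as_zero = row

# AVG(a) is SUM(a) / COUNT(a): the NULL row is in neither the sum nor the count.
print("sum(a) / count(a) =", total / non_null_rows)
print("avg(a)            =", avg_skipping_nulls)

# Only an explicit conversion makes the NULL row count as a zero.
print("avg(coalesce(a,0)) =", avg_nulls_as_zero)
```

The point of the explicit COALESCE is that the decision to treat “no value” as zero is made visibly in the query rather than assumed.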

Measuring what matters

Cary Millsap’s recent post prompted me to write down some of the related thoughts in my head.

Here are a few of my mantras for systematic troubleshooting and performance tuning, which have taken shape in my head over years of work:

  • Picking the right starting point for troubleshooting and performance tuning is the most important decision in that process.
  • Pick the wrong starting point and you end up going in circles.
  • The scope of your performance data needs to match the scope of your problem, otherwise you end up going in circles.
  • If you don’t measure what matters, you may end up fixing what doesn’t matter.
  • If you’re not systematic in your troubleshooting, you may get lucky, but you don’t want to be dependent on luck! Moreover, you won’t need to be lucky if you are systematic in your work!
  • Performance tuning is overrated. Fixing fundamental design and coding flaws by changing a magic configuration parameter is a dream, just like getting slim and healthy by eating magic diet pills bought from a TV shop.
  • Your response times are too long for only two reasons:
  1. You are doing too much work
  2. You are waiting too much

…both of the above things can be measured in Oracle…

  • There’s no such thing as a slow database or a slow system. How can it be slow independently, without anyone experiencing this slowness?
    • If users say that a database is slow, they must be experiencing that somehow! The only way to experience database slowness is via a connection to it, in which case you’ll have a session (to measure).
    • If a monitoring system says that a database is slow, then it must be running some task and measuring its response time just like users do; otherwise it cannot reliably say that something is slow.
  • Performance is about one thing and one thing only: time. And time is measured in seconds, not in CPU utilization, the number of physical IOs or the looks of an execution plan.
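The idea of measuring one task's response time in seconds, and splitting it into time spent working and time spent waiting, can be sketched outside the database as well. The following is a minimal Python illustration, not an Oracle technique: measure, mostly_waiting and mostly_working are hypothetical names, and "waited" is simply approximated as elapsed time minus the CPU time of the process.

```python
import time

def measure(task):
    """Run task once and return its elapsed, CPU and waited time in seconds."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    # Approximation: whatever was not spent on CPU was spent waiting.
    return {"elapsed": elapsed, "cpu": cpu, "waited": max(elapsed - cpu, 0.0)}

def mostly_waiting():
    time.sleep(0.2)  # stand-in for waiting on I/O, locks or the network

def mostly_working():
    sum(i * i for i in range(500_000))  # stand-in for doing too much work

for task in (mostly_waiting, mostly_working):
    t = measure(task)
    print(f"{task.__name__}: elapsed={t['elapsed']:.3f}s "
          f"cpu={t['cpu']:.3f}s waited={t['waited']:.3f}s")
```

Both tasks may look equally “slow” to the caller; only the breakdown of the measured time shows whether the fix is to do less work or to wait less.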

Here’s a link to Cary Millsap’s awesome post. Read it!