Finding the reasons for excessive logical IOs

There’s another interesting thread going on in Oracle-L, about understanding logical IOs and drilling down into their reasons. Of course sometimes (or rather usually) the excessive logical IOs come from a bad execution plan (when a nested loop loops over lots of datablocks again and again or a wrong index is used for driving a query etc), but sometimes the excessive LIOs are caused by some internal issues, like space management etc.

A convenient tool I use for reporting logical IO reasons is (again) my Snapper! It has  an option “b” for reporting Buffer get reasons or as I use below – option “a” shows All information Snapper can show.

There are couple of gotchas though which make this approach imperfect:

  1. The X$ tables Snapper uses for LIO reason reporting contain instance-wide counters, not specific to a single testing session. Thus you either need to be the single user in your database when experimenting and even then the background activity may increment some counters while you are testing too. I have sometimes suspended all other processes (kill -STOP and kill -CONT to resume)  or used Flash Freeze (oradebug ffbegin and ffresumeinst) to hang the whole instance that there would be no other activity going on.
  2. These buffer get reason counters are not maintained properly in Oracle 11g, probably due an optimization effort and some changes for faster pinning of buffer cache buffers (there’s a parameter called _fastpin_enable which is set to 1 in 11g and it enables so called fastpath buffer gets. If you see v$sesstat statistics such “consistent gets from cache (fastpath) or “db block gets from cache (fastpath)” being inremented, then fastpath buffer gets/pins are used. Note that I do have a script which works also on 11g but I’ll write about that one some time in the future :)

Anyway, if you are testing in an environment exclusively used by you, on Oracle 10.2 or lower, then you can run snapper with the gather=a option to report a bunch instance-level statistics in addition to the standard session-level stats:

  1. BUFG – Buffer get reasons (both consistent and current mode gets)
  2. LATG – Latch gets (both willing to wait and immediate gets)
  3. ENQG – Enqueue gets

Here’s an example, prepare for long output:

[Read more...]

What’s a good way to learn some Oracle internals every day?

Sometimes when an attendee describes me some totally weird problem during a seminar, I am immediately able to answer something like “Hey this looks like a bug related to this Oracle configuration and can be influenced by xyz”.

And then people ask me “How the hell do you know all this stuff?”

Well, I haven’t been bitten by all of these bugs myself, but I have been doing something for many years, almost every day… reading my email!

Oh, and additionally I have configured Metalink to send me daily updates about new/updated notes, forum articles and… bug descriptions!

The last part is very important. Bug descriptions tell you something about new bugs found (and old bugs rediscovered) and sometimes their details tell you an interesting piece or two about Oracle internals related to them.

[Read more...]

SystemTap is production supported in Redhat EL5.4

If you don’t know what SystemTap is – it’s the Linux world’s attempt to build Solaris DTrace style safe dynamic instrumentation into Linux kernel.

I’m not going into religious discussions which one is better here, I have used both SystemTap and DTrace successfully for diagnosing low level issues inside OS kernel, so both are good enough for me :)

The problem with SystemTap though has been that it’s not production quality, it’s rather been a technology preview.

But with RHEL 5.4 part of it has changed, Redhat says following in this article:

SystemTap is no longer a technology preview, and now has production support. Red Hat recommendeds that users run scripts on development machines before deployment in production environments. Since SystemTap is an optional diagnostic tool, users can easily stop using it in the event of a problem. Options such as -g for Guru mode, and -D* allow users to disable several security checks. Scripts using these options may not be supported.

Red Hat plans to fix problems in SystemTap, or the Linux kernel, as they arise in connection with new scripts. In some cases, a fix may include extending the blacklist for known areas of the Linux kernel that are unsafe to probe. All scripts that use probes targeting blacklisted areas will need to be revised.

SystemTap users are advised to upgrade to this version.


So, at least officially, SystemTap is now supported by Redhat. Well at least as long as you’re not doing crazy stuff with the -g option, the geek mode ;-)

I would still be very careful before using any SystemTap in production, in fact, in critial environments I wouldn’t run it at all, unless there is no other option (and you’re prepared to get a crash + kernel panic).

However the significance of this note is that SystemTap is accepted for production by Redhat in principle. So now it’s just matter of time until it gets stable and widely used enough to be as practical and useful as DTrace is on Solaris.

latch: cache buffers chains latch contention – a better way for finding the hot block

Here’s a treat for Oracle performance professionals and geeks who are looking for more systematic ways for cache buffers chains (CBC) latch contention troubleshooting. Cache buffers chains latches are taken when a process wants to walk through a cache buffer hash chain, looking if the block with required DBA (data block address) is in buffer cache. If the block happens to be in cache, then in most cases it has to be pinned first before use and unpinned after use, to make sure no-one else can perform an incompatible operation on that block at the same time. The modification of pin structures (pinning/unpinning) is also protected by CBC latches.

CBC latch contention can happen for multiple reasons, but one reason is that there is some really hot block in a SMP system with high number of CPUs (or CMT system with high number of threads like Sun T-series servers). Sometimes there happen to be multiple moderately hot blocks “under” the same CBC latch, which can result in latch contention again.

Traditionally DBAs used to look up the child latch address from V$SESSION_WAIT, sql_trace output or ASH and then look up all buffers protected by that latch from X$BH, using HLADDR column (HLADDR stands for Hash Latch Address). I also have a script for that, bhla.sql (Buffer Headers by Latch Address), which reports me all blocks currently in buffer cache, “under” that particular latch and the corresponding data block addresses and object names:

An interview with me

Iggy Fernandez posted an interview with me (published in NoCOUG journal) on his blog.

If you are interested in a little bit of my history and some more general (not-so-technical) thoughts of mine, check it out here:

http://iggyfernandez.wordpress.com/2009/08/08/great-expectations-an-interview-with-tanel-poder/

(Secret) Preview of Oracle 12g CBO leaked from Oracle labs…

After doing my seminar in Spain last month, one of the attendees sent me a secret note about what the CBO would look like in Oracle 12g. Apparently it’s re-engineered from ground and completely different from anything we’ve seen before. It goes beyond being automatic, intelligent and auto-tuning, it’s actually edible too!

The New CBO

The New CBO

Thanks to John Ospino for sending me this insider-info ;)

Using Perfsheet and TPT scripts for solving real life performance problems

Karl Arao has written a nice blog entry where he used also some of my tools for visualizing and understanding performance of a RAC cluster suddenly gone slow. He presents a systematic approach he used for gathering evidence and also hopefully you notice that having the ability to easily visualize performance data (with PerfSheet for example) can be very helpful and time saving when troubleshooting non-trivial problems.

Check his blog entry out here:

http://karlarao.wordpress.com/2009/06/07/diagnosing-and-resolving-gc-block-lost/

I have been troubleshooting since I was a kid! :)

People often ask how come I know so much about Oracle (and some Unix) internals and how do I have such a passion for such things.
Also, another thing which you can notice is that for some reason a large amount of hardcore computer hackers, low level troubleshooters and various internals experts happen to come from Russia.

And I can answer why this is so :)

A little bit of history: I was born in Estonia and lived over 20 of my first years there. From 1940 to 1991 Estonia was occupied by Soviet Union (excluding few years during World War II when we were occupied by German forces).

So, as I was born in the end of seventies, most of my childhood we were still part of Soviet Union. And that applies to all russians and other nationalities in that empire too.

  • …This meant that all we could buy was Soviet Union products.
  • …And all these products usually had quality issues.
  • …Which meant that whenever I got some toy as present for christmas or my birthday – I first had to FIX it and only then I could play with it! :)
  • …That usually applied to toy cars, constructors and whatever non-trivial pieces of toys. Some things needed minor tweaking, some required complete disassembly and reassembly (sometimes you needed to improvise and come up with the missing pieces yourself :)
  • …Later on it also applied to bikes etc. The bike could be working in principle but you still needed to tune it to get it running smoothly (and to make sure that it wouldn’t break when you went downhill the max speed you could get).

So, there’s always something good in everything bad – I had to learn troubleshooting since I was a little kid! ;)

Tracing Oracle SQL plan execution with DTrace

SQL is a declarative language – in other words you just declare what needs to be done and Oracle takes care of the part how it’s done.

However there’s nothing declarative about the actual SQL execution when it happens. SQL plan is just a tree of kernel rowsource functions executed in a specific order (defined in child cursor’s sql area).

The root of SQL plan is where the fetch function (opifch2 for example) gets the rows for passing back to the user (or PL/SQL engine).
The branches are operations like joins, union etc, which don’t have access to any data themselves and can just call other functions recursively to get rows
The leaves are the execution plan operations without any children, they call data layer to acces actual datablocks.

The first execution plan line (with lowest ID) without any children is the one where data access starts, that’s the place where first logical IO happens.

A commented exec plan is below:

SQL> select count(*) from all_users;

  COUNT(*)
----------
        35

SQL> @x

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  b2zqhgr5tzbpk, child number 0
-------------------------------------
select count(*) from all_users

Plan hash value: 3268326079

--------------------------------------------------------------------------
| Id  | Operation            | Name  | E-Rows |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------
|   1 |  SORT AGGREGATE      |       |      1 |       |       |          |       <- ROOT
|*  2 |   HASH JOIN          |       |     35 |  1517K|  1517K|  637K (0)| <- BRANCH
|*  3 |    HASH JOIN         |       |     35 |  1593K|  1593K| 1361K (0)| <- BRANCH
|   4 |     TABLE ACCESS FULL| TS$   |     13 |       |       |          |   <- LEAF
|*  5 |     TABLE ACCESS FULL| USER$ |     35 |       |       |          |    <- LEAF
|   6 |    TABLE ACCESS FULL | TS$   |     13 |       |       |          |    <- LEAF
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("U"."TEMPTS#"="TTS"."TS#")
   3 - access("U"."DATATS#"="DTS"."TS#")
   5 - filter("U"."TYPE#"=1)

I have written about how to map execution plan lines back to kernel functions here:

http://blog.tanelpoder.com/2008/06/15/advanced-oracle-troubleshooting-guide-part-6-understanding-oracle-execution-plans-with-os_explain/

The above approach is based on pstack, mostly useful for demonstrations but has helped me to diagnose one spinning condition in an execution plan once (that’s the whole reason I came up with this technique).

As I said above, SQL execution just means that the kernel’s rowsource functions are executed in a loop with order and hierarchy specified in the child cursor’s execution plan.

So, if you want to learn and really understand the sequence of SQL plan execution – it’s dead easy with DTrace. Here’s what happens when you fetch from the above execution plan:

[Read more...]

Read OS environment variables using DBMS_SYSTEM.GET_ENV()

Check out this article by Vikram Das about how to read OS environment variables using PL/SQL.

I did not know that!