(Secret) Preview of Oracle 12g CBO leaked from Oracle labs…

After doing my seminar in Spain last month, one of the attendees sent me a secret note about what the CBO would look like in Oracle 12g. Apparently it’s re-engineered from ground and completely different from anything we’ve seen before. It goes beyond being automatic, intelligent and auto-tuning, it’s actually edible too!

The New CBO

The New CBO

Thanks to John Ospino for sending me this insider-info ;)

Oracle memory troubleshooting, Part 3: Automatic top subheap dumping with heapdump

If you haven’t read them – here are the previous articles in Oracle memory troubleshooting series: Part 1, Part 2

In Oracle, the HEAPDUMP dump event in Oracle allows you to dump various heap contents to tracefile. With adding the “level” parameter to this dump event, you can specify which heaps to dump.

Julian Dyke has documented most of the levels here.

There are two little known, but useful level bits for heapdumps – bit 0×10000000 and 0×20000000. These allow Oracle to dump top-5 biggest subheaps in a heap recursively.

When bit 0×10000000 is enabled then Oracle dumps the top-5 subheaps inside any heap its instructed to dump.
When bit 0×20000000 is enabled then Oracle dumps the top-5 subheaps as mentioned above, but in addition Oracle recursively dumps top-5 subheaps inside any subheaps automatically dumped above. So instead of dumping 1 level of subheaps, Oracle recursively dumps 2 levels if such sub-sub-heaps exist.

This reduces the amount of manual work – as Oracle can drill down towards the root cause automatically and dump the relevant information.

Here’s a little test case:

I set the serverout buffer unlimited (Oracle 10.2+) so that Oracle would buffer unlimited amount of dbms_output data in UGA (this is also a “good” way for using up all the memory in your server so you use “unlimited” with care).

SQL> set serverout on size unlimited
SQL>
SQL> exec for i in 1..1000000 loop dbms_output.put_line(lpad('x',10,'x')); end loop;

Without the recursive top subheap dumping we would see output like this (after processing the tracefile with heapdump_analyzer):

This is the usual way for dumping a target process private heap:

SQL> oradebug setorapid 35
Oracle pid: 35, Unix process pid: 26457, image: oracle@linux03
SQL> oradebug dump heapdump 1     <-- level 1 dumps all top level private heaps (top-level PGA,UGA and call heap)
Statement processed.
SQL>

And the output is:

[oracle@linux03 trace]$ heapdump_analyzer LIN11G_ora_26486.trc

  -- Heapdump Analyzer v1.00 by Tanel Poder ( http://www.tanelpoder.com )

  Total_size #Chunks  Chunk_size,        From_heap,       Chunk_type,  Alloc_reason
  ---------- ------- ------------ ----------------- ----------------- -----------------
    55065316     841      65476 ,     top uga heap,         freeable,  session heap          <-- session heap is allocated from top uga heap
    41218392    2517      16376 ,     session heap,         freeable,  koh-kghu sessi      <-- koh-kghu session heap is allocated from session heap
    13650208     836      16328 ,     session heap,             free,
       65520       1      65520 ,         pga heap,             free,
       65476       1      65476 ,     top uga heap,         recreate,  session heap
       57736      14       4124 ,     session heap,         freeable,  kxsFrame4kPage
...

We would see that most of the uga memory (roughly 41MB of 55MB) is allocated for for some reason “koh-kghu sessi”. This is a session heap where from allocations for various objects like PL/SQL variables, records and collections are done. So when we’d want to drill down and see inside that heap we could use the HEAPDUMP_ADDR dump with that heap descriptors address as parameter to look inside it.

However with these extra bits mentioned above, Oracle can automatically dump us the contents of biggest subheaps inside the heaps we asked it to dump:

[Read more...]

Links section

I have added a links section into my blog where I put links to useful external documents and tools what I often refer to during my seminars and consulting.

I will make additions to that page over time…

To see the list, you can click on the “Seminar Links” in top right section of the blog page or just click here:

http://blog.tanelpoder.com/seminar/seminar-links/

Using Perfsheet and TPT scripts for solving real life performance problems

Karl Arao has written a nice blog entry where he used also some of my tools for visualizing and understanding performance of a RAC cluster suddenly gone slow. He presents a systematic approach he used for gathering evidence and also hopefully you notice that having the ability to easily visualize performance data (with PerfSheet for example) can be very helpful and time saving when troubleshooting non-trivial problems.

Check his blog entry out here:

http://karlarao.wordpress.com/2009/06/07/diagnosing-and-resolving-gc-block-lost/

Advanced Oracle Troubleshooting Seminar list for 2009

This is the final list of seminars planned for 2009.

Except these I won’t be doing any more public AOT seminars this year as my calendar is almost full up to January 2010.

So if you like my writing, Oracle internals & OS touchpoint, and want to troubleshoot better, check out events in this list :)

15-17. June – Trutek @ Denver, COhttp://www.trutek.com/index.php?id=165

9-11. September – Hotsos @ Dallas, TXhttps://portal.hotsos.com/education/SEM/

14-16. September – Hotsos @ Washington, DChttps://portal.hotsos.com/education/SEM/

5-6. October – Miracle @ London, UKhttp://www.miraclebenelux.nl/tanel/

8-9. October – Miracle @ Helsinki, Finlandhttp://www.miracleoy.fi/

2-3. November – Miracle @ Utrecht, Netherlandshttp://www.miraclebenelux.nl/tanel/

5-6. November – Miracle @ Brussels, Belgiumhttp://www.miraclebenelux.nl/tanel/

10-11. November – Northern California (SFO Bay Area, more details in mid-June)

3-4. December – Miracle @ Birmingham, UK (right after UKOUG conference) – http://www.miraclebenelux.nl/tanel/

ORA-04031 errors and monitoring shared pool subpool memory utilization with sgastatx.sql

Since Oracle 9.2 the shared pool can be “partitioned” into multiple parts. This was probably done for relieving shared pool latch contention for crappy applications (which use shared pool latches too much due bad cursor or connection management).

The “partitions” are called shared pool subpools and there can be up to 7 subpools. Each subpool is protected by a separate shared pool latch and each subpool has its own freelists and LRU list. If you are interested in more details, a good starting point is this whitepaper by Oracle.

There are few different ways for detecting how many subpools you have in use. The more convenient ones are here:

You could query X$KGHLU which has a line for each shared pool subpool and (from 10g) also java pool if it’s defined:

SQL> select count(distinct kghluidx) num_subpools
  2  from x$kghlu
  3  where kghlushrpool = 1;

NUM_SUBPOOLS
------------
           7

The “kghlushrpool” column, which is 1 for shared pool subheaps and 0 for java pool, isn’t there in 9i (and in 9i the java pool apparently is not reported in x$kghlu anyway).
The reason why I don’t just count all matching lines from x$kghlu but use count distinct instead is that in Oracle 10.2.0.1 there are 4x more lines reported in this x$table. There’s an additional concept called sub-sub-pool starting from 10.2 where each shared pool sub-pool is split futher into 4 areas (allocations with different expected lifetime/durations go into different sub-sub-pools, but the same sub-pool latch protects all activity in sub-sub pools too). But in 10.2.0.1 the x$kghlu reports all sub-sub-pools too for some reason. The whitepaper from Oracle mentioned above explains this in more detail.

So from above output I see that in my instance all 7 shared pool subpools are in use. Oracle determines the number of needed subpools (during instance startup) based on your shared pool size and cpu_count. IIRC in 9.2 if you had 4 CPUs or more AND the shared_pool_size was bigger than 256 MB then 2 subpools were used, in 10g shared_pool_size had to be bigger for that, 512 MB I think and in 11g its 1GB. I don’t recall the exact threshold values and that’s not really important as you can see yourself how many subpools are in use with the above query.

For sake of this experiment I set the _kghdsidx_count variable to 7, this parameter can be used to force the number of subpools you want. In 9.2 days it was actually quite common to set this back to 1 IF you had ORA-4031 errors AND the reason was diagnosed to be free space imbalance between subpools. However since 10g this has been almost unnecessary as Oracle has improved their heap management algorithms.

SQL> @pd kghdsidx

NAME                                          VALUE                          DESCRIPTION
--------------------------------------------- ------------------------------ ------------------
_kghdsidx_count                               7                              max kghdsidx count

The script above queries few X$ tables to show the value of this hidden parameter.

So far the two above approaches have required access to X$ tables which usually means you need to be logged on as SYSDBA. What if you don’t have such access?

In such case you can work this out pretty reliably by looking into how many of the shared pool latches are actually in use. All 7 latches are always there, even if you have less subpools in use, that number is hardcoded into Oracle. But you can see how many latches have a significant number of gets against them.

In my case its evident that all latches are in use, they all have significant number of gets against them:

[Read more...]