Asynch descriptor resize wait event in Oracle

A lot of people have started seeing “asynch descriptor resize” wait event in Oracle 11gR2. Here’s my understanding of what it is. Note that I didn’t spend too much time researching it, so some details may be not completely accurate, but my explanation will at least give you an idea of why the heck you suddenly see this event in your database.

FYI, there’s a short, but incomplete explanation of this wait event also documented in MOS Note 1081977.1

Update: There’s a bug and a patch related to this wait event too.

The “direct path loader” (KCBL) module is used for performing direct path IO in Oracle, such as direct path segment scans and reading/writing spilled over workareas in temporary tablespace. Direct path IO is used whenever you see “direct path read/write*” wait events reported in your session. This means that IOs aren’t done from/to buffer cache, but from/to PGA directly, bypassing the buffer cache.

 

This KCBL module tries to dynamically scale up the number of asynch IO descriptors (AIO descriptors are the OS kernel structures, which keep track of asynch IO requests) to match the number of direct path IO slots a process uses. In other words, if the PGA workarea and/or spilled-over hash area in temp tablespace gets larger, Oracle also scales up the number of direct IO slots. Direct IO slots are PGA memory structures helping to do direct IO between files and PGA.

 

In order to be able to perform this direct IO asynchronously, Oracle also dynamically scales up the number of OS asynch IO descriptors, one for each slot (up to 4096 descriptors per process). When Oracle doesn’t need the direct IO slots anymore (when the direct path table scan has ended or a workarea/tempseg gets cancelled) then it scales down the number of direct IO slots and asynch IO descriptors. Scaling asynch IO descriptors up/down requires issuing syscalls to OS (as the AIO descriptors are OS kernel structures).

 

I guess this is supposed to be an optimization, to avoid running out of OS AIO descriptors, by releasing them when not they’re not needed, but as that Metalink note mentioned, the resize apparently sucks on Linux. Perhaps that’s why other ports also suffer and have seen the same wait event.

 

The “asynch descriptor resize” event itself is really an IO wait event (recorded in the wait class Other though), waiting for reaping outstanding IOs. Once this wait is over, then the OS call to change the amount of asynch IO descriptors (allocated to that process) is made. There’s no wait event recorded for the actual “resize” OS call as it shouldn’t block.

 

So, the more direct IO you do, especially when sorting/hashing to temp with frequent workarea closing/opening, the more of this event you’ll see (and it’s probably the same for regular tablespace direct path IO too).

 

This problem wouldn’t be noticeable if Oracle kept async io descriptors cached and wouldn’t constantly allocated/free them. Of course then you may end up running out of aio descriptors in the whole server easier. Also I don’t know whether there would be some OS issues with reusing cached aio descriptors, perhaps there is a good reason why such caching isn’t done.

 

Nevertheless, what’s causing this wait event is too frequent aio descriptor resize due to changes in direct IO slot count (due to changes in PGA workarea/temp segment and perhaps when doing frequent direct path scans through lots of tables/partitions too).

 

So, the obvious question here is what to do about this wait event? Well, first you should check how big part of your total response time this event takes at all?

 

  1. If it’s someting like 1% of your response time, then this is not your problem anyway and troubleshooting this further would be not practical – it’s just how Oracle works :)
  2. If it’s something like 20% or more of your response time, then it’s clearly a problem and you’d need to talk to Oracle Support to sort out the bug
  3. If it’s anything in between, make sure you don’t have an IO problem first, before telling that this is a bug. In one recent example I saw direct path reads take over a second on average when this problem popped up. The asynch descriptor resize wait event may well disappear from the radar once you fix the root cause – slow IO (or SQL doing too much IO). Remember, the asynch descriptor resize wait event, at least on Linux, is actually an IO wait event, the process is waiting for outstanding IO completion before the descriptor count increase/decrease can take place.

Read currently running SQL statement’s bind variable values using V$SQL_MONITOR.BIND_XML in Oracle 11.2

The title pretty much says it. In Oracle 11.2, if you have the Diag+Tuning Pack licenses and the SQL monitoring kicks in for your SQL statement, then instead of the old fashioned ERRORSTACK dump reading you can just query the V$SQL_MONITOR.BIND_XML to find the values and metadata of your SQL statement’s bind variables.

I’ve written an example here:

And a related comment – V$SQL_BIND_CAPTURE is not a reliable way for identifying the current bind variable values in use. Oracle’s bind capture mechanism does not capture every single bind variable into SGA (it would slow down apps which run lots of short statements with bind variables). The bind capture only selectively samples bind values, during the first execution of a new cursor and then every 15 minutes from there (controlled by _cursor_bind_capture_interval parameter), assuming that new executions of that same cursor are still being started (the capture happens only when execution starts, not later during the execution).

Exadata v2 Smart Scan Performance Troubleshooting article

I finally finished my first Exadata performance troubleshooting article.

This explains one bug I did hit when stress testing an Exadata v2 box, which caused smart scan to go very slow – and how I troubleshooted it:

Thanks to my secret startup company I’ve been way too busy to write anything serious lately, but apparently staying up until 6am helped this time! :-) Anyway, maybe next weekend I can repeat this and write Part 2 in the Exadata troubleshooting series ;-)

Enjoy! Comments are welcome to this blog entry as I haven’t figured out a good way to enable comments in the google sites page I’m using…

Dropping and creating tables in read only tablespaces?!

You probably already know that it’s possible to drop tables in Oracle read only tablespaces… (You did know that already, right? ;-) Here’s a little example:

SQL> create tablespace ronly datafile '/u03/oradata/LIN112/ronly.01.dbf' size 10m;

Tablespace created.

SQL> create table test tablespace ronly as select * from all_users;

Table created.

SQL> alter tablespace ronly READ ONLY;

Tablespace altered.

SQL> drop table test;

Table dropped.

I just dropped a table from a read only tablespace! Well, perhaps it’s because that instead of dropping the table was put into recyclebin instead (which is a data dictionary update)? Let’s check which segments remain in the RONLY tablespace:

SQL> select owner,segment_name,segment_type from dba_segments where tablespace_name = 'RONLY';

OWNER   SEGMENT_NAME                    SEGMENT_TYPE
------- ------------------------------- ------------------
TANEL   BIN$ix7rAUXZfB3gQKjAgS4LXg==$0  TABLE

Indeed, it seems that the table segment wasn’t actually dropped. Well, let’s purge the recycle bin to try to actually drop the table segment:

SQL> purge recyclebin;

Recyclebin purged.

SQL> select owner,segment_name,segment_type from dba_segments where tablespace_name = 'RONLY';

OWNER    SEGMENT_NAME                   SEGMENT_TYPE
-------- ------------------------------ ------------------
TANEL    9.130                          TEMPORARY

Wow, Oracle has converted the table segment into a temporary segment instead (see segment_type)! Bur our tablespace is read only, how can it do that?! The answer is that neither the regular DROP nor DROP PURGE need to write anything into the tablespace where the segment resides! The initial DROP operation just updated data dictionary, like renaming the table to BIN$… in OBJ$ and so on. The second DROP PURGE operation just ran a bunch of deletes against data dictionary to indicate that the table object is gone. But why is the TEMPORARY segment left behind? This has to do with locally managed tablespaces. Before LMT days, when you dropped a segment, then the segment space was released and acquired back to tablespace through inserts/updates to UET$/FET$ (used/free extents) base tables, which resided in system tablespace like all other data dictionary base tables. But with LMTs, the free space information is kept in bitmaps in the tablespace files themselves! Thus, if you drop a table in a read only LMT tablespace, the table will be gone, but the space will not be physically released (as you can’t update the LMT bitmaps in read only tablespace files). However, Oracle doesn’t want to lose that space should someone make the tablespace read write later on, so the table segment is updated to be a TEMPORARY segment instead of completely deleting it from data dictionary. That’s how the SMON can clean it up later on should that tablespace become read-write again. The 9.130 in SEGMENT_NAME column means relative file# 9 and starting block# 130, that’s a segment’s unique identifier in a tablespace. Let’s move on. This example is executed on Oracle 11.2, while logged on to a non-SYS/SYSTEM user:

SQL> select status from dba_tablespaces where tablespace_name = 'RONLY';

STATUS
---------
READ ONLY

The tablespace is still in read only status. Let’s try to CREATE a table into that tablespace:

SQL> create table test(a int) tablespace ronly;

Table created.

What? I can also CREATE a table into a read only tablespace?! Well, this is the behavior you get starting from 11gR2 onwards, it’s called deferred segment creation, Oracle doesn’t need to create any segments for a table until you actually insert rows into it! So, Oracle creates all needed metadata about the new table in data dictionary, but doesn’t actually allocate any space from the tablespace. This applies to other segment types like index and table/index partitions as well. There’s new syntax which controls the deferred segment creation:

SQL> drop table test purge;

Table dropped.

SQL> create table test (a int) segment creation IMMEDIATE tablespace ronly;
create table test (a int) segment creation IMMEDIATE tablespace ronly
*
ERROR at line 1:
ORA-01647: tablespace 'RONLY' is read-only, cannot allocate space in it

In the above case I disabled the deferred segment creation and got an error immediately as Oracle tried to allocate space from the read only tablespace. Let’s try the other way:

SQL> create table test (a int) segment creation DEFERRED tablespace ronly;

Table created.

Now, with deferred segment creation enabled, Oracle didn’t try to allocate space from the read only tablespace. Let’s look into the segments in that tablespace again:

SQL> select owner,segment_name,segment_type from dba_segments where tablespace_name = 'RONLY';

OWNER        SEGMENT_NAME    SEGMENT_TYPE
------------ --------------- ---------------
TANEL        9.130           TEMPORARY

We still have that old segment from an earlier DROP TABLE operation waiting to be cleaned up (when the tablespace goes into read-write mode again), but no segment for our latest test table created. Let’s make the tablespace read write and check again:

SQL> alter tablespace ronly read write;

Tablespace altered.

SQL> select owner,segment_name,segment_type from dba_segments where tablespace_name = 'RONLY';

no rows selected

Apparently SMON has kicked in already and cleaned up the table segment which had been marked temporary earlier when I dropped the table in a read only tablespace.

Note that tables in SYS and SYSTEM schema can not use deferred segment creation:

SQL> create table sys.test (a int) segment creation deferred;
create table sys.test (a int) segment creation deferred
*
ERROR at line 1:
ORA-14223: Deferred segment creation is not supported for this table

SQL> create table system.test (a int) segment creation deferred;
create table system.test (a int) segment creation deferred
*
ERROR at line 1:
ORA-14223: Deferred segment creation is not supported for this table

SQL> create table tanel.test (a int) segment creation deferred;

Table created.

There’s also a parameter, deferred_segment_creation which controls the default behavior:

SQL> show parameter deferred

NAME_COL_PLUS_SHOW_PARAM     TYPE        VALUE
---------------------------- ----------- --------
deferred_segment_creation    boolean     TRUE

SQL>

KGH: NO ACCESS – Buffer cache inside streams pool too!

Some time ago I wrote that since Oracle 10.2, some of the buffer cache can physically reside within shared pool granules.

I just noticed this in an 11.2 instance:

SQL> select * from v$sgastat where name like ‘KGH%’;
POOL         NAME                            BYTES
------------ -------------------------- ----------
streams pool KGH: NO ACCESS                4186144
So, it looks that also streams pool can surrender parts of its memory granules to buffer cache, if it’s unable to flush everything out from the granule for complete granule handover.
Let’s see whether that’s the case:

SQL> select last_oper_type, last_oper_mode from v$sga_dynamic_components where component = 'streams pool';
LAST_OPER_TYP LAST_OPER
------------- ---------
SHRINK        DEFERRED
Yep, the last streams pool shrink operation was left in DEFERRED status, which means the granule wasn’t handed over – streams pool kept the granule for itself, marked everything it could flush out as KGH: NO ACCESS in its heap header and handed these chunks over to buffer cache manager.
Lets check whether these chunks are actually used by buffer cache buffers:
(NB! Think twice before running this query in production as it may hold your shared pool latches for very long time):
SQL> select ksmchidx,ksmchdur,ksmchcom,ksmchptr,ksmchsiz,ksmchcls from x$ksmsst where ksmchcom = 'KGH: NO ACCESS';
  KSMCHIDX   KSMCHDUR KSMCHCOM         KSMCHPTR           KSMCHSIZ KSMCHCLS
---------- ---------- ---------------- ---------------- ---------- --------
         1          4 KGH: NO ACCESS   00000003A2401FE0    4186144 no acce
So, there’s only one chunk flushed & handed over to buffer cache in the streams pool heap. The KSMCHPTR column shows the starting address of that chunk in SGA and the KSMCHSIZ is the size of that chunk.
So, let’s see if there are any buffers within that address range. First I’ll calculate the end address of that chunk (start address + size -1 = end address)

SQL> @calc 0x00000003A2401FE0 + 4186143
                     DEC                  HEX
------------------------ --------------------
         15611199487.000            3A27FFFFF
And now lets query X$BH using that address range to see if / how many buffer cache buffers have been placed in there:

SQL> select count(*) from x$bh where rawtohex(ba) between '00000003A2401FE0' and '00000003A27FFFFF';

  COUNT(*)
----------
       483
We have just proven, that there are 483 buffer cache buffers (~3.8MB, 8kB buffers) which reside physically in streams pool heap!

New seminars and dates announced

I have been very busy over last months (as you see from the lack of blog entries). Part of the reason is that I’ve been building new seminar material and now I’m pleased to announce some first seminar dates!

I have updated new seminar dates and cities in m new webpage:

From April 2010 I offer total 3 different seminars

I have rearranged the Advanced Oracle Troubleshooting class based on customer feedback, removed some content, added new content and I think this deserves a new version number, 2.0.

Also, I created an entirely new class Advanced Oracle SQL Tuning which should provide the same for SQL tuners that my Advanced Oracle Troubleshooting class has provided for database troubleshooters. This class will not start with CBO concepts and how SQL execution might work in theory, insead we will start from going very deep into understanding how Oracle really executes SQL execution plans and what is the data flow order and hierarchy in the execution plan tree.

From there we go on into learning how to read execution plans of any complexity and how to control SQL execution plans – how to make them do exactly what we want. And CBO topics will come in the end – by then the CBO fundamental concepts such as Cardinality, Density and IO/CPU Cost will make good sense and are not just some arbitrary names for some magic numbers coming from the optimizer ;-)

In addition, I separated Parallel Execution and Partitioning topics (which not everyone is using) into a separate 1-day seminar, Oracle Partitioning and Parallel Execution for Performance, which I usually deliver right after the 3-day SQL tuning seminar.

In coming days I also plan to upload some SQL tuning related content to tech.e2sn.com to show the quality of the upcoming seminar ;-)

So, feel free to check out the seminar dates and descriptions here:

Oracle 11gR2 has been released – and with column oriented storage option

You may already have noticed that Oracle 11gR2 for Linux is available for download on Oracle.com website, with documentation.

And this document ends speculation about whether Oracle 11.2 will support column-oriented storage – yes it will:

http://www.oracle.com/technology/products/database/oracle11g/pdf/oracle-database-11g-release2-overview.pdf

However, this is apparently available on Exadata storage only as a new error message below indicates:

ORA-64307: hybrid columnar compression is only supported in tablespaces residing on Exadata storage
Cause: An attempt was made to use hybrid columnar compression on unsupported storage.
Action: Create this table in a tablespace residing on Exadata storage or use a different compression type.

Update: Kevin Closson mentioned that 11gR2 doesn’t really have column oriented storage as some other products like Vertica’s and Sybase IQ use, but its rather just column oriented compression option where storage is still organized by row but individual fields in these rows use compression dictionaries whichcan span multiple block boundaries (we’ll thats my interpretation at least).

The 11gR2 release overview doc seems to be wrong in this case, as it says:

Hybrid columnar compression is a new method for organizing how data is stored. Instead of
storing the data in traditional rows, the data is grouped, ordered and stored one column at a time.

Read Kevin’s note here:

http://kevinclosson.wordpress.com/2009/09/01/oracle-switches-to-columnar-store-technology-with-oracle-database-11g-release-2/

ORA-64307: hybrid columnar compression is only supported in tablespaces residing on Exadata storage
Cause: An attempt was made to use hybrid columnar compression on unsupported storage.

Action: Create this table in a tablespace residing on Exadata storage or use a different compression type.