Oracle Core: Essential Internals for DBAs and Developers book by Jonathan Lewis

In case you didn’t know, Jonathan Lewis’es new Oracle Core: Essential Internals for DBAs and Developers book is out (for a few weeks already).

I was the technical reviewer for that book and I can say it’s awesome! It will likely be the best Oracle internals book out there for the coming 10 years, just like Steve Adams’es Oracle Internal Services book was in the last decade :) 

Jonathan does a very good job explaining complex things in a simple enough way – and the book is not just dry listing of how things work inside Oracle database, but also why they work like they do and what are the benefits, limitations and side effects of the behavior.

So check out Jonathan’s book page for more details (and I think you can still order a heavily discounted ebook from Apress too):

Enjoy :)

What the heck is the SQL Execution ID – SQL_EXEC_ID?

Ok, I think it’s time to write another blog entry. I’ve been traveling and dealing with jetlag from 10-hour time difference, then traveling some more, spoken at conferences, drank beer, had fun, then traveled some more, trained customers, hacked some Exadatas and now I’m back home.

Anyway, do you know what is the SQL_EXEC_ID in V$SESSION and ASH views?

Oh yeah, it’s the “SQL Execution ID” just like the documentation says … all clear. Um … is it? I’d like to know more about it – what does it actually stand for?! Is it session level, instance level or a RAC-global counter? And why does it start from 16 million, not 1?

 

SQL> SELECT sql_exec_id FROM v$session WHERE sid = USERENV('SID');

SQL_EXEC_ID
-----------
   16777216

 

This number 16777216 looks strangely familiar – indeed, it’s 2^24.

When I run the same query again (incrementing the SQL_EXEC_ID counter for the same SQL), I see the counter going up by 1:

 

SQL> SELECT sql_exec_id FROM v$session WHERE sid = USERENV('SID');

SQL_EXEC_ID
-----------
   16777217

SQL> SELECT sql_exec_id FROM v$session WHERE sid = USERENV('SID');

SQL_EXEC_ID
-----------
   16777218

SQL> SELECT sql_exec_id FROM v$session WHERE sid = USERENV('SID');

SQL_EXEC_ID
-----------
   16777219

SQL> SELECT sql_exec_id FROM v$session WHERE sid = USERENV('SID');

SQL_EXEC_ID
-----------
   16777220

 

Further executions of the same query keep incrementing this counter, one by one – even if I run this same SQL from another session in the same instance. So, this SQL_EXEC_ID is not a session-scope value for each SQL_ID, it’s at least instance-wide. It looks like the counting starts from 2^24 (the bit representing 2^24 is set) and ignoring that bit for now, the counting works normally, one by one, starting from zero.

 

Note that changing even a single character in the SQL text (see the extra space in the end before the semi-colon) causes the SQL_ID to change and a different SQL_EXEC_ID counter to be reported (which starts from “zero” again). A separate SQL_EXEC_ID counter is maintained in shared pool for each SQL_ID:

SQL> SELECT sql_exec_id FROM v$session WHERE sid = USERENV('SID') ;

SQL_EXEC_ID
-----------
   16777216

 

So, obviously, when I have just restarted my instance and still see 16777216 as the starting SQL_EXEC_ID for any SQL I execute, it must mean that the full SQL_EXEC_ID value contains something else than just the execution number of this SQL_ID. Whenever I see such familiar values (like powers of 2), then I like to look into the values in hex format to see whether some higher order bits are used for some special purpose. Let’s run a new SQL statement:

 

SQL> SELECT sql_exec_id, TO_CHAR(sql_exec_id,'XXXXXXXX') hex FROM v$session WHERE sid = USERENV('SID') ;

SQL_EXEC_ID HEX
----------- ---------
   16777216   1000000

SQL> SELECT sql_exec_id, TO_CHAR(sql_exec_id,'XXXXXXXX') hex FROM v$session WHERE sid = USERENV('SID') ;

SQL_EXEC_ID HEX
----------- ---------
   16777217   1000001

SQL> SELECT sql_exec_id, TO_CHAR(sql_exec_id,'XXXXXXXX') hex FROM v$session WHERE sid = USERENV('SID') ;

SQL_EXEC_ID HEX
----------- ---------
   16777218   1000002

 

Indeed, it looks like the 25th bit (2^24) is always pre-set to 1, while the least significant 24 bits represent how many times this SQL ID has been executed in an instance (I have tested this with a loop – the 24 least significant bits do get used fully for representing the SQL ID’s execution count in the instance and once it reaches 0xFFFFFF – or 0x1FFFFFF with that pre-set 25th bit, it wraps to 0×1000000 – the 25th bit still remaining set!). So the SQL_EXEC_ID can reliably only track 2^24 – 1 SQL executions in an instance and then the counter wraps to beginning. This is why you should include SQL_EXEC_START (date datatype with 1 sec precision) column in your performance monitoring queries as well, to distinguish between SQL executions with a colliding SQL_EXEC_ID. As long as you’re executing your SQL statement less than 16.7 million times per second per instance, this should be fine :-)

 

Anyway, so what’s the magic 25th bit then? Well, in RAC it would be very hard to somehow coordinate the incrementing of a single counter globally (that’s why you want to keep your sequences cached in RAC), I figure that there are different counters for the same SQL ID in different RAC instances. Let’s check – I will log in to another RAC node (node 2) and run this:

 

SQL> SELECT sql_exec_id, TO_CHAR(sql_exec_id,'XXXXXXXX') hex FROM v$session WHERE sid = USERENV('SID') ;

SQL_EXEC_ID HEX
----------- ---------
   33554433   2000001

SQL> SELECT sql_exec_id, TO_CHAR(sql_exec_id,'XXXXXXXX') hex FROM v$session WHERE sid = USERENV('SID') ;

SQL_EXEC_ID HEX
----------- ---------
   33554434   2000002

SQL> SELECT sql_exec_id, TO_CHAR(sql_exec_id,'XXXXXXXX') hex FROM v$session WHERE sid = USERENV('SID') ;

SQL_EXEC_ID HEX
----------- ---------
   33554435   2000003

 

Whoa – the SQL Execution ID in the 2nd instance starts from 33 Million! And when you convert the value to hex, you’ll see that now the 26th bit is set – showing that this SQL was executed in instance #2!

So, it very much looks like that while the 24 least significant bits are used for the SQL execution ID counter, the more significant bits are used for showing which instance_id ran that SQL. Assuming that 32 bits are used for the whole SQL_EXEC_ID value, then up to 8 higher order bits could be used for storing the instance_id – supporting up to 256-node RAC clusters. This is very useful when analyzing past ASH data as you can aggregate data (count min/max exec ID difference to get the execution counts in a time range) either in each separate instance or globally – by stripping out the instance_id part from the value.

I haven’t tested the instance_id part with 256-node RAC clusters (as Santa Claus is cutting back due to poor economy), but at least on an 8-node full rack Exadata all 8 instance_ids were reported properly. Note that for serial queries, the SQL_EXEC_ID shows you the instance_id of the instance where the session is logged on to, but for inter-instance parallel query, you will see the instance_id of the query coordinator for all PX slaves, regardless of in which instances they run. Here’s a little script from a 8-node Exadata cluster to show it. I’ll leave it up to you to fully figure it out what, how and why it’s doing, but basically what it shows is that the SQL_EXEC_ID consists of the query coordinator’s instance_id value and the execution number for a SQL_ID in the instance where the query coordinator session was logged in:

 

SQL> SELECT qc_instance_id, MIN(TO_CHAR(sql_exec_id,'XXXXXXXX'))
  2  , MAX(TO_CHAR(sql_exec_id,'XXXXXXXX'))
  3* FROM gv$active_session_history GROUP BY qc_instance_id order by 1
SQL> /

QC_INSTANCE_ID MIN(TO_CH MAX(TO_CH
-------------- --------- ---------
             1   1000000   100540F
             2   2000000   20009BF
             3   3000000   300541E
             4   4000000   40000DD
             5   5000000   50C5035
             6   6000000   600018C
             7   7000000   700023D
             8   8000000   8000755
                 1000000   803DF3B

9 rows selected.

 

That’s all for today – more cool stuff is coming, I promise :-)

 

And oh, next week I’ll start another run of my Advanced Oracle Troubleshooting seminar, so check it out! ;-)

V8 Bundled Exec call – and Oracle Program Interface (OPI) calls

So, what he hell is that V8 Bundled Exec call which shows up in various Oracle 11g monitoring reports?!

It’s yet another piece of instrumentation which can be useful for diagnosing non-trivial performance problems. Oracle ASH has allowed us to measure what is the top wait event or top SQLID for a long time, but now it’s also possible to take a step back and see what type of operation the database session is servicing. 

I am talking about Oracle Program Interface (OPI) calls. Basically for each OCI call in the client side (like , OCIStmtExecute, OCIStmtFetch, etc) there’s a corresponding server side OPI function (like opiexe(), opifch2() etc). 

It has been possible to trace all the OPI calls with event 10051 as I’ve explained here, but since Oracle 11g this data is also conveniently accessible from ASH views (the various monitoring reports, including SQL Monitoring report also use ASH data for some features).

So, I can write a simple query against ASH, which doesn’t group the samples by wait event or SQLID, but just by the general OPI call type (TOPLEVELCALLNAME column) and also by the SQL command type (using V$SQLCOMMAND in 11.2):  

SQL> SELECT
  2      a.top_level_call#
  3    , a.top_level_call_name
  4    , a.top_level_sql_opcode
  5    , s.command_name
  6    , COUNT(*)
  7  FROM
  8      v$active_session_history a
  9    , v$sqlcommand s
 10  WHERE
 11      a.top_level_sql_opcode = s.command_type
 12  GROUP BY
 13      a.top_level_call#
 14    , a.top_level_call_name
 15    , a.top_level_sql_opcode
 16    , s.command_name
 17  ORDER BY
 18*     COUNT(*) DESC
 19  /

TOP_LEVEL_CALL# TOP_LEVEL_CALL_NAME     top_op# COMMAND_NAME                     COUNT(*)
--------------- -------------------- ---------- ------------------------------ ----------
             94 V8 Bundled Exec               7 DELETE                              10505
              0                               0                                      4041
             59 VERSION2                      0                                       579
             59 VERSION2                     47 PL/SQL EXECUTE                        377
             59 VERSION2                      3 SELECT                                191
             96 LOB/FILE operations         170 CALL METHOD                            67
             59 VERSION2                    170 CALL METHOD                            66
             94 V8 Bundled Exec               6 UPDATE                                 52
             59 VERSION2                      6 UPDATE                                 41
             59 VERSION2                      7 DELETE                                 36
             94 V8 Bundled Exec               3 SELECT                                 24
             96 LOB/FILE operations          47 PL/SQL EXECUTE                         18
             59 VERSION2                      2 INSERT                                  8
             94 V8 Bundled Exec               1 CREATE TABLE                            5
              0                               3 SELECT                                  3
             59 VERSION2                     15 ALTER TABLE                             1
             96 LOB/FILE operations           0                                         1
             59 VERSION2                     12 DROP TABLE                              1
              5 FETCH                         3 SELECT                                  1
             94 V8 Bundled Exec              12 DROP TABLE                              1

20 rows selected.

 

Aas you see above, most of the ASH samples in my test database have been created by a DELETE type SQL statement, executed via V8 Bundled Exec type of an OPI call.

So, what is this call about? Let’s explain its name. Look into other call types in the above output. In the bottom you see a FETCH call (fetching from a SELECT type statement). Also there are a few LOB/FILE operations calls, which are used exclusively for accessing LOB data (via LOB locator, bypassing the usual SQL processing layer).

In Oracle 7 you would also see PARSE and EXECUTE calls, but starting from Oracle 8 not anymore. This is because starting from Oracle 8, the OPI layer in database side can accept bundled OCI calls from the client – to reduce the number of network roundtrips. So, basically instead of sending the PARSE and EXEC requests in separate SQLNet roundtrips (increasing the latency), the OCI client libraries can bundle these requests together and send them to database in one SQLNet payload. The database server side understands it and is able to extract these separate OPI requests from the bundled packet (in right order) and execute the corresponding OPI function for each separate call. 

Note that this is why you frequently see the kpoal8() function close to the beginning in Oracle server process stack traces (and where I usually start reading them from), this is the function which processes all the OPI requests sent to it in a bundled package. So, whenever there’s a OCIStmtExecute() call extracted from the bundle, the opiexe() function is called in Oracle kernel with appropriate arguments extracted from the same bundle. Whenever we extract an OCIStmtFetch2() call from the bundle, the corresponding opifch2() function is called in the kernel.

Hopefully this explains why is there such a call “V8 Bundled Exec” in Oracle. It just allows to reduce client – server communication latency by allowing to bundle multiple database requests together into a single SQL*Net payload. In other words, it’s just how Oracle works and it’s perfectly normal to see V8 Bundled Exec as the top OPI call type in performnace reports. If you see this OPI call as the top one, then you’d need to drill down into what’s the actual SQLID which consumes the most of the response time (and further breakdown like which wait event and execution plan step takes the most time).

But the ability of breaking down database response time by OPI call becomes much more useful when troubleshooting somewhat more exotic performance problems like LOB access times (where there’s no SQL statement associated with the database call) or other direct OPI calls which are executed without parsing and running a SQL cursor.

For example, have you noticed that the behavior of ROLLBACK command in sqlplus is different from the shorter ROLL command?

When you issue a ROLLBACK, then Oracle will actually send the string “ROLLBACK” to the database as a regular SQL statement (using V8 Bundled Exec), it will be parsed there (with all the latching and shared pool overhead) as regular SQL – and then Oracle realizes that the command in it is a rollback. Then rollback is performed.

But if you issue a ROLL command, then sqlplus understands it and doesn’t send it to the database for parsing like a regular SQL statement. Instead it will send an OCITransRollback() call, which will call the corresponding OPI function directly, bypassing the SQL processing layer completely. Instead of the usual “V8 Bundled Exec” bundle call you would see a “Transaction Commit/Rollback” OPI call as it was called directly, without any SQL statement processing involved. This is why you sometimes see WAIT#0 lines in SQLTrace, where the waits seem to be associated with some non-existent cursor #0. Whenever the wait happens when the database session is servicing an OPI call which bypasses the SQL processing layer (kks/kkx modules) then the SQLTrace just shows cursor#0 as the wait’s “owner”.

The same happens when using things like connection.commit() in JDBC, the client does not send a SQL statement with text “commit” into the datbase, but rather will call out the OPI commit function out directly.

So, how many different OPI calls are there? Well, a lot, as you can see from v$toplevelcall (or its underlying x$orafn) in Oracle 11.2:

 

SQL> SELECT * FROM v$toplevelcall;

TOP_LEVEL_CALL# TOP_LEVEL_CALL_NAME
--------------- ----------------------------------------------------------------
              0
              2 OPEN
              3 PARSE
              4 EXECUTE
              5 FETCH
              8 CLOSE
              9 LOGOFF
             10 DESCRIBE
             11 DEFINE
             12 COMMIT ON
             13 COMMIT OFF
             14 COMMIT
             15 ROLLBACK
             16 SET OPTIONS
             17 RESUME
             18 VERSION
             20 CANCEL
             21 GET ERR MSG
             23 SPECIAL
             24 ABORT
             25 DEQ ROW
             26 FETCH LONG
             31 HOW MANY
             32 INIT
             33 CHANGE USER
             34 BIND REF POS
             35 GET BIND VAR
             36 GET INTO VAR
             37 BINDV REF
             38 BINDN REF
             39 PARSE EXE
             40 PARSE SYNTAX
             41 PARSE SYNSDI
             42 CONTINUE
             43 ARRAY DESC
             44 INIT PARS
             45 FIN PARS
             46 PUT PAR
             48 START ORACLE
             49 STOP ORACLE
             50 RUN IND PROC
             52 ARCHIVE OP
             53 MED REC STRT
             54 MED REC TABS
             55 MED REC GETS
             56 MED REC RECL
             57 MED REC CANC
             58 LOGON
             59 VERSION2
             60 INIT
             62 EVERYTHING
             65 DIRECT LOAD
             66 UL BUFFER XMIT
             67 DISTRIB XACTION
             68 DESCRIBE INDEXES
             69 SESSION OPS
             70 EXEC w/SCN
             71 FAST UPI
             72 FETCH LONG
             74 V7 PARSE
             76 PL/SQL RPC
             78 EXEC & FCH
             79 XA OPS
             80 KGL OP
             81 LOGON
             82 LOGON
             83 Streaming op
             84 SES OPS (71)
             85 XA OPS (71)
             86 DEBUG
             87 DEBUGS
             88 XA XA Start
             89 XA XA Commit
             90 XA XA Prepare
             91 x/import
             92 KOD OP
             93 RPI Callback with ctxdef
             94 V8 Bundled Exec
             95 Streaming op
             96 LOB/FILE operations
             97 FILE Create
             98 V8 Describe Query
             99 Connect
            100 OPEN Recursive
            101 Bundled KPR
            102 Bundled PL/SQL
            103 Transaction Start/End
            104 Transaction Commit/Rollback
            105 Cursor close all
            106 Failover session info
            107 SES OPS (80)
            108 Do Dummy Defines
            109 INIT V8 PARS
            110 FIN V8 PARS
            111 PUT V8 PAR
            112 TERM V8 PARS
            114 INIT UNTR CB
            115 OAUTH
            116 Failover get info
            117 Commit Remote Sites
            118 OSESSKEY
            119 V8 Describe Any
            120 Cancel All
            121 Enqueue
            122 Dequeue pre 8.1
            123 Object Transfer
            124 RFS op
            125 Notification
            126 Listen
            127 Commit Remote Sites >= V813
            128 DirPathPrepare
            129 DirPathLoadStream
            130 DirPathMiscOps
            131 MEMORY STATS
            132 AQ Prop Status
            134 remote Fetch Archive Log (FAL)
            135 Client ID propagation
            136 DR Server CNX Process
            138 SPFILE parameter put
            139 KPFC exchange
            140 V82 Object Transfer
            141 Push transaction
            142 Pop transaction
            143 KFN Operation
            144 DirPathUnloadStream
            145 AQ batch enqueue/dequeue
            146 File transfer
            147 PING
            148 TSM
            150 Begin TSM
            151 End TSM
            152 Set schema
            153 Fetch from suspended result-set
            154 Key value pair
            155 XS Create Session Op
            156 XS Session RoundtripOp
            157 XS Piggyback Oper.
            158 KSRPC Execution
            159 Streams combined capture/apply
            160 AQ replay information
            161 SSCR
            162 OSESSGET
            163 OSESSRLS
            165 workload replay data
            166 replay statistic data
            167 Query Cache Stats
            168 Query Cache IDs
            169 RPC Test Stream
            170 replay plsql rpc
            171 XStream Out
            172 Golden Gate RPC

151 rows selected.

 

A lot of calls … special stuff (like DESCRIBE) which will bypass the SQL layer completely (but may in turn invoke further recursive SQL statements through the Recursive Program Interface – RPI).

Ok, time to stop – if you want to learn more, enable SQL trace with waits & binds and event 10051 at level 1 in your test database and try to describe a table or read some LOB columns for example!

 

Full scans, direct path reads and ORA-8103 error hacking session video here (plus iTunes podcast address!)

I have uploaded the latest hacking session video to blip.tv. I have edited it a little, I cut out the part where I spilled an entire Red Bull onto my desk, with some onto my laptop (some keys are still sticky:)

Also, I do upload all these sessins into iTunes – so you can subscribe to my podcast! That way you can download the videos into your computer, phone or iPad. I have deliberately used 1024×768 resolution so it would look awesome on iPad screen! (so hopefully your commute time gets a bit more fun now ;-)

 

Enjoy!

iTunes video-podcast:

 

 

What is the purpose of segment level checkpoint before DROP/TRUNCATE of a table?

There was a very good question asked in Oracle-L list today, which has bothered me too in past.
The question was:
What is the purpose of a segment level checkpoint before DROP/TRUNCATE of a table?
 
In other words, why do we have to wait for the enq: RO – fast object reuse wait event (and in 11.2 the enq: CR – block range reuse ckpt wait) when dropping & truncating segments?
 
I’m not fully confident that I know all the real reasons behind this, but it could be related to the need to get rid of segment’s dirty buffers in buffer cache, before dropping the object.
 
Imagine this:
 
  • You have a large buffer cache and you drop table A without checkpointing the dirty buffers.  
  • Immediately after the drop succeeds (some buffers are still dirty in cache) some other segment (table B) reuses that space for itself and writes stuff into it.
  • A few seconds later, DBWR wakes up to find & write some dirty buffers to disk (anything it finds from its lists). As there are some old & dirty blocks of table A still in the cache, they get written to disk too, overwriting some of the new table B blocks!

  

So, this is one reason why you should checkpoint the blocks to disk before dropping (or truncating) a segment. Of course you might ask that why doesn’t DBWR just check whether the dirty buffer is part of an existing object or a dropped one when it walks through its dirty list? It could just discard the dirty buffers of dropped objects it finds. It would be doable – but I also think it would get quite complex. DBWR is a low level background proces, understanding the cache layer and dealing with physical datablocks in a file# = X block offset = Y. It doesn’t really know anything about the segments/objects which use these blocks. If it should start checking for logical existence of an object, it would have to start running code to access (a much higher level concept) data dictionary cache – and possibly query data dictionary tables via recursive calls, etc, so making it much more complicated.
 
So, this logic may just be matter of implementation, it’d be too complex to implement such selective discarding of dirty buffers, based on a higher-level concept of existence of a segment or object. Dropping and truncating tables so frequently, that these waits become a serious problem (consuming significant % of response time) indicate a design problem anyway. For example, former SQL server developers creating multiple temporary tables in Oracle – for breaking a complex query down into smaller parts, just like they had been doing it in SQL Server.
 
Anyway, here’s what I think about this – I’d love to hear other opinions, if you think otherwise!

Tech Reviewer, Tech Reviewer! ;-)

I just noticed that Jonathan Lewis has announced that he’s writing a new Oracle (fundamental) internals book, due to be out in November.

So, I’m happy to add to Jonathan’s announcement, that I’m the tech reviewer of that book!

After all the hard work on the Exadata book, I didn’t want to hear about working on any book again (even if it’s just tech reviewing work), but as this is Jonathan’s book, about exactly these topics I love and focus on, I had no choice but to make an exception and become a reviewer ;-)

I’ve already reviewed a couple of chapters and this book is going to be awesome!

 

Oracle Troubleshooting TV Show: Season 1, Episode 01 ;-)

Ok, it’s official – the first and only Oracle Troubleshooting TV show is live now!

The first show is almost 2 hours about the ORA-4031 errors and shared pool hacking. It’s a recording of the US/EMEA timezone online hacking session I did some days ago.

There are a couple of things to note:

  1. The text still isn’t as sharp as in the original recording, but it’s much better than in my previous upload attempts and is decently readable. I’ll try some more variations with my next shows so I hope the text quality will get better! Or maybe I should just switch to GUI tools or powerpoint slides? ;-)
  2. You probably should view this video in full screen (otherwise the text will be tiny and unreadable)
  3. There’s advertising in the beginning (and maybe end) of this show! I’ll see how much money I’ll make out of this – maybe these shows start contributing towards the awesome beer selection I’ll have in my fridge some day (right now I have none). Viewing a 30-sec advert is small price to pay for 2 hours of kick-ass shared pool hacking content !!!
  4. You can download the scripts and tools used in the demos from http://tech.e2sn.com/oracle-scripts-and-tools/
  5. Make sure you check out my online Oracle troubleshooting seminars too (this April and May already)

View the embedded video below or go to my official Oracle Troubleshooting TV show channel:

http://tanelpoder.blip.tv

Enjoy!

LOBREAD SQL Trace entry in Oracle 11.2 (and tracing OPI calls with event 10051)

A few days ago I looked into a SQL Tracefile of some LOB access code and saw a LOBREAD entry there. This is a really welcome improvement (or should I say, bugfix of a lacking feature) for understanding resource consumption by LOB access OPI calls. Check the bottom of the output below:

*** 2011-03-17 14:34:37.242
WAIT #47112801352808: nam='SQL*Net message from client' ela= 189021 driver id=1413697536 #bytes=1 p3=0 obj#=99584 tim=1300390477242725
WAIT #0: nam='gc cr multi block request' ela= 309 file#=10 block#=20447903 class#=1 obj#=99585 tim=1300390477243368
WAIT #0: nam='cell multiblock physical read' ela= 283 cellhash#=379339958 diskhash#=787888372 bytes=32768 obj#=99585 tim=1300390477243790
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390477243865
[...snipped...]
WAIT #0: nam='SQL*Net more data to client' ela= 2 driver id=1413697536 #bytes=2048 p3=0 obj#=99585 tim=1300390477244205
WAIT #0: nam='SQL*Net more data to client' ela= 4 driver id=1413697536 #bytes=2048 p3=0 obj#=99585 tim=1300390477244221
WAIT #0: nam='gc cr multi block request' ela= 232 file#=10 block#=20447911 class#=1 obj#=99585 tim=1300390477244560
WAIT #0: nam='cell multiblock physical read' ela= 882 cellhash#=379339958 diskhash#=787888372 bytes=32768 obj#=99585 tim=1300390477245579
WAIT #0: nam='SQL*Net more data to client' ela= 16 driver id=1413697536 #bytes=2020 p3=0 obj#=99585 tim=1300390477245685
WAIT #0: nam='SQL*Net more data to client' ela= 6 driver id=1413697536 #bytes=2048 p3=0 obj#=99585 tim=1300390477245706
WAIT #0: nam='SQL*Net more data to client' ela= 5 driver id=1413697536 #bytes=1792 p3=0 obj#=99585 tim=1300390477245720
LOBREAD: c=1000,e=2915,p=8,cr=5,cu=0,tim=1300390477245735

In past versions of Oracle the CPU (c=) usage figures and other stats like number of physical/logical reads of the LOB chunk read OPI call were just lost – they were never reported in the tracefile. In past only the most common OPI calls, like PARSE, EXEC, BIND, FETCH (and recently CLOSE cursor) were instrumented with SQL Tracing. But since 11.2(.0.2?) the LOBREAD’s are printed out too. This is good, as it reduces the amount of guesswork needed to figure out what are those WAITs for cursor #0 – which is really a pseudocursor.

Why cursor#0? It’s because normally, with PARSE/EXEC/BIND/FETCH, you always had to specify a cursor slot number you operated on (if you fetch from cursor #5, it means that Oracle process went to slot #5 in the open cursor array in your session’s UGA and followed the pointers to shared cursor’s executable parts in library cache from there). But LOB interface works differently – if you select a LOB column using your query (cursor), then all your application gets is a LOB LOCATOR (sort of a pointer with LOB item ID and consistent read/version SCN). Then it’s your application which must issue another OPI call (LOBREAD) to read the chunks of that LOB out from the database. And the LOB locator is independent from any cursors, it doesn’t follow the same cursor API as regular SQL statements (as it requires way different functionality compared to a regular select or update statement).

So, whenever a wait happened in your session due to an access using a LOB locator, then there’s no specific cursor responsible for it (as far as Oracle sees internally) and that’s why a fake, pseudocursor #0 is used.

Note that on versions earlier than 11.2(.0.2?) when the LOBREAD wasn’t printed out to trace – you can use OPI call tracing (OPI stands for Oracle Program Interface and is the server-side counterpart to OCI API in the client side) using event 10051. First enable SQL Trace and then the event 10051 (or the other way around if you like):

SQL> @oerr 10051

ORA-10051: trace OPI calls

SQL> alter session set events '10051 trace name context forever, level 1';

Session altered.

Now run some LOB access code and check the tracefile:

*** 2011-03-17 14:37:07.178
WAIT #47112806168696: nam='SQL*Net message from client' ela= 6491763 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627178602
OPI CALL: type=105 argc= 2 cursor=  0 name=Cursor close all
CLOSE #47112806168696:c=0,e=45,dep=0,type=1,tim=1300390627178731
OPI CALL: type=94 argc=28 cursor=  0 name=V8 Bundled Exec
=====================
PARSING IN CURSOR #47112802701552 len=19 dep=0 uid=93 oct=3 lid=93 tim=1300390627179807 hv=1918872834 ad='271cc1480' sqlid='3wg0udjt5zb82'
select * from t_lob
END OF STMT
PARSE #47112802701552:c=1000,e=1027,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=3547887701,tim=1300390627179805
EXEC #47112802701552:c=0,e=29,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3547887701,tim=1300390627179884
WAIT #47112802701552: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627179939
WAIT #47112802701552: nam='SQL*Net message from client' ela= 238812 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627418785
OPI CALL: type= 5 argc= 2 cursor= 26 name=FETCH
WAIT #47112802701552: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627418945
FETCH #47112802701552:c=0,e=93,p=0,cr=5,cu=0,mis=0,r=1,dep=0,og=1,plh=3547887701,tim=1300390627418963
WAIT #47112802701552: nam='SQL*Net message from client' ela= 257633 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627676629
OPI CALL: type=96 argc=21 cursor=  0 name=LOB/FILE operations
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627676788
[...snip...]
WAIT #0: nam='SQL*Net more data to client' ela= 2 driver id=1413697536 #bytes=1792 p3=0 obj#=99585 tim=1300390627677054
LOBREAD: c=0,e=321,p=0,cr=5,cu=0,tim=1300390627677064

Check the bold and especially the red string above.  Tracing OPI calls gives you some extra details of what kind of tasks are executed in the session. The “LOB/FILE operations” call indicates that whatever lines come after it (unlike SQL trace call lines where all the activity happens before a call line is printed (with some exceptions of course)) are done for this OPI call (until a next OPI call is printed out). OPI call tracing should work even on ancient database versions…

By the way, if you are wondering, what’s the cursor number 47112801352808 in the “WAIT #47112801352808″ above? Shouldn’t the cursor numbers be small numbers?

Well, in 11.2.0.2 this was also changed. Before that, the X in CURSOR #X (and PARSE #X, BIND #X, EXEC #X, FETCH #X) represented the slot number in your open cursor array (controlled by open_cursors) in your session’s UGA. Now, the tracefile dumps out the actual address of that cursor. 47112801352808 in HEX is 2AD94DC9FC68 and it happens to reside in the UGA of my session.

Naturally I asked Cary Millsap about whether he had spotted this LOBREAD already and yes, Cary’s way ahead of me – he said that Method-R’s mrskew tool v2.0, which will be out soon, will support it too.

It’s hard to not end up talking about Cary’s work when talking about performance profiling and especially Oracle SQL trace, so here are a few very useful bits which you should know about:

If you want to understand the SQL trace & profiling stuff more, then the absolute must document is Cary’s paper on the subject – Mastering Performance with Extended SQL Trace:

Also, if you like to optimize your work like me (in other words: you’re proactively lazy ;-) and you want to avoid some boring “where-the-heck-is-this-tracefile-now” and “scp-copy-it-over-to-my-pc-for-analysis” work then check out Cary’s MrTrace plugin (costs ~50 bucks and has a 30-day trial) for SQL Developer. I’ve ended up using it myself regularly although I still tend to avoid GUIs:

Implicit datatype conversion in the parsing phase – something new I learned today!

Wow, I wasn’t aware that Oracle can also do an implicit datatype conversion for literal strings during parsing phase!

SQL> @desc t
           Name                            Null?    Type
           ------------------------------- -------- ----------------------------
    1      A                                        NUMBER(38)

SQL> select * from t where a = '1' || RPAD('0',5,'0');

no rows selected

SQL> @x
Display execution plan for last statement for this session from library cache...

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
SQL_ID  d7r6md8wfu74d, child number 0
-------------------------------------
select * from t where a = '1' || RPAD('0',5,'0')

Plan hash value: 1601196873

--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |     2 (100)|
|*  1 |  TABLE ACCESS FULL| T    |      1 |     2   (0)|
--------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"=100000)

You see what happened? The expression ’1′ || RPAD(’0′,5,’0′) has been evaluated, which returns a string. And this string ’100000′ has been converted to a NUMBER 100000 during parsing phase .. otherwise you would see quotes around the number above with a TO_NUMBER() function around it (so that Oracle could compare the NUMBER column “A” to the same datatype)…

I add a TO_CHAR() around the column A just for demoing that a varchar datatype (as the original “literal” in my query is) will be shown with quotes like every normal string:

SQL> select * from t where to_char(a) = '1'||rpad('0',5,'0');

no rows selected

SQL> @x
Display execution plan for last statement for this session from library cache...

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------
SQL_ID  7yf6j8fdyrvk7, child number 0
-------------------------------------
select * from t where to_char(a) = '1'||rpad('0',5,'0')

Plan hash value: 1601196873

--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |     2 (100)|
|*  1 |  TABLE ACCESS FULL| T    |      1 |     2   (0)|
--------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_CHAR("A")='100000')

Let’s see whether this trick is somehow done also for bind variables:

SQL> var x varchar2(10)
SQL> exec :x:= '1' || RPAD('0',5,'0');

PL/SQL procedure successfully completed.

SQL> print x

X
--------------------------------
100000

SQL> select * from t where a = :x;

no rows selected

SQL> @x
Display execution plan for last statement for this session from library cache...

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
SQL_ID  45f39y7580bdp, child number 2
-------------------------------------
select * from t where a = :x

Plan hash value: 1601196873

--------------------------------------------------------
| Id  | Operation         | Name | E-Rows | Cost (%CPU)|
--------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |     2 (100)|
|*  1 |  TABLE ACCESS FULL| T    |      1 |     2   (0)|
--------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - (VARCHAR2(30), CSID=873): '100000'

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"=TO_NUMBER(:X))

Apparently not! And this kind of makes sense – as if this string to number conversion is done during parse phase – Oracle doesn’t know what the actual value is yet (in the bind variable memory) so it can’t convert it to number in advance either :-)

This is a little interesting detail… I didn’t know that in addition to the implicit datatype conversion during query execution (using TO_CHAR, TO_NUMBER functions etc) Oracle can sometimes convert a string literal to number datatype under the hood during the parse time!

P.S. I tested this on Oracle 11.2.0.2 with optimizer_features_enable set from 11.2.0.2 to all the way back to 8.0.0 and the behavior was the same. I didn’t find any mention of this conversion in the CBO tracefile although after a filter pushdown transformation (FPD) the string literal was already shown as a number datatype. If anyone still has access to ancient Oracle database versions (like 9.2 and 10.1 ;-) then let me know whether you see the same results!

ORA-4031 errors, contention, cursor management issues and shared pool fragmentation – free secret seminar!

Free stuff! Free stuff! Free stuff! :-)

The awesome dudes at E2SN have done it again! (and yes, Tom, this time the “we at E2SN Ltd” doesn’t mean only me alone ;-)

On Tuesday 22nd March I’ll hold two (yes two) Secret Oracle Hacking Sessions – about ORA-04031: unable to allocate x bytes of shared memory errors, cursor management issues and other shared pool related problems (like fragmentation). This event is free for all! You’ll just need to be fast enough to register, both events have 100 attendee limit (due to my GotoWebinar accont limitations).

I am going to run this online event twice, so total 200 people can attend (don’t register for both events, please). One event is in the morning (my time) to cater for APAC/EMEA region and the other session is for EMEA/US/Americas audience.

The content will be the same in both sessions. There will be no slides (you cant fix your shared pool problems with slides!) but there will be demos, scripts, live examples and fun (for the geeks among us anyway – others go and read some slides instead ;-)!