Profiling trace files with preprocessor external tables in 11g and some parallel execution hacking

If you work with SQL Trace files (and profile them) then you should check out the awesome novel use of the “external table preprocessor” feature explained by Adrian Billington here:

Ironically just a day after writing my “Evil things” article, I noticed a note in MOS about how to enable an event 10384 at level 16384 to get a parallel plan to be executed in serial:

  • How to force that a Parallel Query runs in serial with the Parallel Execution Plan [ID 1114405.1]

This way you can still see the A-rows and other feedback like buffer gets and PIOs by row source even if the plan is a parallel plan. The problem is that even with GATHER_PLAN_STATISTICS enabled (or the equivalent parameter(s)) the PX slaves don’t pass their actual time, rows and buffer gets/PIOs stats back to the QC to be displayed in V$SQL_PLAN_STATISTICS / DBMS_XPLAN output. With parallel slaves, all you’d see would be the QC-generated numbers and not the PX slave stuff.

So if you set that magic event (at your own risk) then even the parallel plans would be executed by the QC only (basically a parallel plan executed entirely ins serial) and you’ll still see all the A-rows and buffer gets/physical read numbers as with serial plans. But remember my yesterday’s article ;-)

Alternative options for getting such runtime stats for a parallel query would be:

  1. Use the Real-Time SQL Monitoring feature (only on 11g and with diagnostics+tuning pack licenses)
  2. Enable SQL Trace for the QC, run your parallel query (you’ll have to wait until it finishes or cancel the query with CTRL+C) and consolidate the STAT# lines from all the PX slave sessions – each PX slave dumps the STAT# lines with row-source level stats like regular serial queries
Update: Coskan commented over twitter that why not use ALLSTATS ALL in DBMS_XPLAN.DISPLAY_CURSOR instead of ALLSTATS LAST – and indeed, in this case you would see the A-rows, buffer gets and other stats of both QC and the PX slaves aggregated together. Although I had used ALLSTATS ALL to compare average execution stats to the LAST stats, I never thought of using it for aggregating the PX slave stats together with QC like that.
 
Basically what happens (in 10g+) with PX is that the QC and all PX slaves all execute the same SQL_ID (but sometimes a different child version though, especially when you’re running cross-instance PX, in which case SQL Monitoring is a better option). And ALLSTATS ALL would aggregate the stats of all executions of the particular SQL_ID (and child number). So if you run something with 4 PX slaves with GATHER_PLAN_STATISTICS enabled, they all add their execution stats to the “ALL” columns. It’s just that the QC is always the last one to finish executing the query (that’s when the exec stats get updated in the V$SQL_PLAN_STATISTICS views), that’s why the ALLSTATS LAST always shows the limited QC stats and not the PX slave ones. But Allstats ALL solves that problem.
 
However the catch is that ALLSTATS ALL would accumulate all executions of this query (including your previous runs of it), so if you want to see fresh stats of only your latest query execution, you should adjust the SQL_ID so that a new cursor would be created. And you can adjust the SQL_ID by simply adding a comment like /* test 2 */ somewhere into your query text…
Thanks Coskan! :)

Evil things are happening in Oracle

Relax, I’m talking about the Oracle Database kernel here, not the corporation ;-)

Here’s a couple of more reasons why not to play around with undocumented debug events unless you’re really sure why and how would they help to solve your specific problem (and you’ve gotten a blessing in some form from Oracle support too):

$ oerr ora 10665
 10665, 00000, "Inject Evil Literals"
 // *Cause:  Event 10665 is set to some number > 0, causing 1/(value-1) of all
 //          literals to be replaced by 2000 letter 'A's.  A value of 1 does
 //          not corrupt anything.
 // *Action: never set this event

$ oerr ora 10668

10668, 00000, "Inject Evil Identifiers"
 // *Cause:  event 10668 is set to some number > 0, causing 1/(value-1) of all
 //          identifiers to be replaced by a maximum amount of x's.  It is
 //          common for an identifier to be parsed once with a max of 30 bytes,
 //          then reparsed later with a max of 4000, so it may not be possible
 //          to inject such an identifier without the aid of this event.  A
 //          value of 1 causes no identifiers to be corrupted.
 // *Action: never set this event

 

Some events are meant to be left alone. You don’t want to wake up the evil sleeping deep in the core of the Oracle Kernel-land!

 

By the way, Karl Arao once managed to capture what this evil creature looks like:

 

http://t.co/GVbxwfnJ

:)

V8 Bundled Exec call – and Oracle Program Interface (OPI) calls

So, what he hell is that V8 Bundled Exec call which shows up in various Oracle 11g monitoring reports?!

It’s yet another piece of instrumentation which can be useful for diagnosing non-trivial performance problems. Oracle ASH has allowed us to measure what is the top wait event or top SQLID for a long time, but now it’s also possible to take a step back and see what type of operation the database session is servicing. 

I am talking about Oracle Program Interface (OPI) calls. Basically for each OCI call in the client side (like , OCIStmtExecute, OCIStmtFetch, etc) there’s a corresponding server side OPI function (like opiexe(), opifch2() etc). 

It has been possible to trace all the OPI calls with event 10051 as I’ve explained here, but since Oracle 11g this data is also conveniently accessible from ASH views (the various monitoring reports, including SQL Monitoring report also use ASH data for some features).

So, I can write a simple query against ASH, which doesn’t group the samples by wait event or SQLID, but just by the general OPI call type (TOPLEVELCALLNAME column) and also by the SQL command type (using V$SQLCOMMAND in 11.2):  

SQL> SELECT
  2      a.top_level_call#
  3    , a.top_level_call_name
  4    , a.top_level_sql_opcode
  5    , s.command_name
  6    , COUNT(*)
  7  FROM
  8      v$active_session_history a
  9    , v$sqlcommand s
 10  WHERE
 11      a.top_level_sql_opcode = s.command_type
 12  GROUP BY
 13      a.top_level_call#
 14    , a.top_level_call_name
 15    , a.top_level_sql_opcode
 16    , s.command_name
 17  ORDER BY
 18*     COUNT(*) DESC
 19  /

TOP_LEVEL_CALL# TOP_LEVEL_CALL_NAME     top_op# COMMAND_NAME                     COUNT(*)
--------------- -------------------- ---------- ------------------------------ ----------
             94 V8 Bundled Exec               7 DELETE                              10505
              0                               0                                      4041
             59 VERSION2                      0                                       579
             59 VERSION2                     47 PL/SQL EXECUTE                        377
             59 VERSION2                      3 SELECT                                191
             96 LOB/FILE operations         170 CALL METHOD                            67
             59 VERSION2                    170 CALL METHOD                            66
             94 V8 Bundled Exec               6 UPDATE                                 52
             59 VERSION2                      6 UPDATE                                 41
             59 VERSION2                      7 DELETE                                 36
             94 V8 Bundled Exec               3 SELECT                                 24
             96 LOB/FILE operations          47 PL/SQL EXECUTE                         18
             59 VERSION2                      2 INSERT                                  8
             94 V8 Bundled Exec               1 CREATE TABLE                            5
              0                               3 SELECT                                  3
             59 VERSION2                     15 ALTER TABLE                             1
             96 LOB/FILE operations           0                                         1
             59 VERSION2                     12 DROP TABLE                              1
              5 FETCH                         3 SELECT                                  1
             94 V8 Bundled Exec              12 DROP TABLE                              1

20 rows selected.

 

Aas you see above, most of the ASH samples in my test database have been created by a DELETE type SQL statement, executed via V8 Bundled Exec type of an OPI call.

So, what is this call about? Let’s explain its name. Look into other call types in the above output. In the bottom you see a FETCH call (fetching from a SELECT type statement). Also there are a few LOB/FILE operations calls, which are used exclusively for accessing LOB data (via LOB locator, bypassing the usual SQL processing layer).

In Oracle 7 you would also see PARSE and EXECUTE calls, but starting from Oracle 8 not anymore. This is because starting from Oracle 8, the OPI layer in database side can accept bundled OCI calls from the client – to reduce the number of network roundtrips. So, basically instead of sending the PARSE and EXEC requests in separate SQLNet roundtrips (increasing the latency), the OCI client libraries can bundle these requests together and send them to database in one SQLNet payload. The database server side understands it and is able to extract these separate OPI requests from the bundled packet (in right order) and execute the corresponding OPI function for each separate call. 

Note that this is why you frequently see the kpoal8() function close to the beginning in Oracle server process stack traces (and where I usually start reading them from), this is the function which processes all the OPI requests sent to it in a bundled package. So, whenever there’s a OCIStmtExecute() call extracted from the bundle, the opiexe() function is called in Oracle kernel with appropriate arguments extracted from the same bundle. Whenever we extract an OCIStmtFetch2() call from the bundle, the corresponding opifch2() function is called in the kernel.

Hopefully this explains why is there such a call “V8 Bundled Exec” in Oracle. It just allows to reduce client – server communication latency by allowing to bundle multiple database requests together into a single SQL*Net payload. In other words, it’s just how Oracle works and it’s perfectly normal to see V8 Bundled Exec as the top OPI call type in performnace reports. If you see this OPI call as the top one, then you’d need to drill down into what’s the actual SQLID which consumes the most of the response time (and further breakdown like which wait event and execution plan step takes the most time).

But the ability of breaking down database response time by OPI call becomes much more useful when troubleshooting somewhat more exotic performance problems like LOB access times (where there’s no SQL statement associated with the database call) or other direct OPI calls which are executed without parsing and running a SQL cursor.

For example, have you noticed that the behavior of ROLLBACK command in sqlplus is different from the shorter ROLL command?

When you issue a ROLLBACK, then Oracle will actually send the string “ROLLBACK” to the database as a regular SQL statement (using V8 Bundled Exec), it will be parsed there (with all the latching and shared pool overhead) as regular SQL – and then Oracle realizes that the command in it is a rollback. Then rollback is performed.

But if you issue a ROLL command, then sqlplus understands it and doesn’t send it to the database for parsing like a regular SQL statement. Instead it will send an OCITransRollback() call, which will call the corresponding OPI function directly, bypassing the SQL processing layer completely. Instead of the usual “V8 Bundled Exec” bundle call you would see a “Transaction Commit/Rollback” OPI call as it was called directly, without any SQL statement processing involved. This is why you sometimes see WAIT#0 lines in SQLTrace, where the waits seem to be associated with some non-existent cursor #0. Whenever the wait happens when the database session is servicing an OPI call which bypasses the SQL processing layer (kks/kkx modules) then the SQLTrace just shows cursor#0 as the wait’s “owner”.

The same happens when using things like connection.commit() in JDBC, the client does not send a SQL statement with text “commit” into the datbase, but rather will call out the OPI commit function out directly.

So, how many different OPI calls are there? Well, a lot, as you can see from v$toplevelcall (or its underlying x$orafn) in Oracle 11.2:

 

SQL> SELECT * FROM v$toplevelcall;

TOP_LEVEL_CALL# TOP_LEVEL_CALL_NAME
--------------- ----------------------------------------------------------------
              0
              2 OPEN
              3 PARSE
              4 EXECUTE
              5 FETCH
              8 CLOSE
              9 LOGOFF
             10 DESCRIBE
             11 DEFINE
             12 COMMIT ON
             13 COMMIT OFF
             14 COMMIT
             15 ROLLBACK
             16 SET OPTIONS
             17 RESUME
             18 VERSION
             20 CANCEL
             21 GET ERR MSG
             23 SPECIAL
             24 ABORT
             25 DEQ ROW
             26 FETCH LONG
             31 HOW MANY
             32 INIT
             33 CHANGE USER
             34 BIND REF POS
             35 GET BIND VAR
             36 GET INTO VAR
             37 BINDV REF
             38 BINDN REF
             39 PARSE EXE
             40 PARSE SYNTAX
             41 PARSE SYNSDI
             42 CONTINUE
             43 ARRAY DESC
             44 INIT PARS
             45 FIN PARS
             46 PUT PAR
             48 START ORACLE
             49 STOP ORACLE
             50 RUN IND PROC
             52 ARCHIVE OP
             53 MED REC STRT
             54 MED REC TABS
             55 MED REC GETS
             56 MED REC RECL
             57 MED REC CANC
             58 LOGON
             59 VERSION2
             60 INIT
             62 EVERYTHING
             65 DIRECT LOAD
             66 UL BUFFER XMIT
             67 DISTRIB XACTION
             68 DESCRIBE INDEXES
             69 SESSION OPS
             70 EXEC w/SCN
             71 FAST UPI
             72 FETCH LONG
             74 V7 PARSE
             76 PL/SQL RPC
             78 EXEC & FCH
             79 XA OPS
             80 KGL OP
             81 LOGON
             82 LOGON
             83 Streaming op
             84 SES OPS (71)
             85 XA OPS (71)
             86 DEBUG
             87 DEBUGS
             88 XA XA Start
             89 XA XA Commit
             90 XA XA Prepare
             91 x/import
             92 KOD OP
             93 RPI Callback with ctxdef
             94 V8 Bundled Exec
             95 Streaming op
             96 LOB/FILE operations
             97 FILE Create
             98 V8 Describe Query
             99 Connect
            100 OPEN Recursive
            101 Bundled KPR
            102 Bundled PL/SQL
            103 Transaction Start/End
            104 Transaction Commit/Rollback
            105 Cursor close all
            106 Failover session info
            107 SES OPS (80)
            108 Do Dummy Defines
            109 INIT V8 PARS
            110 FIN V8 PARS
            111 PUT V8 PAR
            112 TERM V8 PARS
            114 INIT UNTR CB
            115 OAUTH
            116 Failover get info
            117 Commit Remote Sites
            118 OSESSKEY
            119 V8 Describe Any
            120 Cancel All
            121 Enqueue
            122 Dequeue pre 8.1
            123 Object Transfer
            124 RFS op
            125 Notification
            126 Listen
            127 Commit Remote Sites >= V813
            128 DirPathPrepare
            129 DirPathLoadStream
            130 DirPathMiscOps
            131 MEMORY STATS
            132 AQ Prop Status
            134 remote Fetch Archive Log (FAL)
            135 Client ID propagation
            136 DR Server CNX Process
            138 SPFILE parameter put
            139 KPFC exchange
            140 V82 Object Transfer
            141 Push transaction
            142 Pop transaction
            143 KFN Operation
            144 DirPathUnloadStream
            145 AQ batch enqueue/dequeue
            146 File transfer
            147 PING
            148 TSM
            150 Begin TSM
            151 End TSM
            152 Set schema
            153 Fetch from suspended result-set
            154 Key value pair
            155 XS Create Session Op
            156 XS Session RoundtripOp
            157 XS Piggyback Oper.
            158 KSRPC Execution
            159 Streams combined capture/apply
            160 AQ replay information
            161 SSCR
            162 OSESSGET
            163 OSESSRLS
            165 workload replay data
            166 replay statistic data
            167 Query Cache Stats
            168 Query Cache IDs
            169 RPC Test Stream
            170 replay plsql rpc
            171 XStream Out
            172 Golden Gate RPC

151 rows selected.

 

A lot of calls … special stuff (like DESCRIBE) which will bypass the SQL layer completely (but may in turn invoke further recursive SQL statements through the Recursive Program Interface – RPI).

Ok, time to stop – if you want to learn more, enable SQL trace with waits & binds and event 10051 at level 1 in your test database and try to describe a table or read some LOB columns for example!

 

Secret hacking session – full scans, direct path reads, object level checkpoints, ORA-8103s! (again)

I’m mentioning this again just in case you missed the announcement (because I posted it on the weekend):

 

There will be anotner free Secret hacking session – about full scans, direct path reads, object level checkpoints, ORA-8103s!

It will happen tomorrow, Tuesday 9th August, online!

 

Register here:

See you soon!

 

Secret hacking session – full scans, direct path reads, object level checkpoints, ORA-8103s!

FREE STUFF !!! FREE STUFF !!! FREE STUFF !!!
At the last week’s Expert Oracle Exadata virtual conference both me and Kerry touched the topic of how smart scan (and direct path read) decisions are done for each scanned segment during the SQL execution runtime – and not by the optimizer during the optimization phase.
I got a few follow-up questions about this yesterday and I also recall a similar question after my Understanding Exadata Performance Metrics presentation I did for UKOUG in London in April.
I think this is a topic which deserves some deeper coverage and so I decided I’ll do another secret hacking session on Tuesday 9th August (2011) 9-11am PDT.
The topics (hopefully) covered will be:
  1. How do full table scans work
  2. How and why do the “ORA-8103: object no longer exists” happen
  3. How does Oracle (11g) decide between a buffered full table scan and a direct path read scan (plus smart scan in Exadata)
  4. What’s the difference between an object_id and data_object_id?
  5. and more!
Note that while I do touch the Exadata topic a bit too, this session is generic and useful for anyone full scanning tables on any Oracle database… :)
This event is free and you can sign up here:

P.S. I will record the event and publish the video too. So if you’re in Australia or Hong Kong, no need to stay up late ;-)

Training Schedule for 2011 and Public Appearances

Online Seminars
A lot of people have asked me about whether I’d be doing any more seminars in the future. And the answer is yes – at least this year (might be too busy running a company the next year ;-)

I have finally put together the schedule for my 2011 seminars. In addition to the Advanced Oracle Troubleshooting seminar I will also deliver my Advanced Oracle SQL Tuning and Oracle Partitioning and Parallel Execution for Performance seminars, which I have done only onsite in past.

So, check out the seminars page:
Also don’t forget the Expert Oracle Exadata virtual conference next week!

Public Appearances

Oracle OpenWorld 2. October
  • I will talk about Large-Scale Consolidation onto Oracle Exadata: Planning, Execution, and Validation
  • Session ID 09355

Maybe I’ll lurk around the UKOUG venue as well in december ;-)

The First Exadata Virtual Conference in the World!

We have been secretly planning something with Kerry Osborne – and now it’s official – we will host The First Exadata Virtual Conference in the World, on 3-4 August 2011.

This conference takes place a couple of weeks after our Expert Oracle Exadata book is published (on 18. July – check out the awesome new cover design). So, we thought it’d be a good idea to run this conference, where we can explain some things in a different format, do live demos and answer questions that attendees have.

On the first day Kerry and Randy will talk about some serious fundamentals of Exadata, like how Exadata Smart Scan Offloading works and how to make the IO resource manager work for you (especially important in mixed workload consolidated environments).

And on the second day we’ll dig even deeper, with Andy Colvin talking about how to survive Exadata patching (he has patched more Exadatas than anyone else I know) and me following up with some complex performance troubleshooting stories I’ve encountered recently (trust me – there’s a LOT of issues to troubleshoot ;-)

About the Conference:

Since its release, Oracle Exadata quickly became a hit. Due to the relative “youth” of Exadata technology and internal behavior changes introduced with frequent patch-sets, there’s not much up-to-date quality technical information and know-how available to public. This virtual conference brings you a chance to learn from the leading Exadata experts, from their experience of working with real Exadata environments, from Exadata V1 to the latest X2-8. Additionally, there is plenty of Q&A time scheduled, so you can also get answers to your Exadata-related questions.

The speakers are probably some of the most experienced Exadata consultants in the world, in the field of Exadata deployment, migration, performance, and troubleshooting. Also, Kerry, Randy and Tanel are the authors of the Expert Oracle Exadata book published by Apress in July 2011.

Dates:

  • 3-4 August 2011

Location:

  • Online (or should I say “the Cloud” ;-)

Duration:

  • 8am – 12pm (PST) on both days – 2 x 1.5h sessions on each day, with Q&A sessions and a break in between

Speakers:

  • Kerry Osborne, Randy Johnson, Andy Colvin from Enkitec
  • Tanel Poder from right here :-)

All of the speakers are hard-core hands-on professionals, having worked on many different real-life (production) Exadata environments of their clients. Enkitec dudes didn’t stop there, they bought a half rack for themselves, just for playing around with it. Yeah (+1 from me), some people buy a red hot Ferrari, some buy a red hot computer rack with an X on it :-)

Price:

  • 375 USD (early bird until 22. July), 475 regular price

More information, abstracts and registration:

I don’t think you’ll find an Exadata learning opportunity like this from anywhere else (and any time soon), especially considering the price!

This conference is so hot, that one of the attendees managed to sign up to it even before I had published this page to the world! :-)


Tech Reviewer, Tech Reviewer! ;-)

I just noticed that Jonathan Lewis has announced that he’s writing a new Oracle (fundamental) internals book, due to be out in November.

So, I’m happy to add to Jonathan’s announcement, that I’m the tech reviewer of that book!

After all the hard work on the Exadata book, I didn’t want to hear about working on any book again (even if it’s just tech reviewing work), but as this is Jonathan’s book, about exactly these topics I love and focus on, I had no choice but to make an exception and become a reviewer ;-)

I’ve already reviewed a couple of chapters and this book is going to be awesome!

 

IOUG Select Journal Editor’s Choice Award 2011

In May I received the IOUG Select Journal Editor’s Choice Award for my Systematic Oracle Latch Contention Troubleshooting article where I introduced my LatchProfX tool for advanced drilldown into complex latch contention problems (thanks IOUG and John Kanagaraj!).

As the relevant IOUG webpage hasn’t been updated yet, I thought to delay this announcement until the update was done – but I just found an official enough announcement (press release) by accident from Reuters site:

Woo-hoo! :-)

The article itself is here:

Thanks to IOUG crew, John Kanagaraj and everyone else who has read, used my stuff and given feedback! :-)

Backups are forever

You should save this link in your bookmarks, for the next time you accidentally remove your database’s online redologs or overwrite some ASM disks: