Advanced Oracle Troubleshooting Guide, Part 4: Diagnosing a long parsing issue

There was a recent thread in Oracle Forums about a session getting stuck somewhere when a specific SQL was issued. The SQL executed did not return at all unless ORDERED hint was used. Even the EXPLAIN PLAN command (which only parses the statement, doesn’t execute it) did never return.

Classic tracing + tkprof techniques didn’t show much (just some recursive queries consuming insignificant amounts of time).

The proven V$SESSION_WAIT sampling technique didn’t reveal anything as it showed the session being constantly on CPU (the wait state = ‘WAITED KNOWN TIME’ which means session is on CPU) and SEQ# didn’t increase (which means that wait state did not change over time).

Due the symptoms described above I was well prepared to troubleshoot this issue. This looks exactly like one of the troubleshooting use cases I demonstrate in of my Advanced Oracle Troubleshooting class (nice embedded advertisment, huh? ;)

In such a case where tracing and V$ views don’t provide any useful information about what the session is doing, I normally look into few stack traces of the server process. In this case I asked the poster to do this and here is the result:

21820: oracleXXXXXX (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
00000001021f5f58 kkosbn (ffffffff7ad79910, ff, ffffffff7ad96f58, ffffffff7ad96f58, 10000, ffffffff7ad96f58) + 2d8
00000001021eb278 kkobmp (ffffffff7ad790f0, 10000, ffffffff7ad73e18, 0, 0, ffffffff7ad797a0) + 4d8
00000001021ad348 kkotap (0, 0, 0, ffffffff7ad79910, ffffffff7ad73e18, ffffffff7ad790f0) + 4e8
000000010218fd4c kkojnp (382660a28, 0, ffffffff7ad790f0, 10218f160, 1065327d0, ffffffff7ad797a0) + bac
000000010218e828 kkocnp (106400, 200, ffffffff7ad790f0, 1, 0, 1065327d0) + e8
00000001021757a4 kkooqb (102174, 0, 3, 0, ffffffff7af15d00, ffffffff7ad790f0) + c04
0000000102165ac0 kkoqbc (0, ffffffff7af15d00, 0, 1065327d0, 6, 102165380) + 720
00000001022f3700 apakkoqb (ffffffff7fff6810, ffffffff7af15d00, 3843e26e0, 0, c200080, 0) + 80
00000001022f4110 apaqbd (ffffffff7fff6810, ffffffff7af15d00, 3843e26e0, ffffffff7ad7fd38, 381cbf738, 200000) + 170
00000001022f1014 apadrv (3843e26e0, 106532, ffffffff7af43f28, 106531b28, ffffffff7af15d00, 106532) + 294
0000000102529dc8 opitca (3, 3843e26e0, ffffffff7af43f28, 0, 1065327d0, 106525000) + 9c8
00000001019f1f8c kksFullTypeCheck (ffffffff7fff7b70, 1056d8c28, 106525898, ffeffc00, 1065253d8, 1056d8c18) + c
000000010034c894 rpiswu2 (0, 106525898, 0, ffffffff7fff72a8, 1056d8000, 106525) + 1f4
00000001019f8ca8 kksLoadChild (106400, ffffffff7fff7b70, ffffffff7af43f28, 106525898, ffffffff7fff91c0, 3843e26e0) + 22e8
00000001018f767c kxsGetRuntimeLock (0, 0, 180, 3834b2578, e9e733ed, 106525898) + 5bc
0000000101a18f24 kksfbc (106531, 0, 108, 0, 0, 382d0d758) + 3de4
0000000101a12618 kkspsc0 (1000, fffdffff, 0, 108, 10000000, 0) + 1838
0000000101a13844 kksParseCursor (ffffffff7fffa8b8, 3c305e100, 0, 0, ffffffff7af32260, 3) + e4
0000000102515d3c opiosq0 (1056d8c18, ffffffff7af32260, 1056d8, 20a, 1056d8000, 20b) + 85c
00000001024b1078 kpooprx (ffffffff7fffe07c, ffffffff7fffbec0, 106400, ffffffff7fffabd0, 0, 0) + d8
00000001024ad6d8 kpoal8 (ffffffff7fffbe88, 106528, a4, 8061, 1, 106532860) + 2b8
0000000100346ccc opiodr (1057687c0, 5e, 106525600, 1, ffffffff7af26738, 106528) + 60c
0000000103d74124 ttcpip (34, ffffffff7fffcb50, 1056beb2c, 1056bc310, 0, 106525718) + 504
0000000100341238 opitsk (106532868, 1, 0, 106532860, 105882f78, fffffffd) + 598
0000000100345888 opiino (106400, 106532860, 0, 40002801, 106400, 106532860) + 468
0000000100346ccc opiodr (1065319d8, 3c, 106400, 1065327e0, 106532, 106528) + 60c
000000010033f660 opidrv (106534f60, 0, 3c, 106531c98, 3c, 0) + 380
0000000100339c30 sou2o (ffffffff7ffff2a8, 3c, 4, ffffffff7ffff2d0, 105e0c000, 105e0c) + 50
00000001002fbf5c opimai_real (2, ffffffff7ffff3a8, 104050b4c, 1064cd848, 247cbfc, 14800) + 7c
00000001002fbe18 main (2, 8, 0, ffffffff7ffff3b8, ffffffff7ffff4c8, ffffffff7b500140) + 98
00000001002fbd3c _start (0, 0, 0, 0, 0, 0) + 17c

And one of the highlighted functions (kksParseCursor) already gives out that this process is parsing.
And the other function (kksLoadChild) gives out that we are in process of hard parsing (as soft parsing does not need to load child cursors into library cache)

Some more bits:

  • The rpiswu function is the one initializing a separate runtime execution context for any hard parsing recursive activity ( RPI = Recursive Program Interface )
  • The apa* functions above also indicate parsing action
  • The kko* functions are the ones taking care of SQL query transformation, validation and various other checks

On Oracle 10g+ you can actually measure session execution time spent parsing by sampling V$SESS_TIME_MODEL, what I have done using my Snapper tool here:

SQL> @sn 10 106

-- Session Snapper v1.07 by Tanel Poder ( http://www.tanelpoder.com )

---------------------------------------------------------------------------------------------------------------------------------------------
HEAD,     SID, SNAPSHOT START   ,  SECONDS, TYPE, STATISTIC                               ,         DELTA,  DELTA/SEC,     HDELTA, HDELTA/SEC
---------------------------------------------------------------------------------------------------------------------------------------------
DATA,     106, 20080603 06:31:17,       10, TIME, hard parse elapsed time                 ,      10124018,    1012402,     10.12s,      1.01s
DATA,     106, 20080603 06:31:17,       10, TIME, parse time elapsed                      ,      10124018,    1012402,     10.12s,      1.01s
DATA,     106, 20080603 06:31:17,       10, TIME, DB CPU                                  ,      10113462,    1011346,     10.11s,      1.01s
DATA,     106, 20080603 06:31:17,       10, TIME, DB time                                 ,      10124018,    1012402,     10.12s,      1.01s
DATA,     106, 20080603 06:31:17,       10, WAIT, events in waitclass Other               ,            51,          5,       51us,      5.1us
--  End of snap 1

PL/SQL procedure successfully completed.

However as the original poster had STATISTICS_LEVEL=BASIC, these values were zero, for Time Model stats gathering STATISTICS_LEVEL needs to be set to either TYPICAL or ALL.

…but reading a stack trace is more fun, isn’t it? ;-)

This entry was posted in Oracle and tagged , . Bookmark the permalink.

5 Responses to Advanced Oracle Troubleshooting Guide, Part 4: Diagnosing a long parsing issue

  1. Charles says:

    Hi Tanel,
    A big fan of your blogs…
    Please, tell me how you can tell from your trace that the kksParseCursor and kksLoadChild functions are currently being executed, as opposed to them having BEEN executed at some point in the past?
    Thank you if you have time to respond!

  2. wanqisheng says:

    hello Tanel, i’m meeted the same bug.
    from metalink, compare with stack , it’s similar with Bug:5667683,
    but version is different, my version is 10.2.0.3 for solaris ; at doc 421913.1, the version is 10.2.0.2.
    i can’t patch the db, we can set some parameter to avoid it?
    also i will try to follow Bug 5667683 to remove stats.
    can you tell me how to process and avoid it? thanks

    $pstack 7148
    7148: oraclexxx (LOCAL=NO)
    00000000026aaaf5 kkosbn () + 2c5
    00000000026a157a kkobmp () + 50a
    000000000266ddb8 kkotap () + 338
    000000000265814d kkojnp () + d4d
    0000000002656ca3 kkocnp () + 93
    0000000002644222 kkooqb () + 8d2
    0000000002638a03 kkoqbc () + 813
    000000000276c83f apakkoqb () + 8f
    000000000276d047 apaqbd () + 187
    000000000276a5ce apadrv () + 1fe
    0000000002935e2c opitca () + 5cc
    0000000001fe190d kksFullTypeCheck () + 1d
    0000000000e9bbdc rpiswu2 () + 18c
    0000000001fe6b85 kksLoadChild () + 1d15
    0000000001f0655f kxsGetRuntimeLock () + 61f
    0000000001ffe0fb kksfbc () + 15fb
    0000000001ff998f kkspsc0 () + 4ff
    0000000001ffb813 kksParseCursor () + 93
    000000000292693d opiosq0 () + 75d
    00000000028d169d kpooprx () + dd
    00000000028ceb69 kpoal8 () + 2d9
    0000000000e97c6c opiodr () + 41c
    0000000003d9f6da ttcpip () + 46a
    0000000000e939d3 opitsk () + 503
    0000000000e96f18 opiino () + 3a8
    0000000000e97c6c opiodr () + 41c
    0000000000e924d1 opidrv () + 2f1
    0000000000e8f90b sou2o () + 5b
    0000000000e552e4 opimai_real () + 84
    0000000000e551b4 main () + 64

    • Tanel Poder says:

      Are you using bitmap indexes? (as the bug 5667683 mentions bitmap indexes) and the kkobmp() function sounds like a bitmap index related one.
      You could try a few options:
      1) use a lower optimizer_features_enabled value for this session/query in hope of optimizer not hitting this bug
      2) test the query by making the bitmap indexes invisible or by using a bunch of NO_INDEX hints listing all bitmap indexes on the tables accessed by this query – again in hope of not hitting this bug as you’ve prohibited evaluating these indexes
      3) test with setting the _b_tree_bitmap_plans = FALSE. especially when you don’t have bitmap indexes in your schema. This disables some CBO codepath and would help in case this spin happens in this codepath.
      4) patch the bug

      • wanqisheng says:

        sorry , until now to reply . At first, i’m follow Bug 5667683 description,
        remove statistics data, then lock it, the sql is can parse and run normal.
        Today, i have time to test your advice. before test, i’m unlock statistics data, and gather it, the sql is spin agin.
        1. use a lower optimizer_features_enabled value .
        at first, i’m use 10.2.0.1 to test(current is 10.2.0.3), the sql is spin again,
        check stack, it’s same with before.
        then i’m use 9.2.0.8, it’s ok, it’s finished at moment.
        2. if modify sql, it’s need developer modify and deploy new package for all
        location, so i’m not test it.
        3. setup the _b_tree_bitmap_plans = FALSE.
        (at our db, we don’t use bitmap index.)
        yes, the sql is run normal.
        after test, so i’m set the parameter in my spfile.
        thanks very much.

        I have click “post comment” , but IE refresh long time and not display,
        so i ‘m reply again.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>