Advanced Oracle Troubleshooting Guide: When the wait interface is not enough [part 1]

Welcome to read my first real post on this blog!

If I ever manage to post any more entries, the type and style of content will be pretty much as this one: some Oracle problem diagnosis and troubleshooting techniques with some OS and hardware touch in it. Mmm… internals ;-)

Nevertheless I am also a fan of systematic approaches and methods so I plan to propose some less known OS and Oracle techniques for reducing guesswork in advanced troubleshooting even further.

Ok, to the topic.
Troubleshooting. Troubleshooting = finding out what is going on.

This post covers one unexplained issue I once had with Oracle external tables – which eventually turned out to be a problem with Oracle wait interface instrumentation. I used some of these “what’s going on” techniques to find out… what’s going on. Solaris 10 x64 / Oracle 10.2.0.2.


I worked on a project for which I needed to read data through an external table from an Unix pipe ( ever wanted to load compressed flat file contents to Oracle on-the-fly? ;-)

I created a Unix pipe:

$ mknod /tmp/tmp_pipe p

I created an Oracle external table, reading from that pipe:


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - Production
With the Partitioning, OLAP and Data Mining options


USERNAME     INSTANCE_NAME    HOST_NAME                 VER        STARTED      SID SERIAL# SPID
------------ ---------------- ------------------------- ---------- -------- ------- ------- -------
TANEL        SOL01            solaris01                 10.2.0.2.0 20070618     470      14 724

Tanel@Sol01> CREATE DIRECTORY dir AS '/tmp';

Directory created.

Tanel@Sol01> CREATE TABLE ext (
  2     value number
  3  )
  4  ORGANIZATION EXTERNAL (
  5    TYPE oracle_loader
  6    DEFAULT DIRECTORY dir
  7      ACCESS PARAMETERS (
  8      FIELDS TERMINATED BY ';'
  9      MISSING FIELD VALUES ARE NULL
 10      (value)
 11      )
 12      LOCATION ('tmp_pipe')
 13    )
 14  ;

Table created.

Tanel@Sol01> select * from ext;

So far so good… unfortunately this select statement never returned any results. As it turned out later, the gunzip over remote ssh link which should have fed the Unix pipe with flat file data, had got stuck.
Without realizing that, I approached this potential session hang condition with first obvious check – a select from V$SESSION_WAIT:

Tanel@Sol01> select sid, event, state, seq#, seconds_in_wait, p1,p2,p3
  2  from v$session_wait
  3  where sid = 470;

    SID EVENT                          STATE                     SEQ# SECONDS_IN_WAIT         P1         P2         P3
------- ------------------------------ ------------------- ---------- --------------- ---------- ---------- ----------
    470 db file sequential read        WAITED KNOWN TIME          164            7338          1       1892          1

Tanel@Sol01> /

    SID EVENT                          STATE                     SEQ# SECONDS_IN_WAIT         P1         P2         P3
------- ------------------------------ ------------------- ---------- --------------- ---------- ---------- ----------
    470 db file sequential read        WAITED KNOWN TIME          164            7353          1       1892          1

Tanel@Sol01> /

    SID EVENT                          STATE                     SEQ# SECONDS_IN_WAIT         P1         P2         P3
------- ------------------------------ ------------------- ---------- --------------- ---------- ---------- ----------
    470 db file sequential read        WAITED KNOWN TIME          164            7374          1       1892          1

Tanel@Sol01>

The STATE and SECONDS_IN_WAIT columns in V$SESSION_WAIT say we have been crunching the CPU for last two hours, right? (as WAITED… means NOT waiting on any event, in this case the EVENT just shows the last event on which we waited before getting on CPU)

Hmm.. let’s check it out:

$ prstat -p 724
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
   724 oracle    621M  533M sleep   59    0   0:00:00 0.0% oracle/1

prstat reports that this process is currently in sleep state, is not using CPU and has used virtually no CPU during its 2-hour “run” time!

Let’s check with ps (which is actually a quite powerful tool):

$ ps -o user,pid,s,pcpu,time,etime,wchan,comm -p 724
    USER   PID S %CPU        TIME     ELAPSED            WCHAN COMMAND
  oracle   724 S  0.0       00:01    02:18:08 ffffffff8135cadc oracleSOL01

ps also confirms that the process 724 has existed for over 2 hours 18 minutes (ELAPSED), but has only used roughly 1 second of CPU time (TIME). The state column “S” also indicates the sleeping status.

So, either Oracle V$SESSION_WAIT or standard Unix tools are lying to us. From above evidence it is pretty clear that it’s Oracle who’s lying (also, in cases like that, lower-level instrumentation always has a better chance to know what’s really going on at the upper level than vice versa).

So, let’s use truss (or strace on Linux, tusc on HP-UX) to see if our code is making any system calls or is sleeping within a system call…

$ truss -p 724
read(14, 0xFFFFFD7FFD6FDE0F, 524273) (sleeping...)

Hmm, as no followup is printed to this line, it looks like the process is waiting for a read operation on a file descriptor 14 to complete.
Which file is this fd 14 about?

$ pfiles 724
724:    oracleSOL01 (LOCAL=NO)

...snip...

  14: S_IFIFO mode:0644 dev:274,2 ino:4036320452 uid:100 gid:300 size:0
      O_RDONLY|O_LARGEFILE
      /tmp/tmp_pipe
...snip...

So from here it’s already pretty obvious where the problem is. There is no data coming from the tmp_pipe. This led me to check what was my gunzip doing on the other end of the pipe and it was stuck, in turn waiting for ssh to feed more data into it. And ssh had got stuck due some network transport issue.

The baseline is that you can rely on low-level (OS) tools to identify what’s really going on when higher level tools (like Oracle wait interface) provide weird or contradicting information, in this case the Oracle wait interface was not recording external table read wait events. I reported this info to Oracle people and I think it has been filed as a bug by now.


This was only a simple demo, identifying a pretty clear case of a session hang, however with use of a pretty intrusive tool ( I would not attach truss to a busy production instance process without thinking twice ).

However there are other options. In the next part of this guide ( when I manage to write it ) I will deal with more complex problems like what to do when the session is not reporting significant waits and is spinning heavily on CPU. Using Oracle and Unix tools it is quite easy to figure out the execution profile of a spinning server process, even without connecting to Oracle at all ( do I hear pstack, mdb and stack tracing? ;-)

As I’ve just started blogging, I would appreciate any feedback, including about things like blog layout, font sizes, readability, understandability etc. Also I think it will take few days before I manage to post the Part 2 of this troubleshooting guide.

Thank you for your patience reading through this :-)

Note that this year’s only Advanced Oracle Troubleshooting class takes place in the end of April/May 2014, so sign up now if you plan to attend this year!

This entry was posted in Oracle and tagged , , . Bookmark the permalink.

27 Responses to Advanced Oracle Troubleshooting Guide: When the wait interface is not enough [part 1]

  1. oramoss says:

    Welcome Tanel!

    An interesting post and well presented too.

    Seems like we’ll be getting an education here at your blog.

    I guess my only complaint was that, unless I made a mistake, I had to be logged in to WordPress to make a comment and that necessitated me creating a WordPress account which I don’t really need…can you configure the blog to allow people to comment without requiring a WordPress account or was that an intentional move on your part?

    Cheers
    Jeff Moss
    http://oramossoracle.blogspot.com

  2. Tanel says:

    Thanks Jeff!
    I changed the setting and am just testing commenting without logging on, now.
    If you see this message, then it worked!

    :)

  3. Peter McLarty says:

    Hi Tanel
    Welcome aboard the great blogosphere.

    Great Post, and after seeing some of your assistance on Oracle-L then I expect we will see some interesting items here.

    Cheers
    Peter

  4. Doug Burns says:

    Tanel,

    A very welcome addition to the list of Oracle blogs. Keep up the good work.

    Cheers,

    Doug

  5. Welcome to the blogging world Tanel. I for one am looking forward to some great internals insights. I have added your blog to my Oracle blogs aggregator at http://www.petefinnigan.com/news/blogs

    cheers

    Pete

  6. Tanel,

    keep up the good work. Great you started a blog. Maybe an idea to ask Eddie Awad to add this site to the http://www.orana.info aggregator so we can keep up ;-)

    Marco

  7. carlosal says:

    Welcome Tanel.

    I think It would be fine if you added a RSS feed link to your home page.

    Nice site, nice post.

    Cheers.

    Carlos.

  8. tanelp says:

    Thanks Carlos, good idea!

    Will check it out how to do it..

  9. Gus Spier says:

    Tanel,

    Thanks for the tutorial! To me, the lesson learned is to start at the top (higher level tools (like Oracle wait interface)) and work your way down (low-level (OS) tools).

    The difficulty is going to be remembering the tools, commands, and interpretation of the output. Stuff like this doesn’t happen to us unless it’s a production environment and the end-users are screaming. And, we’ve been unfortunate in that our predecessors shielded us from these issues.

    Regards,

    Gus

  10. coskan says:

    Thank you for this very educational entry and Welcome to database blogosphere

    Keep blogging

    Coskan

  11. Welcome Tanel.

    Hope you take some time from your busy timing to writing somthing in your blog frequently.

    Jaffar

  12. Pingback: Log Buffer #50: A Carnival of the Vanities for DBAs — Eye on Oracle

  13. I notice that SEQ# in
    your query on V$SESSION_WAIT was not incrementing — meaning that it was still the same occurrence of a wait event. Also, you should then look at WAIT_TIME — if WAIT_TIME is 0 then it is still the current wait going on. The last (completed) wait would appear in WAIT_TIME as a non-zero value, so a Zero in WAIT_TIME should also mean that the session is _currently_ still waiting.

  14. tanelp says:

    Hi Hemant,
    Looks like you didn’t read my post thoroughly ;)

    I deliberately included multiple samples from v$session_wait to show that SEQ# was not increasing.
    The key point of this post was exactly that v$session_wait.state shows WAITED… (sic!) which means NOT WAITING, which means “ready to be on CPU”, which should result in the process consuming all CPU time it can as there were no other wait events reported.

    However that was not the case as I have proved, it is an Oracle instrumentation bug instead.

    I normally do not use the WAIT_TIME column in my scripts and demos as it just confuses things. The V$SESSION_WAIT.STATE already shows all what is needed (e.g. waiting or not waiting). By the way, STATE column values are derived from exactly the same column in x$ksusest as WAIT_TIME, so it is physically impossible to have a state other than WAITING (as in my demo) and WAIT_TIME=0 at the same time.

  15. The STATE of “WAITED KNOWN TIME” may be a bug. It should have showed “WAITING”. The bug could be that the state did not change to “WAITING” although SECONDS_IN_WAIT for the current wait (as indicated by SEQ# being the same) was being incremented. The bug could be with “db file sequential read” {why SEQUENTIAL READ — shouldn’t it have been SCATTERED READ when doing an FTS ?} when reading from a PIPE, not a regular file, as an External Table.
    If it should have been a “db file scattered read” on the “SELECT *” then even the event showing as “db file sequential read” could be wrong.

    I was referring to the fact that we should read both columns WAIT_TIME and SECONDS_IN_WAIT.
    From the documentation :

    “WAIT_TIME NUMBER A nonzero value is the session’s last wait time. A zero value means the session is currently waiting. ”

    “SECONDS_IN_WAIT NUMBER If WAIT_TIME = 0, then SECONDS_IN_WAIT is the seconds spent in the current wait condition. If WAIT_TIME > 0, then SECONDS_IN_WAIT is the seconds since the start of the last wait, and SECONDS_IN_WAIT – WAIT_TIME / 100 is the active seconds since the last wait ended. ”

    Further, the doc on STATE=’WAITED KNOWN TIME’ says “WAIT_TIME = duration of last wait ”

    WAIT_TIME gets posted on completion of the wait.
    SECONDS_IN_WAIT, obviously, is continuously incremented.

    So I was interested in knowing what WAIT_TIME was showing as well, along with SECONDS_IN_WAIT. That is to say, we should read WAIT_TIME with SECONDS_IN_WAIT, when in doubt about what V$SESSION_WAIT.STATE shows us.
    Maybe one was for the “db file sequential read” (that could have occurred _before_ actually starting to do the FTS) and the other for the “db file scattered read” that I would have expected.

    Like you, I, too, use SECONDS_IN_WAIT in my “current_waits” reporting script.

  16. tanelp says:

    This IS a bug as I have proven already. Oracle IS WAITING in a OS read syscall and the only correct state in V$SESSION_WAIT.STATE is WAITING, regardless of values in any other columns. This contradiction alone proves that Oracle behaves incorrectly.

    Regarding what documentation says, the section about wait interface in Oracle has always been lacking concreteness, thus the myths about having to use the WAIT_TIME and such. Just the fact that something is documented, doesn’t make it a fact ;-)

    The SECONDS_OF_WAIT is an unfortunate misnaming, it should really be called SECONDS_IN_STATE, as it’s just the time since last wait state change (e.g. from waiting to not-waiting or vice versa). It’s measured and incremented even if being constantly on CPU without any waiting at all. Every time LGWR runs, it calculates the time delta since last wait state change for each session and updates the fixed table under V$SESSION_WAIT.

    Normally when you enter a wait, this wait state change timestamp is reset and SEQ# is incremented. What happens in my test case, however, is that both last SEQ# and SECONDS_IN_WAIT are not changed at all, their values persist (the latter continues to increment, is not zeroed).

    I suspect this lack of instrumentation is because external tables use an ODCI cartridge for loading data, which in turn call “trusted” external sqlldr library functions (which however are linked in to oracle binary on Solaris). Perhaps noone wanted to change the generic sqlldr functions to include wait instrumentation, perhaps there are limitations altering sessions wait state from external library, perhaps it was just missed…

    Note that the wait event should not have been a db file scattered nor sequential read as the external tables do not read from datafiles to buffer cache, they just invoke ODCI cartride, which invokes a method from object type (ORACLE_LOADER in our case) which in turn call a function in external library (qxxq_fetch in qxxqlib) for reading the data from text file using sqlloader functions. The returned data is presented as a rowsource to the consumer who invoked external table read. Now that I look into it, I don’t even see a wait event anything named “external” or ODCI related. So perhaps it’s a design flaw.

  17. Pingback: Case Study: Statspack/AWR Latch Waits (Part 2) : Ardent Performance Computing

  18. Pingback: Advanced Oracle Troubleshooting Guide, Part 2: No magic is needed, systematic approach will do « Tanel Poder’s blog: Core IT for geeks and pros

  19. Pingback: Log Buffer #60: a Carnival of the Vanities for DBAs · Steve Karam · The Oracle Alchemist

  20. Car says:

    Based on your review no 16, to whom SECONDS_IN_WAIT that accounted when not waiting at all (constantly on cpu) belongs ? How do you got that ? Please clarify ?

  21. Pingback: My Blog is 1 year old! 8-) « Tanel Poder’s blog: Core IT for geeks and pros

  22. jiulu sun says:

    nice work, where can i find your article of “I will deal with more complex problems like what to do when the session is not reporting significant waits and is spinning heavily on CPU”, that seems exactly happened on my system.

  23. You should just have waited for 18 months or more and change the Release (or wait any longer for 10.2.0.5).
    Now Preprocessor For External Tables is out and should solve all these problems (whereas it should be still checked how it’s shown in wait events).
    Just for the records (I’m sure you know it): http://structureddata.org/2008/11/19/preprocessor-for-external-tables/
    Before you ask: yes, it’s fun to read old blog entries on Saturday evening! ;-)

  24. Tanel Poder says:

    Hi Martin,

    Yeah, starting from 11.1.0.7 (and probably 10.2.0.5 but I haven’t checked) there are new wait events related to external table access:

    SQL> @sed “external table”

    EVENT# EVENT_NAME PARAMETER1 PARAMETER2 PARAMETER3
    ------ --------------------------------------------- ------------------------------ ------------------------------ ----------
    293 external table misc IO filectx iocode P3
    291 external table read filectx file# size
    292 external table write filectx file# size

    Btw, I don’t have to ask, I *know* that it’s fun to read old (quality) blog entries on Saturday evenings! ;)

  25. Pingback: Alter system kill session and ORA-00031: session marked for kill | Tanel Poder's blog: Core IT for Geeks and Pros

  26. Pingback: Oracle

  27. Pingback: Troubleshooting | Oracle Scratchpad

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>