I bet you thought I’ll be writing about direct SGA access?! ;)
Nope!
Direct SGA access has excellent troubleshooting potential (as long as you know the shared memory data structures), but it has one major drawback – very few companies have such tools already in place in their production systems.
I have occasionally been called in to solve an urgent performance problem, happening right now and it needs solving immediately! And did I mention, these are critical production systems. Where you can’t just install binary executables freshly downloaded off internet. In fact you would want to diagnose the issue with minimal impact and changes required to those production environments (and that leaves sql tracing out the first round troubleshooting tools for me as well!)
So, I’ve developed myself a toolset for such purpose, Snapper and sw.sql and some process stack reading techniques I already have introduced in my blog.
Next in line is waitprof.sql which is a high-frequency V$SESSION_WAIT sampler – implemented in plain SQL (not PL/SQL).
Waitprof is basically a sampling session wait profiler. It’s like running a select against V$SESSION_WAIT in a very tight loop and aggregating results – but I have used a trick to do all this in plain SQL, which gives me performance advantage over PL/SQL based loops. Waitprof is able to sample V$SESSION_WAIT for a session up to 100 000 times per second!
This depends on your hardware of course and Oracle version too, but normally you’ll get 50-70kHz sampling rate with it.
Ok, you want to see an example? ;-)
I will run my lotspios.sql script in one session (SID=142), which, as the name says, generates lots of physical IOs.
SQL> @lotspios 100
Now, let say I’m interested what this session is doing over time (is it on CPU or waiting and if waiting, on what). For that I run waitprof.sql on that session. The syntax is following:
@waitprof
And lets run waitprof now, on SID 142, gather only events waited on and take 100000 samples:
SQL> @waitprof noprint 142 e 100000
-- WaitProf 1.03 by Tanel Poder ( http://www.tanelpoder.com )
% Total Total Event Distinct Avg time
SID STATE EVENT P1 Time Time ms Events ms/Event
------- ------- ----------------------------------- -------------------------- ------- ------------ ---------- ----------
142 WAITING db file scattered read 39.75 580.306 721 .805
142 WAITING gc buffer busy 30.90 451.198 485 .930
142 WORKING On CPU / runqueue 17.42 254.347 1786 .142
142 WAITING read by other session 10.35 151.081 313 .483
142 WAITING db file sequential read 1.56 22.834 244 .094
142 WAITING latch: cache buffers chains .02 .234 9 .026
6 rows selected.
Elapsed: 00:00:01.48
From the %Total Time column you see that during sampling that session stats, most of the time (39.75%) it was waiting for db file scattered read event. This roughly accounts for 580 milliseconds.
The Distinct Events column shows the number of distinct occurrences of that event, so during the 1.48 second sampling period 721 waits on db file scattered read was done. This column is based on V$SESSION_WAIT.SEQ# so it’s reasonably accurate over short periods of time. And the Avg time ms/Event column shows average duration for a single event, so an db file scattered read in this example took 805 microseconds.
This is the simplest usage of waitprof, however this is not why I wrote it. I want to know the details of those waits too!
The what_to_sample parameter can accept any combination of following values:
- e – sample event names (this is always done automatically)
- 1 – sample P1 values
- 2 – sample P2 values
- 3 – sample P3 values
- s – sample SEQ# values
So, I can run waitprof with e1 to get all wait event parameter details:
SQL> @waitprof noprint 142 e1 100000
-- WaitProf 1.03 by Tanel Poder ( http://www.tanelpoder.com )
% Total Total Event Distinct Avg time
SID STATE EVENT P1 Time Time ms Events ms/Event
------- ------- ----------------------------------- -------------------------- ------- ------------ ---------- ----------
142 WAITING db file scattered read file#= 6 49.07 814.496 1057 .771
142 WAITING gc buffer busy file#= 6 23.99 398.151 437 .911
142 WORKING On CPU / runqueue 20.38 338.358 1895 .179
142 WAITING read by other session file#= 6 5.72 94.985 210 .452
142 WAITING db file sequential read file#= 6 .83 13.712 159 .086
142 WAITING latch: cache buffers chains address= 422E843C .01 .083 1 .083
142 WAITING latch: cache buffers chains address= 41F8E900 .00 .066 1 .066
142 WAITING latch: cache buffers chains address= 437FCEC4 .00 .017 1 .017
142 WAITING latch: cache buffers chains address= 4238EE2C .00 .017 1 .017
142 WAITING latch: cache buffers chains address= 42342298 .00 .017 1 .017
142 WAITING latch: cache buffers chains address= 41FBF44C .00 .017 1 .017
142 WAITING latch: cache buffers chains address= 41FB4998 .00 .017 1 .017
142 WAITING latch: cache buffers chains address= 41FACB6C .00 .017 1 .017
142 WAITING latch: cache buffers chains address= 41F7D074 .00 .017 1 .017
142 WAITING latch: cache buffers chains address= 00000006 .00 .017 1 .017
142 WAITING latch: cache buffers chains address= 41FA98F4 .00 .017 1 .017
16 rows selected.
Elapsed: 00:00:01.70
So now I have the wait profile breakdown by event and its P1 value. For buffer related events we see that all IOs happened against datafile 6.
For latch waits, the P1 value is the latch address. This is especially helpful in pre-10g databases as it helps us to translate this address back to the actual latch name (and even to individual child latch) using V$LATCH_PARENT and V$LATCH_CHILDREN.
I have written a script la.sql for this:
SQL> @la 422E843C ADDR LATCH# CHLD NAME GETS IGETS MISSES IMISSES SLEEPS WAIT_TIME -------- ---------- ------ ---------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- 422E843C 122 1735 cache buffers chains 164637 64387 329 0 6 167
Waitprof allows you to use the what_to_sample parameters in any combination, for example, if I only want to know the breakdown of scattered reads by number of blocks read at a time (P3), I can use “3″ in the parameter as seen below. Note that as P2, P3 and SEQ# are not printed with noprint option (for shorter line width), then now we need to use print option:
SQL> @waitprof print 142 e3 100000
-- WaitProf 1.03 by Tanel Poder ( http://www.tanelpoder.com )
% Total Total Event Distinct Avg time
SID STATE EVENT P1 P2 P3 SEQ# Time Time ms Events ms/Event
------- ------- ----------------------------------- -------------------------- --------------- --------------- ---------- ------- ------------ ---------- ----------
142 WORKING On CPU / runqueue 36.03 544.053 1211 .449
142 WAITING gc buffer busy id#= 65537 28.73 433.838 393 1.104
142 WAITING db file scattered read blocks= 16 22.39 338.149 270 1.252
142 WAITING read by other session class#= 1 7.61 114.866 340 .338
142 WAITING db file scattered read blocks= 13 1.16 17.546 20 .877
142 WAITING db file scattered read blocks= 12 1.03 15.568 16 .973
142 WAITING db file scattered read blocks= 14 .49 7.459 7 1.066
142 WAITING db file scattered read blocks= 11 .46 6.976 9 .775
142 WAITING db file scattered read blocks= 10 .36 5.361 7 .766
142 WAITING db file scattered read blocks= 8 .31 4.696 7 .671
142 WAITING db file sequential read blocks= 1 .27 4.077 47 .087
142 WAITING db file scattered read blocks= 9 .27 4.062 6 .677
142 WAITING db file scattered read blocks= 3 .16 2.341 11 .213
142 WAITING db file scattered read blocks= 2 .15 2.235 14 .160
142 WAITING db file scattered read blocks= 15 .15 2.235 2 1.117
142 WAITING db file scattered read blocks= 4 .13 1.933 6 .322
142 WAITING db file scattered read blocks= 6 .12 1.737 5 .347
142 WAITING db file scattered read blocks= 5 .10 1.495 4 .374
142 WAITING db file scattered read blocks= 7 .07 1.027 2 .513
142 WAITING read by other session class#= 65537 .01 .136 9 .015
142 WAITING gc buffer busy id#= 1 .01 .106 7 .015
142 WAITING latch: cache buffers chains tries= 0 .00 .045 2 .023
142 WAITING read by other session class#= 0 .00 .015 1 .015
142 WAITING db file sequential read blocks= 2 .00 .015 1 .015
142 WAITING gc buffer busy id#= 5 .00 .015 1 .015
142 WAITING db file scattered read blocks= 1 .00 .015 1 .015
26 rows selected.
Elapsed: 00:00:01.89
From above we see that as far as IO operations are concerned, most of the response time during sampling was spent during 16/13/12 block reads. Other read sizes took less time.
Of course we can also make Waitprof show all P1,P2,P3 colums, but this may display lots of lines for an active session:
SQL> @waitprof print 142 e123 100000
-- WaitProf 1.03 by Tanel Poder ( http://www.tanelpoder.com )
% Total Total Event Distinct Avg time
SID STATE EVENT P1 P2 P3 SEQ# Time Time ms Events ms/Event
------- ------- ----------------------------------- -------------------------- --------------- --------------- ---------- ------- ------------ ---------- ----------
142 WORKING On CPU / runqueue 21.63 424.007 1970 .215
142 WAITING db file scattered read file#= 1 block#= 43525 blocks= 16 .16 3.175 1 3.175
142 WAITING db file scattered read file#= 1 block#= 43541 blocks= 12 .13 2.587 1 2.587
142 WAITING db file scattered read file#= 1 block#= 43493 blocks= 16 .11 2.234 1 2.234
142 WAITING db file scattered read file#= 1 block#= 44167 blocks= 16 .10 2.019 1 2.019
142 WAITING db file scattered read file#= 1 block#= 55466 blocks= 16 .10 1.901 2 .951
142 WAITING db file scattered read file#= 1 block#= 44041 blocks= 16 .10 1.882 1 1.882
142 WAITING db file scattered read file#= 1 block#= 43877 blocks= 16 .09 1.705 1 1.705
142 WAITING db file scattered read file#= 1 block#= 44067 blocks= 16 .09 1.686 1 1.686
142 WAITING db file scattered read file#= 1 block#= 44103 blocks= 16 .08 1.646 1 1.646
142 WAITING db file scattered read file#= 1 block#= 44135 blocks= 16 .08 1.568 1 1.568
142 WAITING db file scattered read file#= 1 block#= 43597 blocks= 16 .08 1.509 1 1.509
142 WAITING db file scattered read file#= 1 block#= 8696 blocks= 16 .08 1.490 1 1.490
[...lots of lines snipped...]
142 WAITING db file sequential read file#= 1 block#= 35630 blocks= 1 .00 .020 1 .020
142 WAITING db file sequential read file#= 1 block#= 35634 blocks= 1 .00 .020 1 .020
142 WAITING db file sequential read file#= 1 block#= 35636 blocks= 1 .00 .020 1 .020
142 WAITING db file sequential read file#= 1 block#= 35638 blocks= 1 .00 .020 1 .020
142 WAITING db file sequential read file#= 1 block#= 35640 blocks= 1 .00 .020 1 .020
142 WAITING db file sequential read file#= 1 block#= 35644 blocks= 1 .00 .020 1 .020
142 WAITING db file sequential read file#= 1 block#= 38097 blocks= 16 .00 .020 1 .020
142 WAITING db file sequential read file#= 1 block#= 43562 blocks= 1 .00 .020 1 .020
142 WAITING db file scattered read file#= 1 block#= 35529 blocks= 16 .00 .020 1 .020
1979 rows selected.
Even though the CPU usage is in the top of the list, it has only used 21.63% of total response time, so the rest 78% of the IO time is the “problem” here. The reason is that all CPU time is aggregated together (the time between waits), while most events are brought out separately due differences in P1,P2,P3 values.
If you want to see events in chronological order, then you can include SEQ# (which is the wait state sequence identifier) also in the output:
SQL> @waitprof print 142 e123s 1000
-- WaitProf 1.03 by Tanel Poder ( http://www.tanelpoder.com )
% Total Total Event Distinct Avg time
SID STATE EVENT P1 P2 P3 SEQ# Time Time ms Events ms/Event
------- ------- ----------------------------------- -------------------------- --------------- --------------- ---------- ------- ------------ ---------- ----------
142 WAITING db file scattered read file#= 6 block#= 43897 blocks= 13 26036 1.80 .360 1 .360
142 WORKING On CPU / runqueue 26036 2.40 .480 1 .480
142 WAITING db file scattered read file#= 6 block#= 44105 blocks= 13 26037 5.30 1.060 1 1.060
142 WORKING On CPU / runqueue 26037 1.40 .280 1 .280
142 WAITING db file scattered read file#= 6 block#= 44314 blocks= 12 26038 5.90 1.180 1 1.180
142 WORKING On CPU / runqueue 26038 1.70 .340 1 .340
142 WORKING On CPU / runqueue 26039 1.80 .360 1 .360
142 WAITING db file scattered read file#= 6 block#= 44534 blocks= 13 26039 6.40 1.280 1 1.280
142 WORKING On CPU / runqueue 26040 1.80 .360 1 .360
142 WAITING db file scattered read file#= 6 block#= 44742 blocks= 13 26040 6.20 1.240 1 1.240
142 WORKING On CPU / runqueue 26041 1.70 .340 1 .340
142 WAITING db file scattered read file#= 6 block#= 44950 blocks= 13 26041 6.00 1.200 1 1.200
142 WAITING db file scattered read file#= 6 block#= 45159 blocks= 12 26042 5.60 1.120 1 1.120
142 WORKING On CPU / runqueue 26042 2.10 .420 1 .420
142 WAITING db file scattered read file#= 6 block#= 45366 blocks= 13 26043 6.00 1.200 1 1.200
142 WORKING On CPU / runqueue 26043 1.70 .340 1 .340
142 WAITING db file scattered read file#= 6 block#= 45574 blocks= 13 26044 5.90 1.180 1 1.180
142 WORKING On CPU / runqueue 26044 1.80 .360 1 .360
142 WAITING db file scattered read file#= 6 block#= 45769 blocks= 13 26045 5.90 1.180 1 1.180
142 WORKING On CPU / runqueue 26045 1.60 .320 1 .320
142 WORKING On CPU / runqueue 26046 1.60 .320 1 .320
142 WAITING db file scattered read file#= 6 block#= 45978 blocks= 12 26046 5.50 1.100 1 1.100
142 WAITING db file scattered read file#= 6 block#= 46185 blocks= 13 26047 6.00 1.200 1 1.200
142 WORKING On CPU / runqueue 26047 1.80 .360 1 .360
142 WAITING db file scattered read file#= 6 block#= 46380 blocks= 13 26048 6.10 1.220 1 1.220
142 WORKING On CPU / runqueue 26048 1.70 .340 1 .340
142 WAITING db file scattered read file#= 6 block#= 46601 blocks= 13 26049 4.30 .860 1 .860
27 rows selected.
Here you see a detailed chronological view of events, ordered by their SEQ# number. There are few gotchas like SEQ# wrapping to zero after reaching 65535 and potentially missed events due sampling granularity, but more about this in an upcoming post. Btw, you can get more details about script usage from the script header and by reading the single SQL statement in there :)
One might ask why use such sampling script when you can enable extended SQL trace which also gives you chronological ordering and all wait parameters. One answer is that it’s more convenient to use a script for first-round diagnosis of performance problems. Another answer is that sometimes, in change-controlled production environments it may take quite some time to get the change for enabling trace through.
In an upcoming post I’ll show more (very cool) uses of this powerful sampling technique, a hint is that the script is called Latchprof ;-)
If you haven’t read my other Advanced Oracle Troubleshooting posts, here they are:
http://blog.tanelpoder.com/2007/06/18/advanced-oracle-troubleshooting-guide-when-the-wait-interface-is-not-enough-part-1/
http://blog.tanelpoder.com/2007/08/27/advanced-oracle-troubleshooting-guide-part-2-no-magic-is-needed-systematic-approach-will-do/
http://blog.tanelpoder.com/2007/09/06/advanced-oracle-troubleshooting-guide-part-3-more-adventures-in-process-stack/
http://blog.tanelpoder.com/2008/06/03/advanced-oracle-troubleshooting-guide-part-4-diagnosing-a-long-parsing-issue/
Meanwhile, can someone explain how my script can sample the same table (V$SESSION_WAIT) hundreds of thousands of times by running a single SQL statement only once? ;-)




Amazing use of connect by :)
awsome, not less.
Nice work, Tanel. This is a worthy successor to Steve Adams’ ten wait observations query, which is also brilliant though ancient now. Shakir comments on it here if you don’t know what I’m talking about: http://www.pythian.com/blogs/275/using-the-oracle-wait-interface-to-troubleshoot-io-issues
Thanks!
Yep I knew Steve’s cool script and I’ve used it in past. However I needed some more flexibility so I wrote WaitProf ;-)
Hi Tanel,
Very clever – I love it! :)
The uses for your ‘connect by’ idea are just endless.
Just Brilliant!
Thanks,
Adrian.
Thanks for a good idea.
This page has a small bug with sqlplus formatting – search, for example, for “@waitprof noprint 142 e 100000″ and you’ll see an issue at couple of rows below: minuses (-) are replaces with hyphens (#1812). I’m using an own plugin to prevent it link removed
Yep I’ve noticed that, however it’s not messing up any of the content of sqlplus output, so I haven’t bothered to see what’s wrong with it.
Note that I removed the link to .zip download file. Feel free to post a link to a *webpage* which explains your plugin, but not a link to the downloadable zip file directly…
Here it is – http://blog.softwarer.ru/?page_id=78 – but I have no english translation, so I doubt if it more useful :)
Thanks,
For translation, the google translate can help (more or less :)
http://www.google.com/translate?u=http%3A%2F%2Fblog.softwarer.ru%2F%3Fpage_id%3D78&hl=en&ie=UTF8&sl=ru&tl=en
Tanel,
Thank you for waitprof – very valuable.
At http://blog.tanelpoder.com/2008/03/06/hotsos-symposium-2008-presentations-and-files/, you mention an os_explain script, but I don’t see any reference to it and cannot find anything about it.
You responded that you would post some information about it in your next post – did you forget?
[friendly reminder :)]
Thanks,
Rich
Rich, I hadn’t forgotten :)…but I didn’t want to post the script without tidying it up and without a proper explanation.. so I managed to do this today – see this post
Hi Tanel
I was wondering if the sampling rate would be faster then the event chanes? I mean if you sample so fast what happen when you have sampled twice and the event have not changed?
The possible fix is COUNT(DISTINCT seq#) in your code :-?
Cheers
Hi lsc,
Yes, sampling rate normally is faster than the events.
Read through the code, the COUNT(DISTINCT seq#) is already there…
Yes yes, I was asking if COUNT(DISTINCT seq#) was filtering out the possible dulicates :-)
Thanks
[ Meanwhile, can someone explain how my script can sample the same table (V$SESSION_WAIT) hundreds of thousands of times by running a single SQL statement only once? ;-)
]
The HOW is more or less easy to find out by reading the code, you limited the number of samples to 1000*1000*1000. Am I right?
I’m just a little bit confused: in an ordinary table: if I select the same data 1 billion times within the same SQL, they should be equal. There seems to be a difference with the kernel related x$ tables to this rule? Or am I totally wrong?
thnx
berx
This 1000*1000*1000 trick is just for safe generation of rows (as a recursive connect by over millions of levels will take a lot of PGA memory, therefore I have 3 recursive connect bys over 1000 rows and join them). But this is not the trick which does the high-frequency sampling for me.
The real trick here is that I have deliberately set the join method to NESTED LOOPS and join order so that the X$ tables are inner rowsources. Thus the nested loop join mechanism is doing the sampling for me!
And yes, you’re right, for normal tables the read consistency applies, so you see a single snapshot of data during a query execution time, that’s read consistency. But X$ tables do not have read consistency built in, thus nested loop sampling becomes possible.
This is a great idea. This trick solves one “non-performance” related issue I had for two years with X$ tables. Many thanks!
Mike
Thanks,
Of course you’ve got to watch out that you’re not sampling the wrong X$ tables ultrafrequently as some X$ table access requires latch gets in order to protect the memrory structure while reading it.
The good news is that latchprof / latchprofx help to show what’s the latching impact of an X$ query…
This is a great script. Needed something like this, as we have very tight change control and getting a trace approve take like 2 days and untill then we are expected to see what the waits are and stuff like that. Couple of suggestions (maybe)… Yes the script looks at p1, p2, p3 values and point to the datafile#, and block# which can then tell us what table its waiting on. It would be nice to someway have the script where it gives you p1,p2,p3 value and also tell you the object name. Just a thought.
Also you mentioned. “Even though the CPU usage is in the top of the list, it has only used 21.63% of total response time, so the rest 78% of the IO time is the “problem” here.” — what are some suggestions if its the other way around ?? Lets say i have 78% of CPU usage..what can I do to get that down?? any suggestions ?
High CPU usage is good in some sense – meaning that the session can do its work on CPU without having to wait for anything much.
Of course if the job/session is still slow then I would run snapper on the session and see whether its perhaps hard parsing a lot or doing lots of logical IOs. In first case the application code must be changed (or cursor_sharing=force parameter can be used as a dirty workaround), in second case the heavy logical IOs may be due an inappropriate nested loop joins looping through the inner rowsources way too many times (revisiting blocks a lot etc)
Of course there are many more reasons for high CPU usage but these two are the first common ones that come into my mind.
The object name can be deducted from the file# and block_id (parameter1,parameter2) in dba_extents.
I use a script like this, it takes file# and block# as parameters:
select owner, segment_name, partition_name, tablespace_name, extent_id
from dba_extents
where file_id = &1
and &2 between block_id and block_id + blocks – 1;
Get an error with the latest version (3.11) of snapper.sql:
sys@FD> @snapper out 1 1 3
Sampling with interval 1 seconds, 1 times…
type snrec is record (stype varchar2(4), statistic# number, name varchar2(64));
*
ERROR at line 30:
ORA-06550: line 476, column 42:
PLS-00302: component ‘STATE’ must be declared
ORA-06550: line 475, column 11:
PL/SQL: Statement ignored
Have anybody else gotten this error?
@Lasse Jenssen
Are you running it on 9i? Snapper v3 doesn’t run on 9i yet, once I have the time I’ll change it… so you need to use snapper_v2 on 9i…
You are right! I’m on 9i. Should have testet an earlier version, but I was confident I did run the v3 against 9i earlier. I was obviously wrong. Fantastic script by the way. After starting using it, I keep wondering how I managed before ;-)
@Lasse Jenssen
Same here! ;)
Coolest use of ‘connect by’
Simplicity is the ultimate elegance!!!
Cheers
Mehul S.
Tanel,
I get the following error message when I run it on Solaris. Any idea? It works fine on Linux or AIX.
SQL> @snapper ash 5 1 ALL
Sampling SID ALL with interval 5 seconds, taking 1 snapshots…
– Session Snapper v3.13 by Tanel Poder @ E2SN ( http://tech.e2sn.com )
———————————————————————–
Active% | SQL_ID | EVENT | WAIT_CLASS
———————————————————————–
declare
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression
ORA-06512: at line 831
ORA-06512: at line 1364
Thanks,
Venu