Here’s an example of a quick’n'dirty way of profiling stack traces on your command line. This is an example from Solaris (but the script should work on Linux too plus other Unixes with minor modifications).
I created a problem case below, based on a case I once troubleshooted at a client site. Note that they had set optimizer_mode = FIRST_ROWS in their database and the optimized came up with a very inefficient execution plan for the select from DBA_LOCK_INTERNAL view below:
SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS; Session altered. SQL> SET TIMING ON SQL> SELECT * FROM dba_lock_internal; ... ... the output data stripped ... ... 927 rows selected. Elapsed: 00:23:27.14
It took over 23 minutes to return 927 rows from DBA_LOCK_INTERNAL!
I ran Snapper to see where the time is spent then:
SQL> @snapper4 all 5 1 222
Sampling SID 222 with interval 5 seconds, taking 1 snapshots...
-- Session Snapper v4.00 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SID, USERNAME , TYPE, STATISTIC , DELTA, HDELTA/SEC, %TIME, GRAPH , NUM_WAITS, WAITS/SEC, AVERAGES
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
222, SYS , STAT, non-idle wait count , 1, .19, , , , , ~ per execution
222, SYS , TIME, DB CPU , 6028084, 1.17s, 117.2%, [@@@@@@@@@@], , ,
222, SYS , TIME, sql execute elapsed time , 6032677, 1.17s, 117.3%, [##########], , ,
222, SYS , TIME, DB time , 6032677, 1.17s, 117.3%, [##########], , , ~ unaccounted time
222, SYS , WAIT, library cache: mutex X , 6, 1.17us, .0%, [ ], 1, .19, 6us average wait
-- End of Stats snap 1, end=2013-02-14 21:30:45, seconds=5.1
---------------------------------------------------------------------------------------------
Active% | SQL_ID | SQL_CHILD | EVENT | WAIT_CLASS
---------------------------------------------------------------------------------------------
100% | c884zcqpv9y5h | 0 | ON CPU | ON CPU
-- End of ASH snap 1, end=2013-02-14 21:30:45, seconds=5, samples_taken=50
The query is apparently 100% on CPU with no significant waits. Normally I would expect some other metrics to pop up in snapper output in such high-CPU cases, like thousands of session logical reads per second, sorts (rows) showing millions of rows sorted per second or parse count (hard) number in hundreds or thousands per second. These are the “usual suspects”.
But this time none of these additional metrics were incremented by the session. So it’s time to systematically drill down by other means.



