<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tanel Poder's blog: Core IT for Geeks and Pros &#187; Troubleshooting</title>
	<atom:link href="http://blog.tanelpoder.com/category/troubleshooting/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.tanelpoder.com</link>
	<description>Oracle troubleshooting, internals and performance tuning</description>
	<lastBuildDate>Sat, 31 Jul 2010 05:44:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Exadata v2 Smart Scan Performance Troubleshooting article</title>
		<link>http://blog.tanelpoder.com/2010/07/30/exadata-v2-smart-scan-performance-troubleshooting-article/</link>
		<comments>http://blog.tanelpoder.com/2010/07/30/exadata-v2-smart-scan-performance-troubleshooting-article/#comments</comments>
		<pubDate>Fri, 30 Jul 2010 22:18:25 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Oracle 11gR2]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[Tuning]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=727</guid>
		<description><![CDATA[I finally finished my first Exadata performance troubleshooting article. This explains one bug I did hit when stress testing an Exadata v2 box, which caused smart scan to go very slow &#8211; and how I troubleshooted it: Troubleshooting Exadata v2 Smart Scan Performance Thanks to my secret startup company I&#8217;ve been way too busy to [...]]]></description>
			<content:encoded><![CDATA[<p>I finally finished my first Exadata performance troubleshooting article.</p>
<p>It explains a bug I hit when stress testing an Exadata v2 box, which caused smart scans to run very slowly &#8211; and how I troubleshot it:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlL2V4YWRhdGEvcGVyZm9ybWFuY2UtdHJvdWJsZXNob290aW5nL2V4YWRhdGEtc21hcnQtc2Nhbi1wZXJmb3JtYW5jZQ==" target="_blank">Troubleshooting Exadata v2 Smart Scan Performance</a></li>
</ul>
<p>Thanks to my secret startup company I&#8217;ve been way too busy to write anything serious lately, but apparently staying up until 6am helped this time! :-) Anyway, maybe next weekend I can repeat this and write Part 2 in the Exadata troubleshooting series ;-)</p>
<p>Enjoy! Comments are welcome on this blog entry, as I haven&#8217;t figured out a good way to enable comments on the Google Sites page I&#8217;m using&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/07/30/exadata-v2-smart-scan-performance-troubleshooting-article/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The full power of Oracle&#8217;s diagnostic events, part 2: ORADEBUG DOC and 11g improvements</title>
		<link>http://blog.tanelpoder.com/2010/06/23/the-full-power-of-oracles-diagnostic-events-part-2-oradebug-doc-and-11g-improvements/</link>
		<comments>http://blog.tanelpoder.com/2010/06/23/the-full-power-of-oracles-diagnostic-events-part-2-oradebug-doc-and-11g-improvements/#comments</comments>
		<pubDate>Wed, 23 Jun 2010 11:46:31 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Oracle 11g]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=697</guid>
		<description><![CDATA[I haven&#8217;t written any blog entries for a while, so here&#8217;s a very sweet treat for low-level Oracle troubleshooters and internals geeks out there :) Over a year ago I wrote that Oracle 11g has a completely new low-level kernel diagnostics &#38; tracing infrastructure built in to it. I wanted to write a longer article [...]]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t written any blog entries for a while, so here&#8217;s a very sweet treat for low-level Oracle troubleshooters and internals geeks out there :)</p>
<p>Over a year ago <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOS8wMy8wMy90aGUtZnVsbC1wb3dlci1vZi1vcmFjbGVzLWRpYWdub3N0aWMtZXZlbnRzLXBhcnQtMS1zeW50YXgtZm9yLWtzZC1kZWJ1Zy1ldmVudC1oYW5kbGluZy8=" target="_blank">I wrote</a> that Oracle 11g has a completely new low-level kernel diagnostics &amp; tracing infrastructure built into it. I wanted to write a longer article about it with comprehensive examples and use cases, but by now I realize I won&#8217;t ever have time for this, so I&#8217;ll just point you in the right direction :)</p>
<p>Basically, since 11g, you can use SQL_Trace, undocumented kernel traces, various dumps and other actions at much finer granularity than before.</p>
<p>For example, you can enable SQL_Trace for a specific SQL_ID only:</p>
<pre>SQL&gt; alter session set events 'sql_trace[<strong>SQL: 32cqz71gd8wy3</strong>]
{<strong>pgadep: exactdepth 0</strong>} {<strong>callstack: fname opiexe</strong>}
plan_stat=all_executions,wait=true,bind=true';

Session altered.</pre>
<p>Actually, I have done more in the above example: I have also told Oracle to trace only when the PGA depth (the dep= in the trace file) is zero. This means that only top-level calls are traced, those issued directly by the client application and not recursively by some PL/SQL or by the dictionary cache layer. Additionally, I have added a check for whether we are currently servicing the opiexe function (whether the current call stack contains opiexe as a (grand)parent function) &#8211; this allows you to trace &amp; dump only in specific cases of interest!</p>
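<p>A minimal sketch of the housekeeping around such a trace, assuming a typical workflow rather than anything shown above: the same event specification followed by <em>off</em> should switch the trace off again, and in 11g the V$DIAG_INFO view reports where the current session&#8217;s trace file is written. The SQL_ID is simply the one from the example.</p>
<pre>-- Assumed counterpart of the example above: switch the SQL_ID-scoped trace off again
alter session set events 'sql_trace[SQL: 32cqz71gd8wy3] off';

-- Locate the trace file of the current session (11g and later)
select value from v$diag_info where name = 'Default Trace File';</pre>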
<p>The syntax is actually more powerful than that. In this example I&#8217;m running kernel tracing for a kernel component, plus instructing Oracle to dump various other things at level 1 (callstack, process state and query block debug info) whenever a tracepoint (event) in the SQL Transformation component family is hit:</p>
<pre>SQL&gt; alter session set events 'trace[<strong>RDBMS.SQL_Transform</strong>] [<strong>SQL: 32cqz71gd8wy3</strong>]
disk=high <strong>RDBMS.query_block_dump(1) processstate(1) callstack(1)</strong>';

Session altered.</pre>
<p>By now you are probably asking where this syntax is formally documented. Google and MOS searches don&#8217;t return anything useful. Well, as with many other things, a good reference is stored within the Oracle kernel itself!</p>
<p>Just log on as sysdba and type ORADEBUG DOC:</p>
<p><strong>ORADEBUG DOC</strong></p>
<pre>SQL&gt; oradebug doc</pre>
<pre>Internal Documentation
**********************
EVENT                           Help on events (syntax, event list, ...)
COMPONENT       [&lt;comp_name&gt;]   List all components or describe &lt;comp_name&gt;</pre>
<p>This gives you the index page; from there you can navigate further by running ORADEBUG DOC EVENT and so on. There&#8217;s lots of documentation there!</p>
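<p>A sketch of how that navigation might continue. Only ORADEBUG DOC itself and its EVENT and COMPONENT entries appear in the output above; the deeper sub-command and the component name used below are assumptions, so check what the index pages of your own version actually list.</p>
<pre>SQL&gt; oradebug doc event
SQL&gt; oradebug doc event name sql_trace
SQL&gt; oradebug doc component RDBMS</pre>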
<p>I have put the output, with some comments and examples, on my website too:</p>
<p><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlL3Ryb3VibGVzaG9vdGluZy9vcmFkZWJ1Zy1kb2M=" target="_blank">http://tech.e2sn.com/oracle/troubleshooting/oradebug-doc</a></p>
<p>Note that this feature is quite fresh and hardly used at all in the real (production) world, so I consider it quite experimental. I have managed to crash my session with some tests, so take the usual advice about any undocumented stuff (and oradebug) &#8211; don&#8217;t use it in production without thinking first, and if you do use it, then use it at your own risk!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/06/23/the-full-power-of-oracles-diagnostic-events-part-2-oradebug-doc-and-11g-improvements/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Oracle memory troubleshooting article</title>
		<link>http://blog.tanelpoder.com/2010/05/28/oracle-memory-troubleshooting-article/</link>
		<comments>http://blog.tanelpoder.com/2010/05/28/oracle-memory-troubleshooting-article/#comments</comments>
		<pubDate>Fri, 28 May 2010 16:37:45 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Internals]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=694</guid>
		<description><![CDATA[Randolf Geist has written a good article about systematic troubleshooting of a PL/SQL memory allocation &#38; CPU utilization problem &#8211; and he has used some of my tools too! http://oracle-randolf.blogspot.com/2010/05/advanced-oracle-troubleshooting-session.html]]></description>
			<content:encoded><![CDATA[<p>Randolf Geist has written a good article about systematic troubleshooting of a PL/SQL memory allocation &amp; CPU utilization problem &#8211; and he has used some of my tools too!</p>
<p><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL29yYWNsZS1yYW5kb2xmLmJsb2dzcG90LmNvbS8yMDEwLzA1L2FkdmFuY2VkLW9yYWNsZS10cm91Ymxlc2hvb3Rpbmctc2Vzc2lvbi5odG1s" target="_blank">http://oracle-randolf.blogspot.com/2010/05/advanced-oracle-troubleshooting-session.html</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/05/28/oracle-memory-troubleshooting-article/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quiz: Explaining index creation</title>
		<link>http://blog.tanelpoder.com/2010/04/23/quiz-explaining-index-creation/</link>
		<comments>http://blog.tanelpoder.com/2010/04/23/quiz-explaining-index-creation/#comments</comments>
		<pubDate>Fri, 23 Apr 2010 14:49:02 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Administration]]></category>
		<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=680</guid>
		<description><![CDATA[Did you know that it&#8217;s possible to use EXPLAIN PLAN FOR CREATE INDEX ON table(col1,col2,col3) syntax for explaining what exactly would be done when an index is created? That&#8217;s useful for example for seeing the Oracle&#8217;s estimated index size without having to actually create the index. You can also use EXPLAIN PLAN FOR ALTER INDEX [...]]]></description>
			<content:encoded><![CDATA[<p>Did you know that it&#8217;s possible to use EXPLAIN PLAN FOR <strong>CREATE INDEX </strong>ON table(col1,col2,col3) syntax for explaining what exactly would be done when an index is created?</p>
<p>That&#8217;s useful for example for seeing the Oracle&#8217;s estimated index size without having to actually create the index.</p>
<p>You can also use EXPLAIN PLAN FOR <strong>ALTER INDEX</strong> i <strong>REBUILD</strong> to see whether this operation would use a FULL TABLE SCAN or a FAST FULL INDEX SCAN (offline index rebuilds of valid indexes can use this method).</p>
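<p>A minimal sketch of the workflow, assuming a hypothetical table <em>t</em> with columns <em>col1</em> and <em>col2</em> and a hypothetical index name <em>t_idx</em>. Nothing is created; DBMS_XPLAN.DISPLAY just formats the most recently explained statement from the plan table, and the estimated index size appears in the Note section of its output.</p>
<pre>SQL&gt; explain plan for create index t_idx on t(col1, col2);
SQL&gt; select * from table(dbms_xplan.display);</pre>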
<p>Anyway, you can experiment with this yourself, but here&#8217;s a little quiz (with a little gotcha :)</p>
<p>What kind of index creation statement would create such an execution plan?</p>
<pre>SQL&gt; explain plan for create index hack_index on hack_table ....the rest is a secret for now....
Explained.

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------
Plan hash value: 457720527

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | CREATE INDEX STATEMENT   |                      | 88868 |       |   657   (1)| 00:00:12 |
|   1 |  INDEX BUILD NON UNIQUE  | HACK_INDEX           |       |       |            |          |
|   2 |   SORT AGGREGATE         |                      |     1 |       |            |          |
|   3 |    VIEW                  | HACK_VIEW            | 74062 |       |   318   (1)| 00:00:06 |
|*  4 |     HASH JOIN            |                      | 74062 |  1012K|   210   (1)| 00:00:04 |
|   5 |      TABLE ACCESS FULL   | TEST_USERS           |    46 |   368 |     4   (0)| 00:00:01 |
|   6 |      INDEX FAST FULL SCAN| IDX2_INDEXED_OBJECTS | 74062 |   433K|   206   (1)| 00:00:04 |
|   7 |   SORT CREATE INDEX      |                      | 88868 |       |            |          |
|   8 |    TABLE ACCESS FULL     | HACK_TABLE           | 88868 |       |   544   (1)| 00:00:10 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("U"."USERNAME"="O"."OWNER")

Note
-----
- estimated index size: 3145K bytes</pre>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/04/23/quiz-explaining-index-creation/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>cursor: pin S waits, sporadic CPU spikes and systematic troubleshooting</title>
		<link>http://blog.tanelpoder.com/2010/04/21/cursor-pin-s-waits-sporadic-cpu-spikes-and-systematic-troubleshooting/</link>
		<comments>http://blog.tanelpoder.com/2010/04/21/cursor-pin-s-waits-sporadic-cpu-spikes-and-systematic-troubleshooting/#comments</comments>
		<pubDate>Wed, 21 Apr 2010 21:55:09 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Internals]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Oracle 11g]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[Unix/Linux]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=673</guid>
		<description><![CDATA[I recently consulted one big telecom and helped to solve their sporadic performance problem which had troubled them for some months. It was an interesting case as it happened in the Oracle / OS touchpoint and it was a product of multiple &#8220;root causes&#8221;, not just one, an early Oracle mutex design bug and a [...]]]></description>
			<content:encoded><![CDATA[<p>I recently consulted one big telecom and helped to solve their sporadic performance problem which had troubled them for some months. It was an interesting case as it happened in the Oracle / OS touchpoint and it was a product of multiple &#8220;root causes&#8221;, not just one, an early Oracle mutex design bug and a Unix scheduling issue &#8211; that&#8217;s why it had been hard to resolve earlier despite multiple SRs opened etc.</p>
<p><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL21hcnRpbm1leWVyLmJsb2dzcG90LmNvbS8=" target=\"_blank\">Martin Meyer</a>, their lead DBA, posted some info about the problem and technical details, so before going on, you should read his blog entry and read my comments below after this:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL21hcnRpbm1leWVyLmJsb2dzcG90LmNvbS8yMDEwLzA0L2xvbmctd2FpdC10aW1lcy1mb3ItY3Vyc29yLXBpbi1zLWFuZC5odG1s" target="_blank">http://martinmeyer.blogspot.com/2010/04/long-wait-times-for-cursor-pin-s-and.html</a></li>
</ul>
<p><strong>Problem:</strong></p>
<p>The problem was that occasionally the critical application transactions, which should have taken a very short time in the database (&lt;1s), took 10-15 seconds or even longer and timed out.</p>
<p><strong>Symptoms:</strong></p>
<ol>
<li>When the problem happened, the CPU usage also jumped up to 100% for the duration of the problem (from a few tens of seconds up to a few minutes).</li>
<li>In AWR snapshots (taken every 20 minutes), &#8220;cursor: pin S&#8221; popped into the Top 5 waits (around 5-10% of total instance wait time), sometimes along with &#8220;cursor: pin S wait on X&#8221; (which is a different thing), &#8220;latch: library cache&#8221; and, interestingly, &#8220;log file sync&#8221;. During these periods the waits had much higher average wait times per occurrence than normal (tens or hundreds of milliseconds per wait, on average).</li>
<li>The V$EVENT_HISTOGRAM view showed lots of cursor: pin S waits taking a very long time (over a second, some even 30+ seconds), and this certainly isn&#8217;t normal (Martin has these numbers in his blog entry).</li>
</ol>
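<p>A minimal sketch of the kind of query behind the third symptom, assuming you can query V$EVENT_HISTOGRAM on the affected instance. Each row is a wait-time bucket, so unusually long waits show up as non-zero counts in the high WAIT_TIME_MILLI buckets. Note that the counts are cumulative since instance startup.</p>
<pre>select event, wait_time_milli, wait_count
from   v$event_histogram
where  event = 'cursor: pin S'
order  by wait_time_milli;</pre>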
<p>AWR and OS CPU usage measurement tools are system-wide tools (as opposed to session-wide tools).</p>
<p><strong>Troubleshooting:</strong></p>
<p><em>I can&#8217;t give you exact numbers or AWR data here, but will explain the flow of troubleshooting and reasoning.</em></p>
<ul>
<li>As the symptoms involved CPU usage spikes, I first checked whether there were perhaps <em>logon storms</em> going on due to a bad application server configuration, where the app server suddenly decides to fire up hundreds more connections at the same time (that happens quite often, so it&#8217;s a usual suspect when troubleshooting such issues). A logon storm can consume lots of CPU, as all these new processes need to be started up in the OS, they attach to the SGA (syscalls, memory pagetable set-up operations) and eventually they need to find &amp; allocate memory from the shared pool and initialize session structures. This all takes CPU. However, the <em>logons cumulative</em> statistic in AWR barely went up at all during the 20 minute snapshot, so that ruled out a logon storm. As the number of sessions at the end of the AWR snapshot (compared to the beginning of it) did not go down, this ruled out a <em>logoff</em> storm too (which also consumes CPU, as the exiting processes need to release their resources etc).</li>
</ul>
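<p>The figures used here came from AWR snapshots. As a rough live equivalent (an assumption, not what was run in this case), the same counters can be read from V$SYSSTAT twice, a few seconds apart, and compared: a logon storm would show as a sharp jump in <em>logons cumulative</em> between the two samples, a logoff storm as a drop in <em>logons current</em>.</p>
<pre>select name, value
from   v$sysstat
where  name in ('logons cumulative', 'logons current');</pre>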
<ul>
<li>It&#8217;s worth mentioning that <em>log file sync</em> waits also went up by over an order of magnitude (IIRC from 1-2 ms to 20-60 ms per wait) during the CPU spikes. However, as <em>log file parallel write</em> times didn&#8217;t go up so radically, this indicated that the extra log file sync wait time was being spent somewhere else &#8211; very likely on CPU scheduling latency (waiting in the CPU runqueue) while the CPUs were busy.</li>
</ul>
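<p>A sketch of how the two events can be compared side by side, assuming access to V$SYSTEM_EVENT. The values are cumulative since instance startup, so a short spike only becomes visible if you take the difference between two samples (which is what AWR does per snapshot interval). A <em>log file sync</em> average far above the <em>log file parallel write</em> average points at time lost outside the actual redo write.</p>
<pre>select event,
       total_waits,
       round(time_waited_micro / nullif(total_waits, 0) / 1000, 2) as avg_wait_ms
from   v$system_event
where  event in ('log file sync', 'log file parallel write');</pre>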
<ul>
<li>As one of the waits which popped up during the problem was cursor: pin S, I checked V$MUTEX_SLEEP_HISTORY. It did not show any specific cursor as a significant contention point (all contention recorded in that sleep history buffer was evenly spread across many different cursors), which indicated to me that the problem was likely not related to a single cursor (a bug or just too heavy usage of that cursor). Note that this view was not queried during the worst problem time, so there was a chance that some symptoms were not in there anymore (V$MUTEX_SLEEP_HISTORY is a circular buffer of the last few hundred mutex sleeps).</li>
</ul>
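<p>A minimal sketch of that check, assuming the mutex type of interest is reported as <em>Cursor Pin</em> (verify with a SELECT DISTINCT on MUTEX_TYPE in your version). If one MUTEX_IDENTIFIER dominates the top of the result, a single cursor is the hot spot; an even spread, as in this case, points elsewhere.</p>
<pre>select mutex_identifier, mutex_type, location,
       count(*)    as sleep_records,
       sum(sleeps) as total_sleeps
from   v$mutex_sleep_history
where  mutex_type = 'Cursor Pin'
group  by mutex_identifier, mutex_type, location
order  by total_sleeps desc;</pre>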
<ul>
<li>So, we had CPU starvation and very long cursor: pin S waits popping up at the same time. A cursor: pin S get should happen really fast, as it&#8217;s a very simple operation (only a few tens of instructions) of marking the cursor&#8217;s mutex <em>in-flux</em> so its reference count can be bumped up for a shared mutex get.</li>
</ul>
<ul>
<li>Whenever you see CPU starvation (CPUs 100% busy and runqueues are long) <em>and</em> latch or mutex contention, the CPU starvation should be resolved first, as the contention may just be a symptom of the CPU starvation. The problem is that if you get unlucky and a latch or mutex holder process is preempted and taken off CPU by the scheduler, the holder can&#8217;t release the latch/mutex before it gets back onto CPU to complete its operation! But the OS doesn&#8217;t have a clue about this, as latches/mutexes are just Oracle&#8217;s memory structures in the SGA. So the latch/mutex holder is off CPU and everyone else who gets onto CPU may want to take the same latch/mutex. They can&#8217;t get it, so they spin briefly in the hope that the holder releases it in the next few microseconds, which isn&#8217;t gonna happen in this case, as the latch/mutex holder is still off CPU!</li>
</ul>
<ul>
<li>And now comes a big difference between latches and mutexes in Oracle 10.2: when a latch getter can&#8217;t get the latch after spinning, it will go to sleep to release the CPU. Even if there are many latch getters in the CPU runqueue before the latch holder, they all spin quickly and end up sleeping again. But when a mutex getter doesn&#8217;t get the mutex after spinning, it will not go to sleep!!! It will yield() the CPU instead, which means that it will go to the end of the runqueue and try to get back onto CPU as soon as possible. So, mutex getters in 10.2 are much less graceful; they can burn a lot of CPU when the mutex they want is held by someone else for a long time.</li>
<li>But so what, if a mutex holder is preempted and taken off CPU by OS scheduler &#8211; it should get back onto CPU pretty fast, once it works its way through the CPU runqueue?</li>
</ul>
<ul>
<li>Well, yes IF all the processes in the system have the same priority.</li>
</ul>
<ul>
<li>This is where a second problem comes into play &#8211; Unix process priority decay. When a process eats a lot of CPU (and does little IO / voluntary sleeping), the OS lowers that process&#8217;s CPU scheduling priority so that other, less CPU-hungry processes still get their fair share of CPU (especially when coming back from an IO wait, for example).</li>
</ul>
<ul>
<li>When a mutex holder has a lower priority than most other processes and is taken off CPU, a thing called <em>priority inversion</em> happens. Even though the other processes have higher priority, they cannot proceed, as the critical lock or resource they need is already held by the lower-priority process, which can&#8217;t complete its work because the &#8220;high priority&#8221; processes keep the CPUs busy.</li>
</ul>
<ul>
<li>In the case of latches, the problem is not that bad, as the latch getters go to sleep until they are posted when the latch is released by the holder process (I&#8217;ve written about it <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOS8wMS8yMC9yZWxpYWJsZS1sYXRjaC13YWl0cy1hbmQtYS1uZXctYmxvZy8=" target="_blank">here</a>). But the priority inversion takes a crazy turn in the case of mutexes &#8211; as their getters don&#8217;t sleep (not even for a short time) by default, but yield the CPU and try to get back onto it immediately, and so on until they get the mutex. That can lead to huge CPU runqueue spikes, unresponsive systems and even hangs.</li>
</ul>
<ul>
<li>This is why, starting from Oracle 11g, the mutex getters do sleep instead of just yielding the CPU. Oracle has also backported the fix to Oracle 10.2.0.4, where a patch must be applied and where the <em>_first_spare_parameter</em> specifies the sleep duration in centiseconds.</li>
</ul>
<ul>
<li>So, knowing how mutexes worked in 10.2, all the symptoms led me to suspect this priority inversion problem, greatly amplified by the fact that mutex getters never sleep by default. We checked the effective priorities of all Oracle processes on the server and hit the jackpot &#8211; there were a number of processes with significantly lower priorities than all the other processes. And it takes only one process with low priority to cause all this trouble: just wait until it starts modifying a mutex and is preempted while doing so.</li>
</ul>
<ul>
<li>So, in order to fix both of the problems which amplified each other, we had to enable the HPUX_SCHED_NOAGE Oracle parameter to prevent priority decay of the processes, and set the _first_spare_parameter to 10, which meant that the default mutex sleep time would be 10 centiseconds (which is a pretty long time in the mutex/latching world, but better than crazily retrying without any sleeping at all). That way no process (such as the mutex holder) is pushed back and kept away from CPU for long periods of time.</li>
</ul>
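<p>A sketch of what these two settings might look like, under stated assumptions: the instance uses an spfile and is restarted afterwards, and the HPUX_SCHED_NOAGE priority is left as a SQL*Plus substitution variable because the text above does not say which value was used. The hidden parameter only has this meaning on 10.2.0.4 with the mutex patch applied.</p>
<pre>-- Priority value deliberately not filled in: SQL*Plus will prompt for it
alter system set hpux_sched_noage = &amp;noage_priority scope=spfile;

-- Mutex sleep time in centiseconds (10.2.0.4 with the backported fix only)
alter system set "_first_spare_parameter" = 10 scope=spfile;</pre>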
<p>This was not a trivial problem, as it happened at the Oracle / OS touchpoint and not because of a single reason, but as a product of multiple separate reasons amplifying each other.</p>
<p>There are a few interesting, non-technical points here:</p>
<ol>
<li>When troubleshooting, don&#8217;t let performance tools like AWR (or any other tool!) tell you what your <em>problem</em> is! Your business, your users should tell you what the problem is, and the tools should only be used for symptom drilldown (this is what <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2NhcnltaWxsc2FwLmJsb2dzcG90LmNvbS8=" target="_blank">Cary Millsap</a> has been constantly telling us). Note how I mentioned the problem and symptoms separately at the beginning of my post &#8211; and the problem was that some business transactions (systemwide) timed out because the database response time was 5-15 seconds!</li>
<li>The detail and scope of your performance data must be <em>at least</em> as good as the detail and scope of your performance problem!<br />
In other words, if your problem is measured in a few seconds, then your performance data should also be sampled at least every few seconds in order to be fully systematic.<br />
The classic issue in this case was that the 20 minute AWR reports still showed IO wait times as the main DB time consumers, but that was averaged over 20 minutes. Our <em>problem</em> happened severely and briefly, within a few seconds of those 20 minutes, so the averaging and aggregation over a long period of time hid the extreme performance issue that happened in a very short time.</li>
</ol>
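<p>One way to get data at a matching granularity, sketched here as an assumption rather than as what was used in this case, is Active Session History, which samples active sessions about once per second (it requires the Diagnostics Pack license). Counting samples per second over the last 20 minutes makes a spike of a few seconds stand out instead of being averaged away.</p>
<pre>select to_char(sample_time, 'YYYY-MM-DD HH24:MI:SS') as sample_second,
       count(*) as active_sessions,
       sum(case when event = 'cursor: pin S' then 1 else 0 end) as cursor_pin_s_waiters
from   v$active_session_history
where  sample_time &gt; sysdate - interval '20' minute
group  by to_char(sample_time, 'YYYY-MM-DD HH24:MI:SS')
order  by sample_second;</pre>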
<p>Next time it seems impossible to diagnose a problem and the troubleshooting effort ends up going in circles, you should ask &#8220;what&#8217;s the real problem, and who is experiencing it and how?&#8221; and see if your performance data&#8217;s detail and scope match that problem!</p>
<p>Oh, this is a good point to mention that in addition to my Advanced Oracle Troubleshooting/SQL Tuning <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlLXRyYWluaW5nLXNlbWluYXJz">seminars</a> I also actually perform advanced Oracle troubleshooting <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vY29udGFjdC8=">consulting</a> too! I eat mutexes for breakfast ;-)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/04/21/cursor-pin-s-waits-sporadic-cpu-spikes-and-systematic-troubleshooting/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>KGH: NO ACCESS &#8211; Buffer cache inside streams pool too!</title>
		<link>http://blog.tanelpoder.com/2010/04/21/kgh-no-access-buffer-cache-inside-streams-pool-too/</link>
		<comments>http://blog.tanelpoder.com/2010/04/21/kgh-no-access-buffer-cache-inside-streams-pool-too/#comments</comments>
		<pubDate>Wed, 21 Apr 2010 19:21:44 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Oracle 11gR2]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=669</guid>
		<description><![CDATA[Some time ago I wrote that since Oracle 10.2, some of the buffer cache can physically reside within shared pool granules. I just noticed this in an 11.2 instance: SQL&#62; select * from v$sgastat where name like &#8216;KGH%&#8217;; POOL         NAME                       [...]]]></description>
			<content:encoded><![CDATA[<p>Some time ago I <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOS8wOS8wOS9rZ2gtbm8tYWNjZXNzLWFsbG9jYXRpb25zLWluLXZzZ2FzdGF0LWJ1ZmZlci1jYWNoZS13aXRoaW4tc2hhcmVkLXBvb2wv">wrote</a> that since Oracle 10.2, some of the buffer cache can physically reside within <strong>shared pool</strong> granules.</p>
<p>I just noticed this in an 11.2 instance:</p>
<div><span style="font-family: Consolas, Monaco, 'Courier New', Courier, monospace; line-height: 18px; font-size: 12px; white-space: pre;">SQL&gt; select * from v$sgastat where name like 'KGH%';</span></div>
<div>
<pre>POOL         NAME                            BYTES
------------ -------------------------- ----------
streams pool KGH: NO ACCESS                4186144</pre>
</div>
<div>So it looks like the streams pool can also surrender parts of its memory granules to the buffer cache if it&#8217;s unable to flush everything out of a granule for a complete granule handover.</div>
<div>Let&#8217;s see whether that&#8217;s the case:</div>
<div>
<pre>

SQL&gt; select last_oper_type, last_oper_mode from v$sga_dynamic_components where component = 'streams pool';
LAST_OPER_TYP LAST_OPER
------------- ---------
SHRINK        DEFERRED</pre>
</div>
<div>Yep, the last streams pool shrink operation was left in DEFERRED status, which means the granule wasn&#8217;t handed over &#8211; the streams pool kept the granule for itself, marked everything it could flush out as KGH: NO ACCESS in its heap header and handed these chunks over to the buffer cache manager.</div>
<div>Let&#8217;s check whether these chunks are actually used by buffer cache buffers:</div>
<div>(NB! Think twice before running this query in production, as it may hold your shared pool latches for a very long time):</div>
<div>
<pre>SQL&gt; select ksmchidx,ksmchdur,ksmchcom,ksmchptr,ksmchsiz,ksmchcls from x$ksmsst where ksmchcom = 'KGH: NO ACCESS';
  KSMCHIDX   KSMCHDUR KSMCHCOM         KSMCHPTR           KSMCHSIZ KSMCHCLS
---------- ---------- ---------------- ---------------- ---------- --------
         1          4 KGH: NO ACCESS   00000003A2401FE0    4186144 no acce</pre>
</div>
<div>So, there&#8217;s only one chunk flushed &amp; handed over to the buffer cache in the streams pool heap. The KSMCHPTR column shows the starting address of that chunk in the SGA and KSMCHSIZ is the size of that chunk.</div>
<div>So, let&#8217;s see if there are any buffers within that address range. First I&#8217;ll calculate the end address of that chunk (end address = start address + size - 1):</div>
<div>
<pre>

SQL&gt; @calc 0x00000003A2401FE0 + 4186143
                     DEC                  HEX
------------------------ --------------------
         15611199487.000            3A27FFFFF</pre>
</div>
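<div>If you don&#8217;t have a calculator script at hand, the same arithmetic can be reproduced anywhere. A minimal Python sketch, using only the KSMCHPTR and KSMCHSIZ values shown above (the variable names are mine):</div>

```python
# Values copied from the KGH: NO ACCESS chunk listed above.
chunk_start = 0x00000003A2401FE0   # KSMCHPTR
chunk_size = 4186144               # KSMCHSIZ, in bytes

# The address range is inclusive, so the last byte sits at start + size - 1.
chunk_end = chunk_start + chunk_size - 1

print(chunk_end)             # decimal form
print(f"{chunk_end:016X}")   # zero-padded to 16 hex digits, like KSMCHPTR
```

<div>This prints 15611199487 and 00000003A27FFFFF, matching the output above.</div>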
<div>And now let&#8217;s query X$BH using that address range to see whether (and how many) buffer cache buffers have been placed in there:</div>
<div>
<pre>

SQL&gt; select count(*) from x$bh where rawtohex(ba) between '00000003A2401FE0' and '00000003A27FFFFF';

  COUNT(*)
----------
       483</pre>
</div>
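<div>A side note on the BETWEEN above: it compares hex <em>strings</em>, not numbers. That works here as long as both sides are upper-case, zero-padded to the same width and compared byte by byte. The Python sketch below illustrates the idea; the extra addresses are hypothetical, and I am assuming that RAWTOHEX(BA) gives 16 upper-case hex digits and that the session uses binary string comparison:</div>

```python
addresses = [
    0x00000003A2401FE0,  # chunk start (KSMCHPTR)
    0x00000003A27FFFFF,  # chunk end
    0x00000003A2402000,  # hypothetical buffer inside the chunk
    0x00000003A2800000,  # hypothetical buffer just past the chunk
    0x00000000FFFFFFFF,  # hypothetical lower address with fewer significant digits
]

def fixed_hex(address, width=16):
    """Upper-case hex, zero-padded to a fixed width."""
    return f"{address:0{width}X}"

# Fixed-width strings sort exactly like the numbers they represent.
numeric_order = [fixed_hex(a) for a in sorted(addresses)]
string_order = sorted(fixed_hex(a) for a in addresses)
print(numeric_order == string_order)

# Without the zero padding the string order no longer matches the numeric order.
unpadded_numeric = [f"{a:X}" for a in sorted(addresses)]
unpadded_string = sorted(f"{a:X}" for a in addresses)
print(unpadded_numeric == unpadded_string)
```

<div>The first comparison prints True and the second prints False, which is why the literals in the query are written with all their leading zeros.</div>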
<div>We have just proven that there are 483 buffer cache buffers (~3.8MB, 8kB buffers) which physically reside in the streams pool heap!</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/04/21/kgh-no-access-buffer-cache-inside-streams-pool-too/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Non-trivial performance problems</title>
		<link>http://blog.tanelpoder.com/2010/04/03/non-trivial-performance-problems/</link>
		<comments>http://blog.tanelpoder.com/2010/04/03/non-trivial-performance-problems/#comments</comments>
		<pubDate>Sat, 03 Apr 2010 10:07:26 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Administration]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[Tuning]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=665</guid>
		<description><![CDATA[Gwen Shapira has written an article about a good example of a non-trivial performance problem. I&#8217;m not talking about anything advanced here (such as bugs or problems arising at OS/Oracle touchpoint) but that sometimes the root cause of a problem (or at least the reason why you notice this problem now) is not something deeply [...]]]></description>
			<content:encoded><![CDATA[<p>Gwen Shapira has written an article about a good example of a <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3Byb2RsaWZlLndvcmRwcmVzcy5jb20vMjAxMC8wNC8wMi9kYXlsaWdodC1zYXZpbmctdGltZS1jYXVzZXMtcGVyZm9ybWFuY2UtaXNzdWVzLw==">non-trivial performance problem</a>.</p>
<p>I&#8217;m not talking about anything advanced here (such as bugs or problems arising at the OS/Oracle touchpoint); the point is that sometimes the root cause of a problem (or at least the reason why you notice the problem now) is not something deeply technical or related to some specific SQL optimizer feature or a configuration issue. Instead of focusing on the first symptom you see, it pays off to take a step back and see how the problem task/application/SQL is actually used by the users or client applications.</p>
<p>In other words, talk to the users, ask how exactly they experience the problem and then drill down from there.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/04/03/non-trivial-performance-problems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Oracle Latch Contention Troubleshooting</title>
		<link>http://blog.tanelpoder.com/2010/03/27/oracle-latch-contention-troubleshooting/</link>
		<comments>http://blog.tanelpoder.com/2010/03/27/oracle-latch-contention-troubleshooting/#comments</comments>
		<pubDate>Sun, 28 Mar 2010 04:46:08 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[Tuning]]></category>
		<category><![CDATA[contention]]></category>
		<category><![CDATA[latch]]></category>
		<category><![CDATA[method]]></category>
		<category><![CDATA[scripts]]></category>
		<category><![CDATA[systematic]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=658</guid>
		<description><![CDATA[I wrote a latch contention troubleshooting article for IOUG Select journal last year (it was published earlier this year). I have uploaded this to tech.E2SN too, I recommend you to read it if you want to become systematic about latch contention troubleshooting: http://tech.e2sn.com/oracle/troubleshooting I&#8217;m working on getting the commenting &#038; feedback work at tech.E2SN site [...]]]></description>
			<content:encoded><![CDATA[<p>I wrote a latch contention troubleshooting article for IOUG Select journal last year (it was published earlier this year). I have uploaded this to tech.E2SN too, I recommend you to read it if you want to become systematic about latch contention troubleshooting:</p>
<p><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlL3Ryb3VibGVzaG9vdGluZw==">http://tech.e2sn.com/oracle/troubleshooting</a></p>
<p>I&#8217;m working on getting commenting &#038; feedback to work on the tech.E2SN site too, but for now you can comment here on this blog entry&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/03/27/oracle-latch-contention-troubleshooting/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Session Snapper v3.11 &#8211; bugfix update &#8211; now ASH report works properly on Oracle 10.1 too</title>
		<link>http://blog.tanelpoder.com/2010/03/27/session-snapper-v3-11-bugfix-update-now-ash-report-works-properly-on-oracle-10-1-too/</link>
		<comments>http://blog.tanelpoder.com/2010/03/27/session-snapper-v3-11-bugfix-update-now-ash-report-works-properly-on-oracle-10-1-too/#comments</comments>
		<pubDate>Sat, 27 Mar 2010 15:40:34 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[Tuning]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[scripts]]></category>
		<category><![CDATA[snapper]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=655</guid>
		<description><![CDATA[This is an updated version of Snapper, which works ok on Oracle 10.1 now as well (9i support is coming some time in the future :) Thanks to Jamey Johnston for sending me the fix info (and saving me some time that way :) So if you have some problems with Snapper on Oracle 10.1, [...]]]></description>
			<content:encoded><![CDATA[<p>This is an updated version of Snapper, which works ok on Oracle 10.1 now as well (9i support is coming some time in the future :)</p>
<p>Thanks to Jamey Johnston for sending me the fix info (and saving me some time that way :)</p>
<p>So if you have some problems with Snapper on Oracle 10.1, please make sure you have the latest version v3.11, which you can get from here:</p>
<p><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlLXNjcmlwdHMtYW5kLXRvb2xzL3Nlc3Npb24tc25hcHBlcg==">http://tech.e2sn.com/oracle-scripts-and-tools/session-snapper</a></p>
<p>The output below is from Snapper 3.11 on Oracle 10.1.0.5. The ASH columns in the bottom part of the output are now displayed correctly:</p>
<pre>SQL&gt; @snapper ash,ash1,ash2,ash3,stats,gather=t 15 1 all
Sampling with interval 15 seconds, 1 times...

-- Session Snapper v3.11 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

----------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,         DELTA, HDELTA/SEC,    %TIME, GRAPH
----------------------------------------------------------------------------------------------------------------------
     52, SYSTEM    , TIME, PL/SQL execution elapsed time           ,         53968,      3.6ms,      .4%, |          |
     52, SYSTEM    , TIME, DB CPU                                  ,         10000,   666.67us,      .1%, |          |
     52, SYSTEM    , TIME, sql execute elapsed time                ,        118225,     7.88ms,      .8%, |@         |
     52, SYSTEM    , TIME, DB time                                 ,        118632,     7.91ms,      .8%, |@         |
     54, SYSTEM    , TIME, hard parse elapsed time                 ,        289905,    19.33ms,     1.9%, |@         |
     54, SYSTEM    , TIME, parse time elapsed                      ,        528034,     35.2ms,     3.5%, |@         |
     54, SYSTEM    , TIME, PL/SQL execution elapsed time           ,       5010579,   334.04ms,    33.4%, |@@@@      |
     54, SYSTEM    , TIME, DB CPU                                  ,      10660000,   710.67ms,    71.1%, |@@@@@@@@  |
     54, SYSTEM    , TIME, sql execute elapsed time                ,      12920952,    861.4ms,    86.1%, |@@@@@@@@@ |
     54, SYSTEM    , TIME, DB time                                 ,      12937606,   862.51ms,    86.3%, |@@@@@@@@@ |
     54, SYSTEM    , TIME, sequence load elapsed time              ,          1079,    71.93us,      .0%, |          |
     56, (MMNL)    , TIME, background cpu time                     ,           940,    62.67us,      .0%, |          |
     56, (MMNL)    , TIME, background elapsed time                 ,           940,    62.67us,      .0%, |          |
     58, (MMON)    , TIME, background cpu time                     ,           158,    10.53us,      .0%, |          |
     58, (MMON)    , TIME, background elapsed time                 ,           158,    10.53us,      .0%, |          |
     64, (RBAL)    , TIME, background cpu time                     ,            86,     5.73us,      .0%, |          |
     64, (RBAL)    , TIME, background elapsed time                 ,            86,     5.73us,      .0%, |          |
     68, (CJQ0)    , TIME, background cpu time                     ,           820,    54.67us,      .0%, |          |
     68, (CJQ0)    , TIME, background elapsed time                 ,           820,    54.67us,      .0%, |          |
     70, (SMON)    , TIME, background cpu time                     ,           141,      9.4us,      .0%, |          |
     70, (SMON)    , TIME, background elapsed time                 ,           141,      9.4us,      .0%, |          |
     71, (CKPT)    , TIME, background cpu time                     ,         14515,   967.67us,      .1%, |          |
     71, (CKPT)    , TIME, background elapsed time                 ,         14515,   967.67us,      .1%, |          |
     72, (LGWR)    , TIME, background cpu time                     ,       1530000,      102ms,    10.2%, |@         |
     72, (LGWR)    , TIME, background elapsed time                 ,       1954778,   130.32ms,    13.0%, |@@        |
     73, (DBW0)    , TIME, background cpu time                     ,         10000,   666.67us,      .1%, |          |
     73, (DBW0)    , TIME, background elapsed time                 ,        268787,    17.92ms,     1.8%, |@         |
     74, (MMAN)    , TIME, background cpu time                     ,           141,      9.4us,      .0%, |          |
     74, (MMAN)    , TIME, background elapsed time                 ,           141,      9.4us,      .0%, |          |
     75, (PMON)    , TIME, background cpu time                     ,          1636,   109.07us,      .0%, |          |
     75, (PMON)    , TIME, background elapsed time                 ,          1636,   109.07us,      .0%, |          |
--  End of Stats snap 1, end=2010-03-27 16:37:13, seconds=15

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
    61% | 6d0z2j01c8ytc   | ON CPU                    | ON CPU
    22% |                 | log file parallel write   | System I/O
     7% | 6d0z2j01c8ytc   | db file sequential read   | User I/O
     3% | 0zkt25f36kbzd   | ON CPU                    | ON CPU
     3% |                 | db file parallel write    | System I/O
     2% | g1xapjmt4vm5c   | ON CPU                    | ON CPU
     2% |                 | ON CPU                    | ON CPU
     2% | gaxwgwd72b3pn   | ON CPU                    | ON CPU
     1% | 4ftbahd08ab2a   | ON CPU                    | ON CPU
     1% | c69wrxcndxuzw   | ON CPU                    | ON CPU

-----------------------------------------------------
Active% | EVENT                     | WAIT_CLASS
-----------------------------------------------------
    76% | ON CPU                    | ON CPU
    22% | log file parallel write   | System I/O
     9% | db file sequential read   | User I/O
     3% | db file parallel write    | System I/O
     3% | db file scattered read    | User I/O
     1% | direct path write temp    | User I/O

----------------------------------
Active% |    SID | SQL_ID
----------------------------------
    69% |     54 | 6d0z2j01c8ytc
    23% |     72 |
     3% |     54 | 0zkt25f36kbzd
     3% |     73 |
     3% |     54 | 8qs4shjvhk2w4
     2% |     54 | g1xapjmt4vm5c
     2% |     54 | gaxwgwd72b3pn
     1% |     54 | 3w6304ztrww4h
     1% |     54 | b86h705svfmjz
     1% |     54 | drppqann6dwfa

---------------------------------------------------
Active% | PLSQL_OBJE | PLSQL_SUBP | SQL_ID
---------------------------------------------------
    69% | N/A        | N/A        | 6d0z2j01c8ytc
    27% | N/A        | N/A        |
     3% | N/A        | N/A        | 0zkt25f36kbzd
     3% | N/A        | N/A        | 8qs4shjvhk2w4
     2% | N/A        | N/A        | g1xapjmt4vm5c
     2% | N/A        | N/A        | gaxwgwd72b3pn
     1% | N/A        | N/A        | 3w6304ztrww4h
     1% | N/A        | N/A        | b86h705svfmjz
     1% | N/A        | N/A        | drppqann6dwfa
     1% | N/A        | N/A        | c69wrxcndxuzw

--  End of ASH snap 1, end=2010-03-27 16:37:13, seconds=15, samples_taken=96

PL/SQL procedure successfully completed.

SQL&gt;</pre>
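<p>If you wonder how the HDELTA/SEC and %TIME columns relate to DELTA, the numbers above are consistent with simple arithmetic over the snapshot length. The Python sketch below is my reading of the output, not Snapper&#8217;s actual code: it assumes DELTA for TIME statistics is in microseconds and that %TIME is the delta divided by the wall-clock microseconds in the 15-second snapshot. The function names are made up for this example:</p>

```python
SNAPSHOT_SECONDS = 15  # from the "seconds=15" line in the output above

def per_second_us(delta_us, seconds=SNAPSHOT_SECONDS):
    """Microseconds of the statistic accumulated per second of wall-clock time."""
    return delta_us / seconds

def pct_of_wallclock(delta_us, seconds=SNAPSHOT_SECONDS):
    """Share of the snapshot's wall-clock time covered by the statistic, in percent."""
    return 100.0 * delta_us / (seconds * 1_000_000)

# DELTA values copied from a few rows of the output above.
rows = {
    "SID 52 PL/SQL execution elapsed time": 53968,
    "SID 54 DB CPU": 10660000,
    "SID 54 DB time": 12937606,
    "SID 72 background elapsed time": 1954778,
}

for name, delta in rows.items():
    ms_per_sec = per_second_us(delta) / 1000
    print(f"{name}: {ms_per_sec:.2f} ms/s, {pct_of_wallclock(delta):.1f}%")
```

<p>For SID 54 this gives 710.67 ms/s and 71.1% for DB CPU, and 862.51 ms/s and 86.3% for DB time, the same figures Snapper printed.</p>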
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/03/27/session-snapper-v3-11-bugfix-update-now-ash-report-works-properly-on-oracle-10-1-too/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Oracle Session Snapper v3.10</title>
		<link>http://blog.tanelpoder.com/2010/03/22/oracle-session-snapper-v3-10/</link>
		<comments>http://blog.tanelpoder.com/2010/03/22/oracle-session-snapper-v3-10/#comments</comments>
		<pubDate>Mon, 22 Mar 2010 16:35:50 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[Tuning]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=639</guid>
		<description><![CDATA[Hi all, long time no see!  =8-) Now as I&#8217;m done with the awesome Hotsos Symposium (and the training day which I delivered) and have got some rest, I&#8217;ll start publishing some of the cool things I&#8217;ve been working on over the past half a year or so. The first is Oracle Session Snapper version [...]]]></description>
			<content:encoded><![CDATA[<p>Hi all, long time no see!  =8-)</p>
<p>Now that I&#8217;m done with the awesome Hotsos Symposium (and the training day which I delivered) and have got some rest, I&#8217;ll start publishing some of the cool things I&#8217;ve been working on over the past half a year or so.</p>
<p>The first is Oracle Session Snapper version 3!</p>
<p>There are some major improvements in Snapper 3, like ASH style session activity sampling!</p>
<p>When you troubleshoot a session&#8217;s performance (or instance performance), the main things you want to know first are very, very simple:</p>
<ol>
<li>Which SQL statements are being executed?</li>
<li>What are they doing: working on CPU or waiting?</li>
<li>If waiting, then for what?</li>
</ol>
<p>Often this is enough for troubleshooting what&#8217;s wrong. For example, if a session is waiting for a lock, then the wait interface will show you that. If a single SQL statement is taking 99% of total response time, the V$SESSION (ASH style) samples will point out the problem SQL, and so on. Simple stuff.</p>
<p>However, there are cases where you need to go beyond the wait interface and use V$SESSTAT (and other) counters and even take a &#8220;screwdriver&#8221; and open Oracle up from the outside by stack tracing :-)</p>
<p>When I wrote the first version of Snapper for my own use some 4-5 years ago, I wrote it mainly with the &#8220;beyond wait interface&#8221; part in mind. So I focused on V$SESSTAT and various other counters and left the basic troubleshooting to other tools. To get a rough overview of what a session was doing, I used to manually sample V$SESSION/V$SESSION_WAIT a few times in a row or use some other special-purpose scripts.</p>
<p>However, after Snapper got more popular and I started getting some feedback about it, I saw the need for covering more with Snapper: not just the &#8220;beyond wait interface&#8221; part, but also the &#8220;wait interface&#8221; and &#8220;which SQL&#8221; parts.</p>
<p>So, now I&#8217;m presenting Snapper v3, which covers all 3 points above using ASH style V$SESSION sampling. It still has the first step of the &#8220;beyond wait interface&#8221; part in it, which is very useful for advanced performance troubleshooting and diagnosis &#8211; I&#8217;m talking about the V$SESSTAT counters mentioned above.</p>
<p>I&#8217;ve made some syntax changes in Snapper too, and right now v3 doesn&#8217;t work on Oracle 9.2 (it will work some day :)</p>
<p>To give you an idea of the new ASH style sampling capabilities, here&#8217;s some example output:</p>
<pre>SQL&gt; @snapper ash=sid+event+wait_class,ash1=plsql_object_id+plsql_subprogram_id+sql_id,ash2=program+module+action 5 1 all
Sampling...

-- Session Snapper v3.10 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

--------------------------------------------------------------
Active% |    SID | EVENT                     | WAIT_CLASS
--------------------------------------------------------------
   100% |    133 | db file scattered read    | User I/O
     5% |    165 | control file parallel wri | System I/O
     2% |    162 | ON CPU                    | ON CPU
     2% |    167 | db file parallel write    | System I/O
     2% |    166 | log file parallel write   | System I/O

---------------------------------------------------
Active% | PLSQL_OBJE | PLSQL_SUBP | SQL_ID
---------------------------------------------------
    77% |            |            | a5xyjp9gt796s
    23% |            |            | 4g4u44bk830ms
    12% |            |            |

-------------------------------------------------------------------------------------------
Active% | PROGRAM                   | MODULE                    | ACTION
-------------------------------------------------------------------------------------------
   100% | sqlplus@mac01 (TNS V1-V3) | sqlplus@mac01 (TNS V1-V3) |
     5% | oracle@solaris02 (CKPT)   |                           |
     2% | oracle@solaris02 (DBW0)   |                           |
     2% | oracle@solaris02 (CJQ0)   |                           |
     2% | oracle@solaris02 (LGWR)   |                           |

--  End of ASH snap 1, end=2010-03-22 17:35:06, seconds=5, samples_taken=43
</pre>
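<p>A quick way to read the Active% column: the figures above are consistent with &#8220;number of samples in which an active session matched the group, divided by samples_taken&#8221;. The Python sketch below illustrates that reading. The per-group sample counts are hypothetical values I picked so that they reproduce the SQL_ID table, and the rounding is an assumption, so treat it as an illustration rather than Snapper&#8217;s actual logic:</p>

```python
SAMPLES_TAKEN = 43  # from "samples_taken=43" in the output above

# Hypothetical counts: in how many samples an active session was seen with each
# SQL_ID. These were picked to match the table above, not taken from Snapper.
active_samples = {
    "a5xyjp9gt796s": 33,
    "4g4u44bk830ms": 10,
    "(no SQL_ID)": 5,
}

def active_pct(hits, samples=SAMPLES_TAKEN):
    """Matching samples as a whole-number percentage of all samples taken."""
    return round(100 * hits / samples)

for sql_id, hits in active_samples.items():
    print(f"{active_pct(hits):6d}% | {sql_id}")

total_pct = sum(active_pct(hits) for hits in active_samples.values())
print(f"total: {total_pct}%")
```

<p>The percentages come out as 77%, 23% and 12%. Their total is above 100% because more than one session can be active in the same sample.</p>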
<p>You can read some usage examples and download it here:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlLXNjcmlwdHMtYW5kLXRvb2xzL3Nlc3Npb24tc25hcHBlcg==" target="_blank">http://tech.e2sn.com/oracle-scripts-and-tools/session-snapper</a></li>
</ul>
<p>P.S. If you attended the Hotsos Symposium Training Day where I demoed the initial version of Snapper v3, download the new version (v3.10) from the link above &#8211; it&#8217;s much more flexible than the one I demoed a couple of weeks ago!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/03/22/oracle-session-snapper-v3-10/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>
