<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tanel Poder&#039;s blog: IT &#38; Mobile for Geeks and Pros &#187; Search Results  &#187;  latchprof++latchprofx</title>
	<atom:link href="http://blog.tanelpoder.com/search/latchprof++latchprofx/feed/rss2/" rel="self" type="application/rss+xml" />
	<link>http://blog.tanelpoder.com</link>
	<description>Oracle, Exadata, Performance, Troubleshooting - Mobile Life and Productivity.</description>
	<lastBuildDate>Thu, 02 Feb 2012 21:38:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>IOUG Select Journal Editor&#8217;s Choice Award 2011</title>
		<link>http://blog.tanelpoder.com/2011/06/29/ioug-select-journal-editors-choice-award-2011/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=ioug-select-journal-editors-choice-award-2011</link>
		<comments>http://blog.tanelpoder.com/2011/06/29/ioug-select-journal-editors-choice-award-2011/#comments</comments>
		<pubDate>Wed, 29 Jun 2011 10:23:18 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[Tuning]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/2011/06/29/ioug-select-journal-editors-choice-award-2011/</guid>
		<description><![CDATA[In May I received the IOUG Select Journal Editor&#8217;s Choice Award for my Systematic Oracle Latch Contention Troubleshooting article where I introduced my LatchProfX tool for advanced drilldown into complex latch contention problems (thanks IOUG and John Kanagaraj!). As the relevant IOUG webpage hasn&#8217;t been updated yet, I thought to delay this announcement until the [...]]]></description>
			<content:encoded><![CDATA[<p>In May I received the <strong>IOUG Select Journal Editor&#8217;s Choice Award</strong> for my <strong>Systematic Oracle Latch Contention Troubleshooting</strong> article where I introduced my LatchProfX tool for advanced drilldown into complex latch contention problems (thanks IOUG and <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy5saW5rZWRpbi5jb20vaW4vam9obmthbmFnYXJhag==" target=\"_blank\">John Kanagaraj</a>!).</p>
<p>As the relevant IOUG webpage hasn&#8217;t been updated yet, I thought to delay this announcement until the update was done &#8211; but I just found an official enough announcement (press release) by accident from Reuters site:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy5yZXV0ZXJzLmNvbS9hcnRpY2xlLzIwMTEvMDUvMTIvaWRVUzI3OTUxOCsxMi1NYXktMjAxMStCVzIwMTEwNTEy" target=\"_blank\">http://www.reuters.com/article/2011/05/12/idUS279518+12-May-2011+BW20110512</a></li>
</ul>
<p>Woo-hoo! :-)</p>
<p>The article itself is here:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlL3Ryb3VibGVzaG9vdGluZy9sYXRjaC1jb250ZW50aW9uLXRyb3VibGVzaG9vdGluZw==" target=\"_blank\">http://tech.e2sn.com/oracle/troubleshooting/latch-contention-troubleshooting</a></li>
</ul>
<p>Thanks to IOUG crew, John Kanagaraj and everyone else who has read, used my stuff and given feedback! :-)</p>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=1198" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2011/06/29/ioug-select-journal-editors-choice-award-2011/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Cache buffers chains latch contention troubleshooting using latchprofx.sql example</title>
		<link>http://blog.tanelpoder.com/2011/06/05/cache-buffers-chains-latch-contention-using-latchprofx-sql-example/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=cache-buffers-chains-latch-contention-using-latchprofx-sql-example</link>
		<comments>http://blog.tanelpoder.com/2011/06/05/cache-buffers-chains-latch-contention-using-latchprofx-sql-example/#comments</comments>
		<pubDate>Sun, 05 Jun 2011 12:16:26 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/2011/06/05/cache-buffers-chains-latch-contention-using-latchprofx-sql-example/</guid>
		<description><![CDATA[Laurent Demaret has written a good article about how he systematically troubleshooted cache buffers chains latch contention, starting from wait interface and drilling down into details with my latchprofx tool: http://www.cernatis.com/index.php/oracle-databases/performancetuning/107-fix-high-qlatch-cache-buffers-chainsq-waits A common cause for cache buffers chains latch contention is that some blocks are visited and re-visited way too much by a query execution. [...]]]></description>
			<content:encoded><![CDATA[<p>Laurent Demaret has written a good article about how he systematically troubleshooted cache buffers chains latch contention, starting from wait interface and drilling down into details with my latchprofx tool:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy5jZXJuYXRpcy5jb20vaW5kZXgucGhwL29yYWNsZS1kYXRhYmFzZXMvcGVyZm9ybWFuY2V0dW5pbmcvMTA3LWZpeC1oaWdoLXFsYXRjaC1jYWNoZS1idWZmZXJzLWNoYWluc3Etd2FpdHM=" target=\"_blank\">http://www.cernatis.com/index.php/oracle-databases/performancetuning/107-fix-high-qlatch-cache-buffers-chainsq-waits</a></li>
</ul>
<p>A common cause for cache buffers chains latch contention is that some blocks are visited and re-visited way too much by a query execution. This usually happens due to nested loops joins or FILTER loops retrieving many rows from their outer (driving) row sources and then visiting the inner row-source again for each row from driving row source. Once you manage to fix your execution plan (perhaps by getting a hash join instead of the loop), then the blocks will not be re-visited so much and the latches will be hammered much less too.</p>
<p>The moral of the story is that if you have latch contention in a modern Oracle database, you don&#8217;t need to start tweaking undocumented latching parameters, but reduce the latch usage instead. And Laurent has done a good job with systematically identifying the SQL that needs to be fixed.</p>
<p>Good stuff!</p>
<p>If you don&#8217;t know what LatchProfX is, read this:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlL3Ryb3VibGVzaG9vdGluZy9sYXRjaC1jb250ZW50aW9uLXRyb3VibGVzaG9vdGluZw==" target=\"_blank\">http://tech.e2sn.com/oracle/troubleshooting/latch-contention-troubleshooting</a></li>
</ul>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=984" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2011/06/05/cache-buffers-chains-latch-contention-using-latchprofx-sql-example/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>An index of my TPT scripts</title>
		<link>http://blog.tanelpoder.com/2011/03/24/an-index-of-my-tpt-scripts/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=an-index-of-my-tpt-scripts</link>
		<comments>http://blog.tanelpoder.com/2011/03/24/an-index-of-my-tpt-scripts/#comments</comments>
		<pubDate>Thu, 24 Mar 2011 14:10:05 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Productivity]]></category>
		<category><![CDATA[Administration]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[Tuning]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/2011/03/24/an-index-of-my-tpt-scripts/</guid>
		<description><![CDATA[A lot of people have asked me whether there&#8217;s some sort of index or &#8220;table of contents&#8221; of my TPT scripts (there&#8217;s over 500 scripts in the tpt_public.zip file &#8211; http://tech.e2sn.com/oracle-scripts-and-tools ) I have planned to create such index for years, but never got to it. I probably never will :) So a good way [...]]]></description>
			<content:encoded><![CDATA[<p>A lot of people have asked me whether there&#8217;s some sort of index or &#8220;table of contents&#8221; of my TPT scripts (there&#8217;s over 500 scripts in the tpt_public.zip file &#8211; <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RlY2guZTJzbi5jb20vb3JhY2xlLXNjcmlwdHMtYW5kLXRvb2xz" target=\"_blank\">http://tech.e2sn.com/oracle-scripts-and-tools</a> )</p>
<p>I have planned to create such index for years, but never got to it. I probably never will :) So a good way to extract the descriptions of some scripts is this (run the command in the directory where you extracted my scripts to):</p>
<p><em><strong>Note:</strong> the single and double-quotes may get messed up when the browser tries to be smart and replace them with nicer looking characters (which Unix doesn&#8217;t recognize then). When copying &amp; pasting this command, make sure that the single &amp; double-quotes are the regular ones Unix shell can accept):</em></p>
<div><span style="font-family: 'courier new', monospace;">$<span class="Apple-converted-space"> </span><strong>grep -i Purpose: *.sql | awk -F: &#8216;{ printf(&#8220;%20s %-50s\n&#8221;, $1, $3) }&#8217;</strong></span></div>
<div><span style="font-family: 'courier new', monospace;">            bhla.sql      Report which blocks are in buffer cache, protected by a cache</span></div>
<div><span style="font-family: 'courier new', monospace;">         bufprof.sql      Display buffer gets done by a session and their reason</span></div>
<div><span style="font-family: 'courier new', monospace;">            calc.sql      Basic calculator and dec/hex converter       </span></div>
<div><span style="font-family: 'courier new', monospace;">        channels.sql      Report KSR channel message counts by channel endpoints</span></div>
<div><span style="font-family: 'courier new', monospace;">        curheaps.sql      Show main cursor data block heap sizes and their contents</span></div>
<div><span style="font-family: 'courier new', monospace;">             dba.sql      Convert Data Block Address (a 6 byte hex number) to file#, block#</span></div>
<div><span style="font-family: 'courier new', monospace;">             ddl.sql      Extracts DDL statements for specified objects</span></div>
<div><span style="font-family: 'courier new', monospace;">              df.sql  Show Oracle tablespace free space in Unix df style</span></div>
<div><span style="font-family: 'courier new', monospace;">        diag_sid.sql      Display current Session Wait info            </span></div>
<div><span style="font-family: 'courier new', monospace;">        diag_sid.sql      An easy to use Oracle session-level performance snapshot utility</span></div>
<div><span style="font-family: 'courier new', monospace;">           disco.sql      Generates commands for disconnecting selected sessions</span></div>
<div><span style="font-family: 'courier new', monospace;">     getplusparm.sql      get sqlplus parameter value (such linesize, pagesize, sqlcode,</span></div>
<div><span style="font-family: 'courier new', monospace;">            grpn.sql      Quick group by query for aggregating Numeric columns</span></div>
<div><span style="font-family: 'courier new', monospace;">            hash.sql      Show the hash value, SQL_ID and child number of previously</span></div>
<div><span style="font-family: 'courier new', monospace;">             i2h.sql      Advanced Oracle Troubleshooting Seminar demo script</span></div>
<div><span style="font-family: 'courier new', monospace;">              im.sql      Display In-Memory Undo (IMU) buffer usage    </span></div>
<div><span style="font-family: 'courier new', monospace;">            init.sql<span class="Apple-converted-space"> </span>Initializes sqlplus variables for 156 character terminal</span></div>
<div><span style="font-family: 'courier new', monospace;">       kglbroken.sql                                                   </span></div>
<div><span style="font-family: 'courier new', monospace;">       kglbroken.sql      Report broken kgl locks for an object this can be used for </span></div>
<div><span style="font-family: 'courier new', monospace;">            kill.sql      Generates commands for killing selected sessions</span></div>
<div><span style="font-family: 'courier new', monospace;">              la.sql      Show which latch occupies a given memory address and its stats</span></div>
<div><span style="font-family: 'courier new', monospace;">     lastchanged.sql      Detect when a datablock in table was last changed</span></div>
<div><span style="font-family: 'courier new', monospace;">       latchprof.sql      Perform high-frequency sampling on V$LATCHHOLDER</span></div>
<div><span style="font-family: 'courier new', monospace;">   latchprof_old.sql      Perform high-frequency sampling on V$LATCHHOLDER</span></div>
<div><span style="font-family: 'courier new', monospace;">      latchprofx.sql      Perform high-frequency sampling on V$LATCHHOLDER</span></div>
<div><span style="font-family: 'courier new', monospace;">              lh.sql      Show latch holding SIDs and latch details from V$LATCHHOLDER</span></div>
<div><span style="font-family: 'courier new', monospace;">             lhp.sql      Perform high-frequency sampling on V$LATCHHOLDER</span></div>
<div><span style="font-family: 'courier new', monospace;">            lhpx.sql      Perform high-frequency sampling on V$LATCHHOLDER</span></div>
<div><span style="font-family: 'courier new', monospace;">     lotshparses.sql      Generate Lots of hard parses and shared pool activity </span></div>
<div><span style="font-family: 'courier new', monospace;">    lotshparses2.sql      Generate Lots of hard parses and shared pool activity </span></div>
<div><span style="font-family: 'courier new', monospace;">        lotslios.sql      Generate Lots of Logical IOs for testing purposes</span></div>
<div><span style="font-family: 'courier new', monospace;">        lotspios.sql      Generate Lots of Physical IOs for testing purposes</span></div>
<div><span style="font-family: 'courier new', monospace;">     lotssparses.sql      Generate Lots of soft parses and library cache/mutex activity </span></div>
<div><span style="font-family: 'courier new', monospace;">    lotssparses2.sql      Generate Lots of soft parses and library cache/mutex activity </span></div>
<div><span style="font-family: 'courier new', monospace;">       mutexprof.sql      Display KGX mutex sleep history from v$mutex_sleep_history</span></div>
<div><span style="font-family: 'courier new', monospace;">       nonshared.sql      Print reasons for non-shared child cursors from v$sql_shared_cursor</span></div>
<div><span style="font-family: 'courier new', monospace;">      nonshared2.sql      Show the reasons why more child cursors were created instead of</span></div>
<div><span style="font-family: 'courier new', monospace;">      ostackprof.sql      Take target process stack samples and show an execution profile</span></div>
<div><span style="font-family: 'courier new', monospace;">            pmem.sql      Show process memory usage breakdown &#8211; lookup by process SPID</span></div>
<div><span style="font-family: 'courier new', monospace;">     pmem_detail.sql      Show process memory usage breakdown details &#8211; lookup by process SPID</span></div>
<div><span style="font-family: 'courier new', monospace;">        prefetch.sql      Show KCB layer prefetch                      </span></div>
<div><span style="font-family: 'courier new', monospace;">          pvalid.sql      Show valid parameter values from V$PARAMETER_VALID_VALUES</span></div>
<div><span style="font-family: 'courier new', monospace;">        rowcache.sql      Show parent rowcache entries mathcing an object name</span></div>
<div><span style="font-family: 'courier new', monospace;">              rs.sql      Display available Redo Strands               </span></div>
<div><span style="font-family: 'courier new', monospace;">               s.sql      Display current Session Wait and SQL_ID info (10g+)</span></div>
<div><span style="font-family: 'courier new', monospace;">          sample.sql      Sample any V$ view or X$ table and display aggregated results</span></div>
<div><span style="font-family: 'courier new', monospace;">      sampleaddr.sql      High-frequency sampling of contents of a SGA memory address</span></div>
<div><span style="font-family: 'courier new', monospace;">             ses.sql      Display Session statistics for given sessions, filter by</span></div>
<div><span style="font-family: 'courier new', monospace;">            ses2.sql      Display Session statistics for given sessions, filter by</span></div>
<div><span style="font-family: 'courier new', monospace;">        sgastatx.sql      Show shared pool stats by sub-pool from X$KSMSS</span></div>
<div><span style="font-family: 'courier new', monospace;">            smem.sql      Show process memory usage breakdown &#8211; lookup by session ID</span></div>
<div><span style="font-family: 'courier new', monospace;">     smem_detail.sql      Show process memory usage breakdown details &#8211; lookup by session ID</span></div>
<div><span style="font-family: 'courier new', monospace;">         snapper.sql      An easy to use Oracle session-level performance snapshot utility</span></div>
<div><span style="font-family: 'courier new', monospace;">     snapper3.15.sql      An easy to use Oracle session-level performance snapshot utility</span></div>
<div><span style="font-family: 'courier new', monospace;">      snapper_v1.sql      An easy to use Oracle session-level performance snapshot utility</span></div>
<div><span style="font-family: 'courier new', monospace;">      snapper_v2.sql      An easy to use Oracle session-level performance snapshot utility</span></div>
<div><span style="font-family: 'courier new', monospace;">            stat.sql      Execute SQL statement in script argument and report basic</span></div>
<div><span style="font-family: 'courier new', monospace;">              sw.sql      Display current Session Wait info            </span></div>
<div><span style="font-family: 'courier new', monospace;">             sw2.sql      Display current Session Wait info            </span></div>
<div><span style="font-family: 'courier new', monospace;">             swg.sql      Display given Session Wait info grouped by state and event</span></div>
<div><span style="font-family: 'courier new', monospace;">             swo.sql      Display current Session Wait info            </span></div>
<div><span style="font-family: 'courier new', monospace;">          topsql.sql      Show TOP SQL ordered by user-provided criteria</span></div>
<div><span style="font-family: 'courier new', monospace;">            usql.sql      Show another session&#8217;s SQL directly from library cache</span></div>
<div><span style="font-family: 'courier new', monospace;">           usqlx.sql      Show another session&#8217;s SQL directly from library cache</span></div>
<div><span style="font-family: 'courier new', monospace;">        waitprof.sql      Sample V$SESSION_WAIT at high frequency and show resulting </span></div>
<div><span style="font-family: 'courier new', monospace;">              xb.sql      Explain a SQL statements execution plan with execution </span></div>
<div><span style="font-family: 'courier new', monospace;">            xde2.sql      Describe X$ tables, column offsets and report indexed fixed table</span></div>
<div><span style="font-family: 'courier new', monospace;">              xm.sql      Explain a SQL statements execution plan directly from library cache</span></div>
<div><span style="font-family: 'courier new', monospace;">             xma.sql      Explain a SQL statements execution plan directly from library cache</span></div>
<div><span style="font-family: 'courier new', monospace;">            xmai.sql      Explain a SQL statements execution plan with execution </span></div>
<div><span style="font-family: 'courier new', monospace;">             xms.sql      Explain your last SQL statements execution plan with execution </span></div>
<div><span style="font-family: 'courier new', monospace;">            xmsh.sql      Explain a SQL statements execution plan with execution </span></div>
<div><span style="font-family: 'courier new', monospace;">            xmsi.sql      Explain a SQL statements execution plan with execution </span></div>
<p>&nbsp;</p>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=896" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2011/03/24/an-index-of-my-tpt-scripts/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>New versions of LatchProf and LatchProfX for latch contention troubleshooting and tuning</title>
		<link>http://blog.tanelpoder.com/2010/02/15/new-versions-of-latchprof-and-latchprofx-for-latch-contention-troubleshooting-and-tuning/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=new-versions-of-latchprof-and-latchprofx-for-latch-contention-troubleshooting-and-tuning</link>
		<comments>http://blog.tanelpoder.com/2010/02/15/new-versions-of-latchprof-and-latchprofx-for-latch-contention-troubleshooting-and-tuning/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 03:01:51 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Administration]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=620</guid>
		<description><![CDATA[The LatchProf and LatchProfX scripts allow you to be more systematic with latch contention troubleshooting and tuning. No more guesswork is needed as these scripts give you exact session IDs and in this version also SQLIDs of the troublemaking applications. You can download the new versions here: LatchProf (reads V$ views) LatchProfX (reads X$ tables, [...]]]></description>
			<content:encoded><![CDATA[<p>The LatchProf and LatchProfX scripts allow you to be more systematic with latch contention troubleshooting and tuning. No more guesswork is needed as these scripts give you exact session IDs and in this version also SQLIDs of the troublemaking applications.</p>
<p>You can download the new versions here:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2ZpbGVzLmUyc24uY29tL3NjcmlwdHMvbGF0Y2hwcm9mLnNxbCA=" target=\"_blank\">LatchProf</a> (reads V$ views)</li>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2ZpbGVzLmUyc24uY29tL3NjcmlwdHMvbGF0Y2hwcm9meC5zcWwg" target=\"_blank\">LatchProfX</a> (reads X$ tables, but gives better info, run as SYS)</li>
</ul>
<p>Example output (with SQLID info) is below:</p>
<pre class="brush: plain; title: ; notranslate">SQL&gt; @latchprof name,sid,sqlid % % 100000

-- LatchProf 1.21 by Tanel Poder ( http://www.tanelpoder.com )

NAME                                       SID SQLID               Held       Gets  Held %     Held ms Avg hold ms
----------------------------------- ---------- ------------- ---------- ---------- ------- ----------- -----------
cache buffers chains                       133 3jbwa65aqmkvm        349        348     .35      14.169        .041
simulator lru latch                        133 3jbwa65aqmkvm         51         51     .05       2.071        .041
row cache objects                          133 3jbwa65aqmkvm          5          5     .01        .203        .041
cache buffers chains                        24                        5          5     .01        .203        .041
cache buffers chains                       149 3jbwa65aqmkvm          3          3     .00        .122        .041
resmgr group change latch                   33 147a57cxq3w5y          2          2     .00        .081        .041
cache buffers chains                         9 5raw2bzx227wp          2          1     .00        .081        .081
In memory undo latch                       149 f3y38zthh270n          2          1     .00        .081        .081
active checkpoint queue latch                5                        2          1     .00        .081        .081
cache buffers chains                       149 75621g9y3xmvd          2          2     .00        .081        .041
cache buffers chains                         9 gvgdv2v90wfa7          2          2     .00        .081        .041
cache buffers chains                        33 75621g9y3xmvd          2          2     .00        .081        .041
checkpoint queue latch                       5                        1          1     .00        .041        .041
ksuosstats global area                       8                        1          1     .00        .041        .041
cache buffers lru chain                    133 3jbwa65aqmkvm          1          1     .00        .041        .041
multiblock read objects                    155 75ju2gn3s8009          1          1     .00        .041        .041
resmgr group change latch                    9 0w2qpuc6u2zsp          1          1     .00        .041        .041
resmgr group change latch                   33 apgb2g9q2zjh1          1          1     .00        .041        .041
resmgr group change latch                  133 apgb2g9q2zjh1          1          1     .00        .041        .041
space background task latch                 17                        1          1     .00        .041        .041
cache buffers chains                       149 5raw2bzx227wp          1          1     .00        .041        .041
cache buffers chains                        33 5raw2bzx227wp          1          1     .00        .041        .041
cache buffers chains                        33 05s4vdwsf5802          1          1     .00        .041        .041
cache buffers chains                        31 0yas01u2p9ch4          1          1     .00        .041        .041
cache buffers chains                         9 41zu158rqf4kf          1          1     .00        .041        .041
In memory undo latch                        33 0bzhqhhj9mpaa          1          1     .00        .041        .041
In memory undo latch                        31 gvgdv2v90wfa7          1          1     .00        .041        .041
In memory undo latch                         9 gvgdv2v90wfa7          1          1     .00        .041        .041
simulator lru latch                        149 3jbwa65aqmkvm          1          1     .00        .041        .041
row cache objects                          133 5yq51dtyc6qf2          1          1     .00        .041        .041
SQL memory manager workarea list la        133 3jbwa65aqmkvm          1          1     .00        .041        .041
enqueues                                   141                        1          1     .00        .041        .041
row cache objects                          132                        1          1     .00        .041        .041

33 rows selected.
</pre>
<p>LatchProf scripts allow you to easily identify which session and SQLID (or sqlhash in 9i) cause the latch(es) to be held the most.</p>
<p>Let&#8217;s check what&#8217;s the most &#8220;latch-holding&#8221; SQL reported by LatchProf:</p>
<pre class="brush: sql; title: ; notranslate">SQL&gt; @sqlid 3jbwa65aqmkvm

HASH_VALUE  CH# SQL_TEXT
---------- ---- ------------------------------------------------------------------------------------------------------------------------------------------------------
1432996723    0 SELECT O.ORDER_ID, LINE_ITEM_ID, PRODUCT_ID, UNIT_PRICE, QUANTITY, ORDER_MODE, ORDER_STATUS, ORDER_TOTAL, SALES_REP_ID, PROMOTION_ID, C.CUSTOMER_ID,
 CUST_FIRST_NAME, CUST_LAST_NAME, CREDIT_LIMIT, CUST_EMAIL, ORDER_DATE FROM ORDERS O , ORDER_ITEMS OI, CUSTOMERS C WHERE O.ORDER_ID = OI.ORDER_ID AND
 O.CUSTOMER_ID = C.CUSTOMER_ID AND O.ORDER_STATUS &lt;= 4
</pre>
<p>If you want to read more about the capabilities of LatchProf and LatchProfX, go here:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOC8wNy8wOS9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtNy1zYW1wbGluZy1sYXRjaC1ob2xkZXItc3RhdGlzdGljcy11c2luZy1sYXRjaHByb2Yv">http://blog.tanelpoder.com/2008/07/09/advanced-oracle-troubleshooting-guide-part-7-sampling-latch-holder-statistics-using-latchprof/</a></li>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOC8wNy8yMy9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtOC1ldmVuLW1vcmUtZGV0YWlsZWQtbGF0Y2gtdHJvdWJsZXNob290aW5nLXVzaW5nLWxhdGNocHJvZngv">http://blog.tanelpoder.com/2008/07/23/advanced-oracle-troubleshooting-guide-part-8-even-more-detailed-latch-troubleshooting-using-latchprofx/</a></li>
</ul>
<p>Note that the latest version (v1.21) also fixes a problem with Oracle 11.2 where the script execution plan order was wrong, causing the sampling to not happen in correct order. I added a NO_TRANSFORM_DISTINCT_AGG hint to disable a new transformation happening in 11.2 to make the scripts behave correctly&#8230;</p>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=620" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2010/02/15/new-versions-of-latchprof-and-latchprofx-for-latch-contention-troubleshooting-and-tuning/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>KGL simulator, shared pool simulator and buffer cache simulator &#8211; what are these?</title>
		<link>http://blog.tanelpoder.com/2009/09/13/kgl-simulator-shared-pool-simulator-and-buffer-cache-simulator-what-are-these/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=kgl-simulator-shared-pool-simulator-and-buffer-cache-simulator-what-are-these</link>
		<comments>http://blog.tanelpoder.com/2009/09/13/kgl-simulator-shared-pool-simulator-and-buffer-cache-simulator-what-are-these/#comments</comments>
		<pubDate>Mon, 14 Sep 2009 00:13:39 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Administration]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=444</guid>
		<description><![CDATA[If you have queried v$sgastat on recent Oracle versions (by which I mean 9i and above) you probably have seen allocations for some sort of simulators in Oracle instance. Here&#8217;s an example: SQL&#62; select * from v$sgastat where lower(name) like '%sim%' order by name; POOL         NAME                            BYTES ------------ -------------------------- ---------- shared pool  kglsim alloc latch [...]]]></description>
			<content:encoded><![CDATA[<p>If you have queried v$sgastat on recent Oracle versions (by which I mean 9i and above) you probably have seen allocations for some sort of simulators in Oracle instance. Here&#8217;s an example:</p>
<pre>SQL&gt; select * from v$sgastat where lower(name) like '%<strong>sim</strong>%' order by name;

POOL         NAME                            BYTES
------------ -------------------------- ----------
shared pool  kglsim alloc latch area          1700
shared pool  kglsim alloc latches               68
shared pool  kglsim count of pinned he        9248
shared pool  kglsim free heap list             204
shared pool  kglsim free obj list              204
shared pool  kglsim hash table                4104
shared pool  kglsim hash table bkts        2097152
shared pool  kglsim heap                    635536
shared pool  kglsim latch area                1700
shared pool  kglsim latches                     68
shared pool  kglsim main lru count           87040
shared pool  kglsim main lru size           174080
shared pool  kglsim object batch            909440
shared pool  kglsim pin list arr               816
shared pool  kglsim recovery area             2112
shared pool  kglsim sga                      22092
shared pool  kglsim size of pinned mem       18496
shared pool  ksim client list                   84
shared pool  log_simultaneous_copies           480
shared pool  sim cache nbufs                   640
shared pool  sim cache sizes                   640
shared pool  sim kghx free lists                 4
shared pool  sim lru segments                  640
shared pool  sim segment hits                 1280
shared pool  sim segment num bufs              640
shared pool  sim state object                   48
shared pool  sim trace buf                    5140
shared pool  sim trace buf context             120
shared pool  sim_knlasg                       1200
shared pool  simulator hash buckets          16512
shared pool  simulator latch/bucket st        4608

31 rows selected.</pre>
<p>See, a bunch of &#8220;kgl sim&#8221; and then just &#8220;sim&#8221; allocations.</p>
<p>&#8230; or sometimes you can see latch contention on following latches:</p>
<pre>SQL&gt; select name from v$latch where name like '%sim%';

NAME
-------------------------------------------------------
ksim membership request latch
simulator lru latch
simulator hash latch
sim partition latch
shared pool simulator
shared pool sim alloc

6 rows selected.</pre>
<p>Again, there seems to be some &#8220;simulation&#8221; work going on in Oracle instance.</p>
<p>So what are these simulators about?</p>
<p><span id="more-444"></span></p>
<p>These simulators help Oracle to determine how much time and effort would have been saved if buffer cache or shared pool size was bigger than current. And this info is summarized and externalized in various advisor views such V$SHARED_POOL_ADVICE, V$MTTR_TARGET_ADVICE, V$DB_CACHE_ADVICE and so on. These simulation engines were introduced in Oracle 9i along the mentioned V$ views.</p>
<p>Also this stuff is used by MMAN for making SGA resizing decisions if SGA_TARGET (10g+) or MEMORY_TARGET (11g+) is enabled.</p>
<p>I decided to write an article about simulators as quite a few people have been asking me about it lately. So, I first googled around to see if someone has already written about it already and found one <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3N0YWNrb3ZlcmZsb3cuY29tL3F1ZXN0aW9ucy8xODM5NjMvd2hhdC1pcy10aGUtb3JhY2xlLWtnbC1zaW11bGF0b3I=">here</a>. It turned out it was written by me, I just didn&#8217;t know about it anymore :)</p>
<p>So, I will write an edited version of my original explanation here. Note that I don&#8217;t know the exact algorithm of the simulator, just explaining the fundamentals here.</p>
<p>KGL simulator is the shared pool memory allocation tracking and simulation engine.</p>
<p>KGL=Kernel General Library cache manager, as the name says it deals with library objects such cursors, cached stored object definitions (PL/SQL stored procs, table definitions etc).</p>
<p>KGL simulator is used for estimating the benefit of caching if the cache was larger than currently. The general idea is that when flushing out a library cache object, it&#8217;s hash value (and few other bits of info) are still kept in the KGL simulator hash table. This stores a history of objects <em><strong>which were in memory in past but got flushed out</strong></em>. That&#8217;s the key part: we keep a history of objects flushed out (this is done by selective sampling though, not at every object flush as it would have too much overhead).</p>
<p>When loading a library cache object into library cache (which means that no existing such object is in there already), Oracle goes and checks the KGL simulator hash table to see whether an object with matching hash value is in there. If a matching object is found, that means that the required object had been in cache in past, but flushed out due space pressure. That&#8217;s the second important part: we know when we would have been able to (probably) just reuse an object if the shared pool had been bigger.</p>
<p>Using that information of how many library cache object (re)loads could have been been avoided if cache had been bigger (thanks to KGL simulator history) and also thanks to knowing how much time the (current) object reloads took, Oracle can predict how much response time would have been saved instancewide if shared pool was bigger. So, we don&#8217;t need to increase shared pool itself to measure the benefit, we <em><strong>simulate</strong></em> a bigger shared pool by just keeping a history of aged out library cache objects hash values in a special array &#8211; and that&#8217;s why there are various kglsim and simulator allocations seen in V$SGASTAT.</p>
<p>The simulation results are seen from v$shared_pool_advice and are used by other advisors such ADDM.</p>
<p>Apparently Oracle keeps track also about how much memory is pinned in the shared pool (thus can&#8217;t be flushed out) over time. This info is probably used also for estimating shared pool resize requirements as pinned memory chunks are not usable for anything else (as long as they are pinned) in the shared pool. These measurements are also done on sampling basis.</p>
<p>I did a little test with my <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL2xvdHNocGFyc2VzLnNxbA==" target=\"_blank\">lotshparses.sql</a> script (which just runs a lot of hard parses inside a PL/SQL loop) and ran latchprofx to report which latches and why was my session using. Note the &#8220;sim&#8221; instead of usual % in the latchprofx syntax. I am interested in only these latches which have &#8220;sim&#8221; in their name in this experiment:</p>
<pre>SQL&gt; <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL2xhdGNocHJvZnguc3Fs" target=\"_blank\">@latchprofx</a> sid,name,func,hmode 146 <em><strong>sim</strong></em> 100000

-- LatchProfX 1.12 by Tanel Poder ( http://www.tanelpoder.com )

       SID NAME                                FUNC                                     HMODE              Held       Gets  Held %
---------- ----------------------------------- ---------------------------------------- ------------ ---------- ---------- -------
 146 shared pool simulator               kglsim_unpin_simhp                       exclusive           514         49     .51
 146 shared pool simulator               kglsim_upd_newhp                         exclusive           206        198     .21
 146 shared pool simulator               kglsim_scan_lru: scan                    exclusive            95         93     .10
 146 shared pool simulator               kglsim_chg_simhp_free                    exclusive            32         32     .03
 146 shared pool simulator               kglsim_chg_simhp_noob                    exclusive            32         32     .03
 146 shared pool sim alloc               kglsim_chk_heaplist: alloc               exclusive             4          3     .00
 146 shared pool sim alloc               kglsim_chk_objlist: alloc                exclusive             3          2     .00
 146 shared pool simulator               kglsim_chk_objlist: extend               exclusive             1          1     .00
 146 shared pool simulator               kglsim_scan_lru: rebalance               exclusive             1          1     .00</pre>
<p>This simulation engine is pretty cool because its the first time someone has tried to measure the <strong><em>time</em></strong> what could be saved by increasing shared pool size. (Response) Time is the ultimate measure of performance, so this simulator measures the right thing. Do you remember what people used before that? Library cache hit ratio! A lot of people back then (and some still today) have been misleaded by measuring the wrong things like hit ratios.</p>
<p>Anyway, while this is a cool feature &#8211; it can cause trouble in big systems. This is because the extra CPU usage for simulation and also the potential concurrency issues like simulator latch contention (like the shared pool simulator latch above, these latches are used even on 11g where most library cache latches have been replaced with KGX mutexes for efficiency and avoiding false contention issues). Also, some people are sensitive about the potential large memory allocations the shared pool simulator can take from shared pool (this is controlled by <em><strong>_kglsim_max_percent</strong></em> parameter which defaults to max 5% of total shared pool size). As these allocations are dynamic, then when your application suddenly starts making lots of hard parses due newly released code, causing memory shortage, then the simulator can make things even worse as it can make relatively big allocations for itself too! And as far as I know the simulator arrays don&#8217;t shrink back.</p>
<p>Note that I&#8217;m talking about big systems here, in most other cases the simulation shouldn&#8217;t cause trouble (on recent db versions anyway).</p>
<p>In some databases I have disabled all of the advisors and simulators with setting <em><strong>statistics_level = basic</strong></em>. All features what you see from V$STATISTICS_LEVEL would be disabled when setting statistics_level to basic:</p>
<pre>SQL&gt; select statistics_name,activation_level from v$statistics_level;

STATISTICS_NAME                          ACTIVAT
---------------------------------------- -------
Buffer Cache Advice                      TYPICAL
MTTR Advice                              TYPICAL
Timed Statistics                         TYPICAL
Timed OS Statistics                      ALL
Segment Level Statistics                 TYPICAL
PGA Advice                               TYPICAL
Plan Execution Statistics                ALL
Shared Pool Advice                       TYPICAL
Modification Monitoring                  TYPICAL
Longops Statistics                       TYPICAL
Bind Data Capture                        TYPICAL
Ultrafast Latch Statistics               TYPICAL
Threshold-based Alerts                   TYPICAL
Global Cache Statistics                  TYPICAL
Active Session History                   TYPICAL
Undo Advisor, Alerts and Fast Ramp up    TYPICAL
Streams Pool Advice                      TYPICAL
Time Model Events                        TYPICAL
Plan Execution Sampling                  TYPICAL
Automated Maintenance Tasks              TYPICAL
SQL Monitoring                           TYPICAL
Adaptive Thresholds Enabled              TYPICAL
V$IOSTAT_* statistics                    TYPICAL
.
23 rows selected.</pre>
<p>Sometimes you don&#8217;t want to disable absolutely everything, but only the troublemakers. The good news is that there&#8217;s an (undocumented) parameter for disabling every feature separately. As always, make sure that you  know what you&#8217;re doing before changing these and get blessing from Oracle support in order to not get yourself in trouble.</p>
<p>So, for example the library cache memory simulation (shared pool advice) can be disabled by setting <strong><em>_library_cache_advice = false</em></strong> .</p>
<p>I covered shared pool and library cache above. Oracle has more cache advisors, the underlying principle in them is the same. They save some small metadata/info about flushed out objects in a simulator array and just check that array when someone is allocating memory for new object. If matching object&#8217;s metadata still happens to be in the simulator array, then Oracle knows that we could have avoided the time and effort loading (generating) that object back into memory if the corresponding cache had been bigger. Of course the algorithm must be more sophisticated than just checking for object&#8217;s existence, there are probably object&#8217;s flush timestamp and size stored in the simulator arrays as well.</p>
<p>While the library cache simulator array stores library cache objects hash values for looking up historical memory allocations metadata, then the buffer cache advisor stores just data block addresses (DBAs) of blocks previously in cache. When Oracle spots that a block was previously in cache while intiating physical IO for re-reading it back again, it will measure the time it took to complete the current physical IO and add this info into the simulator array. Again, that&#8217;s how Oracle can measure how much <strong><em>response time</em></strong> would have been saved if the block would still have been in the buffer cache.</p>
<p>The buffer cache advisor needs some special attention, as I have had most issues with that (on large systems again), mostly with buffer cache&#8217;s <em><strong>simulator lru latch</strong></em> contention in systems with lots of CPUs and physical IOs going on. The problem here is that even though you can have hundreds of thousands of cache buffers chains latches in a system, the number of simulator lru latches remains relatively low by default, causing contention.</p>
<p>Oracle provides a documented parameter for disabling the buffer cache simulation &#8211; you need to set <em><strong>db_cache_advice</strong></em> = off (or ready).</p>
<p>So, hopefully this sheds some light into how the advisors work and what the heck the simulators simulate in Oracle. Even though I provided a list of parameters which allow you to control this stuff &#8211; don&#8217;t change any of them unless you actually have a problem which they should fix (and you are sure that these parameters would fix <em>your</em> problem).</p>
<p><strong><br />
</strong></p>
<p><strong>Update:</strong></p>
<p>Jonathan Lewis has written a good article about buffer cache advisor details on his website &#8211; I recommend to read that one too if you&#8217;re interested in this stuff:</p>
<p><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy5qbGNvbXAuZGVtb24uY28udWsvQ2FjaGVfYWR2aWNlLmh0bWw=" target=\"_blank\">http://www.jlcomp.demon.co.uk/Cache_advice.html</a></p>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=444" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2009/09/13/kgl-simulator-shared-pool-simulator-and-buffer-cache-simulator-what-are-these/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>latch: cache buffers chains latch contention &#8211; a better way for finding the hot block</title>
		<link>http://blog.tanelpoder.com/2009/08/27/latch-cache-buffers-chains-latch-contention-a-better-way-for-finding-the-hot-block/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=latch-cache-buffers-chains-latch-contention-a-better-way-for-finding-the-hot-block</link>
		<comments>http://blog.tanelpoder.com/2009/08/27/latch-cache-buffers-chains-latch-contention-a-better-way-for-finding-the-hot-block/#comments</comments>
		<pubDate>Thu, 27 Aug 2009 09:37:08 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=417</guid>
		<description><![CDATA[Here&#8217;s a treat for Oracle performance professionals and geeks who are looking for more systematic ways for cache buffers chains (CBC) latch contention troubleshooting. Cache buffers chains latches are taken when a process wants to walk through a cache buffer hash chain, looking if the block with required DBA (data block address) is in buffer [...]]]></description>
			<content:encoded><![CDATA[<p><!--[if !mso]> <mce:style><!  v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} p\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} v\:textbox {display:none;} --> <!--[endif]--></p>
<div>
<p>Here&#8217;s a treat for Oracle performance professionals and geeks who are looking for more systematic ways for cache buffers chains (CBC) latch contention troubleshooting. Cache buffers chains latches are taken when a process wants to walk through a cache buffer hash chain, looking if the block with required DBA (data block address) is in buffer cache. If the block happens to be in cache, then in most cases it has to be pinned first before use and unpinned after use, to make sure no-one else can perform an incompatible operation on that block at the same time. The modification of pin structures (pinning/unpinning) is also protected by CBC latches.</p></div>
<div>
<p>CBC latch contention can happen for multiple reasons, but one reason is that there is some <em>really</em> hot block in a SMP system with high number of CPUs (or CMT system with high number of threads like Sun T-series servers). Sometimes there happen to be multiple moderately hot blocks &#8220;under&#8221; the same CBC latch, which can result in latch contention again.</div>
<div>
<p>Traditionally DBAs used to look up the <strong>child latch address</strong> from V$SESSION_WAIT, sql_trace output or ASH and then look up all buffers protected by that latch from X$BH, using HLADDR column (HLADDR stands for Hash Latch Address). I also have a script for that, <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL2JobGEuc3Fs" target=\"_blank\">bhla.sql</a> (Buffer Headers by Latch Address), which reports me all blocks currently in buffer cache, &#8220;under&#8221; that particular latch and the corresponding data block addresses and object names:</div>
<div><span id="more-417"></span></div>
<pre>SQL&gt; <strong>@bhla </strong>27E5A780  <em>&lt;-- latch address reported by wait interface (Parameter1 in latch: cache buffers chains wait event)</em>

FLG_LRUFLG                 OBJ OBJECT_TYPE         object                                          TCH DBA     
------------------- ---------- ------------------- ---------------------------------------- ---------- ---------
80000:8                      2 TABLE               SYS.SUBCOLTYPE$                                   0 1 7873  
80000:8                      2 TABLE               SYS.ATTRCOL$                                      0 1 7873  
0:0                      50472 INDEX               SYSMAN.MGMT_JOB_EXEC_IDX04                        1 3 27699 
0:0                      50472 INDEX               SYSMAN.MGMT_JOB_EXEC_IDX04                        1 3 27699 
0:4                      50472 INDEX               SYSMAN.MGMT_JOB_EXEC_IDX04                        1 3 27699 
2202000:8                 3710 TABLE               SYS.WRI$_ADV_REC_ACTIONS                          1 3 972   
0:0                      50472 INDEX               SYSMAN.MGMT_JOB_EXEC_IDX04                        1 3 27699 
0:0                      50472 INDEX               SYSMAN.MGMT_JOB_EXEC_IDX04                        1 3 27699 
2202000:8                94227 TABLE PARTITION     SYS.WRH$_ROWCACHE_SUMMARY                         2 3 36142 
80000:8                  54880 INDEX               PERFSTAT.STATS$UNDOSTAT_PK                       <strong>14 </strong>4 118331
0:8                         37 INDEX               SYS.I_OBJ2                                       <strong>24 </strong>1 55591 
2000:8                    3680 TABLE               SYS.WRI$_ADV_TASKS                               <strong>26 </strong>3 739   
0:8                         75 TABLE               SYS.IDL_UB2$                                     <strong>35 </strong>1 11745 
0:8                         73 TABLE               SYS.IDL_UB1$                                     <strong>38 </strong>1 12211</pre>
<p>From above we see that this child latch is protecting many blocks in buffer cache. So which one is the troublemaker? This is where the touchcount (TCH column) has been traditionally used, the higher the TCH the hotter the buffer, right?</p>
<p>Actually this may not always be true. For example if there is a somewhat hot block which has been steadily in buffer cache for weeks, it may have very high touch count. Now if there is an extremely hot block used by every morning&#8217;s batch job (and causing the trouble) but not used later on in the day, this block will get aged out later in the day. Once a block is aged out, its metadata, including TCH is gone! So when the block is loaded back in next morning (causing trouble again), its touch count starts from zero again.</p>
<p>So, by just looking into touchcount numbers you can&#8217;t always reliably detect who&#8217;s the <em>current</em> troublemaker. One option here would be to query the touchcounts, then wait for a minute, query the touchcounts again and see who&#8217;s touchcount increased the most during the minute. This would give a bit better picture about who is causing trouble <em>now.</em></p>
<p>But still, it would not be always reliable for another reason &#8211; touchcounts are incremented only after 3 seconds have passed since last increment! This factor has been coded in to avoid situation such a short but crazy nested loop join hammering a single buffer hundreds of thousands of times in few seconds and then finishing. The buffer wouldn&#8217;t be hot anymore but the touchcount would be hundreds of thousands due a single SQL execution. So, unless 3 seconds (of SGA internal time) has passed since last TCH update, the touchcounts would not be increased during buffer access.</p>
<blockquote><p>This time is controlled by SGA variable <strong>kcbatt_</strong> by the way:</p>
<pre>SQL&gt; oradebug dumpvar sga kcbatt_
ub4 kcbatt_ [3C440F4, 3C440F8) = 0000000<strong>3

</strong></pre>
</blockquote>
<p>This 3-second delay leaves us in the following situation, let say there are 2 blocks protected by a CBC child latch:</p>
<p>One block has been accessed <strong>once every 3 seconds</strong> for 24 hours in a row. A block accessed once per 3 seconds is definitely not a hot block, but its touchcount would be around 28800 (86400 seconds per 24 hours / 3 = 28800).</p>
<p>And there is another block which is accessed crazily for 2 seconds in a row and this happens every 10 seconds. 2 seconds of consecutive access would increase the touchcount by 1. If such access pattern has been going on every 10 seconds over last 24 hours, then the touch count for that buffer would be 86400 / 10 = 8640.</p>
<p>In the first case we can have a very cold block with TCH = 28800 and in second case a very hot block with TCH = 8640 only and this can mislead DBAs to fixing the wrong problem.</p>
<p>Luckily this problem can be solved with LatchProfX :-)</p>
<p>If you don&#8217;t know what LatchProfX is, you may want to read these articles first &#8211; LatchProf gives you a better way for latch contention troubleshooting compared to old approaches:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOC8wNy8wOS9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtNy1zYW1wbGluZy1sYXRjaC1ob2xkZXItc3RhdGlzdGljcy11c2luZy1sYXRjaHByb2Yv" target=\"_blank\">http://blog.tanelpoder.com/2008/07/09/advanced-oracle-troubleshooting-guide-part-7-sampling-latch-holder-statistics-using-latchprof/</a></li>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOC8wNy8yMy9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtOC1ldmVuLW1vcmUtZGV0YWlsZWQtbGF0Y2gtdHJvdWJsZXNob290aW5nLXVzaW5nLWxhdGNocHJvZngv">http://blog.tanelpoder.com/2008/07/23/advanced-oracle-troubleshooting-guide-part-8-even-more-detailed-latch-troubleshooting-using-latchprofx/</a></li>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOS8wMy8yMC9hbm90aGVyLWxhdGNocHJvZngtdXNlLWNhc2Uv">http://blog.tanelpoder.com/2009/03/20/another-latchprofx-use-case/</a></li>
</ul>
<p>And now I introduce a new feature in LatchProfX.</p>
<p>In addition to being able to report the SID of the latch holder and the kernel function why this latch was taken by the session, the newest LatchProfX can also report the <strong>object</strong> what was protected by the latch under investigation. For cache buffers chains the object is <em>data block address</em> of the block being accessed.</p>
<p>Here&#8217;s an example, let say I have CBC latch contention problems and from monitoring tools or sql_trace I have identified the hot latch, I use my <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL3N3LnNxbA==" target=\"_blank\">sw.sql</a> script here, to report what a session is waiting for:</p>
<pre>SQL&gt; <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL3N3LnNxbA==" target=\"_blank\"><strong>@sw</strong></a> 138
 SID STATE   EVENT                        SEC_IN_WAIT                 P1
------- ------- ---------------------------- ----------- ------------------
    138 WAITING latch: cache buffers chains            0 address=0x<strong>27E5A780</strong></pre>
<p>Sw reports that this session is waiting on cache buffer chains child latch with address 0x27E5a780.</p>
<p>Now we can run latchprofx and report sid, name, hold mode and protected object for that child latch only (instead of latch name we specified its address):</p>
<pre>SQL&gt; <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL2xhdGNocHJvZnguc3Fs">@latchprofx</a> sid,name,hmode,<strong>object</strong> % <strong>27E5A780</strong> 100000

 SID NAME                   HMODE         OBJECT   Held  Gets  Held %  Held ms
---- ---------------------- ------------ ------- ------ ----- ------- --------
 138 cache buffers chains   exclusive     <strong>40EB02</strong>   3928  3797    3.93   41.637
 151 cache buffers chains   exclusive     <strong>40EB02</strong>   3711  3660    3.71   39.337
 138 cache buffers chains   shared        <strong>40EB02</strong>    623   623     .62    6.604
 151 cache buffers chains   shared        <strong>40EB02</strong>    544   544     .54    5.766</pre>
<p>As you see, LatchProfX has reported a data block address in the object column. This is the datablock we are trying to access in buffer cache under protection of the given child latch.</p>
<p>Lets see what this DBA translates to:</p>
<pre>
<pre>SQL&gt; <strong><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL2RiYS5zcWw=" target=\"_blank\">@dba</a> 40EB02</strong>

 RFILE#     BLOCK#
---------- ----------
         1      60162

Press enter to find the segment using V$BH (this may take CPU time), CTRL+C to cancel:

STATE      BLOCK_CLASS        OBJECT_TYPE         object                                          TCH  MODE_HELD
---------- ------------------ ------------------- ---------------------------------------- ---------- ----------
xcur       data block         INDEX               SYS.SYS_IOT_TOP_<strong>94276</strong>                           331          1

Press enter to query what segment resides there using DBA_EXTENTS (this can be IO intensive), CTRL+C to cancel:

OWNER                          SEGMENT_NAME         PARTITION_NAME  TABLESPACE_NAME
------------------------------ -------------------- --------------- -----------------
SYS                            SYS_IOT_TOP_94276                    SYSTEM</pre>
<p>So, the troublemaker is an IOT index segment block belonging to table with object_id <strong>94276</strong>. I can find the table name from dba_objects by object_id, or by using my script:</pre>
<pre>SQL&gt;<strong> <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL29pZC5zcWw=" target=\"_blank\">@oid</a></strong> 94276

owner                     object_name                    object_type        CREATED
------------------------- ------------------------------ ------------------ -----------------
SYS                       <strong>KILL_CPU</strong>                       TABLE              20090825 23:19:49</pre>
<p>Also, as we know the segment name and relative file number (RFILE#) and block number (BLOCK#)  from above output we can dump the datablock with an ALTER SYSTEM DUMP DATAFILE command.</p>
<p>As that command requires absolute file id (not relative) we have to do a little conversion first (in cases where there are multiple files with same relative file number in database):</p>
<pre>SQL&gt; select file_id from dba_extents where <strong>relative_fno</strong> = 1 and <strong>segment_name</strong> = 'SYS_IOT_TOP_94276';

 FILE_ID
----------
 1</pre>
<p>Of course in my little test case all relative file numbers match with absolute numbers (but if you have lots of datafiles, have directly migrated from Oracle 7 or use transportable tablespaces, you may hit this issue).</p>
<p>Finally I can run ALTER SYSTEM DUMP DATAFILE 1 BLOCK 60162 to find out more about the contents of that block. Another option would be to construct a rowid from the data_object_id, rfile and block and query the table using that but as that&#8217;s complicated for IOTs, I&#8217;ll leave this to a future post.</p>
<p>You can download latest version of LatchProfX here:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL2xhdGNocHJvZnguc3Fs" target=\"_blank\">http://www.tanelpoder.com/files/scripts/latchprofx.sql</a></li>
</ul>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=417" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2009/08/27/latch-cache-buffers-chains-latch-contention-a-better-way-for-finding-the-hot-block/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>Done with my seminar in Netherlands &#8211; here&#8217;s some feedback and couple beta new versions of my scripts</title>
		<link>http://blog.tanelpoder.com/2009/04/05/done-with-my-seminar-in-netherlands-heres-some-feedback-and-couple-beta-new-versions-of-my-scripts/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=done-with-my-seminar-in-netherlands-heres-some-feedback-and-couple-beta-new-versions-of-my-scripts</link>
		<comments>http://blog.tanelpoder.com/2009/04/05/done-with-my-seminar-in-netherlands-heres-some-feedback-and-couple-beta-new-versions-of-my-scripts/#comments</comments>
		<pubDate>Sun, 05 Apr 2009 22:09:59 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Training]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/?p=275</guid>
		<description><![CDATA[The Miracle Masterclass I delivered last week in Netherlands was awesome, I had lots of attendees and lots of good questions and feedback. As usual, I ran out of time in the end though. I will make this up with some articles about the stuff I didn&#8217;t have time to explain :) I used Snapper [...]]]></description>
			<content:encoded><![CDATA[<p>The Miracle Masterclass I delivered last week in Netherlands was awesome, I had lots of attendees and lots of good questions and feedback. As usual, I ran out of time in the end though. I will make this up with some articles about the stuff I didn&#8217;t have time to explain :)</p>
<p>I used Snapper v2.0 beta during my demos there and also an improved version of LatchProfX which can give some additional context information about reasons why latches were taken. As they are beta, not final versions, I have uploaded them into a temporary location:</p>
<p><em>Update: These scripts are out of beta now so here are their permanent locations:</em></p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL3NuYXBwZXIuc3Fs">http://www.tanelpoder.com/files/scripts/snapper.sql</a></li>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL2xhdGNocHJvZnguc3Fs">http://www.tanelpoder.com/files/scripts/latchprofx.sql</a></li>
</ul>
<p>Also, the Sesspack (session level statspack) v0.05 files are in my TPT script set, under tools/sesspack_v0.05 directory in the zip file:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9UUFRfcHVibGljLnppcA==">http://www.tanelpoder.com/files/TPT_public.zip</a></li>
</ul>
<p>Toine van Beckhoven attended <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vc2VtaW5hci8=" target=\"_blank\">the seminar</a> as well and wrote a review of it, if you&#8217;re still hesitating if you should attend as well, read <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3RvaW5ldmFuYmVja2hvdmVuLndvcmRwcmVzcy5jb20vMjAwOS8wNC8wMi90aGUtdHJhaW5zLWFyZS1hLW5pZ2h0bWFyZS1pdHMtYS1taXJhY2xlLWFuZC1ob3ctbGl0dGxlLWNhbi15b3Uta25vdy8=" target=\"_blank\">his review</a> first :)</p>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=275" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2009/04/05/done-with-my-seminar-in-netherlands-heres-some-feedback-and-couple-beta-new-versions-of-my-scripts/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Another LatchProfX use case</title>
		<link>http://blog.tanelpoder.com/2009/03/20/another-latchprofx-use-case/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=another-latchprofx-use-case</link>
		<comments>http://blog.tanelpoder.com/2009/03/20/another-latchprofx-use-case/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 19:35:41 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/2009/03/20/another-latchprofx-use-case/</guid>
		<description><![CDATA[Riyaj Shamsudeen wrote an excellent article about systematic latch contention troubleshooting. Especially if the latch contention problem is ongoing, looking into system wide stats (like v$latch.sleep columns) is not the best idea in busy systems. This may sometimes lead you to fixing the wrong problem. This is because sometimes the latch contention is not caused [...]]]></description>
			<content:encoded><![CDATA[<p>Riyaj Shamsudeen wrote an excellent <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL29yYWludGVybmFscy53b3JkcHJlc3MuY29tLzIwMDkvMDMvMTMvZHluYW1pY19wbGFuX3RhYmxlLXhrcWxmeHBsLWFuZC1leHRyZW1lLWxpYnJhcnktY2FjaGUtbGF0Y2gtY29udGVudGlvbi8=">article</a> about <em>systematic</em> latch contention troubleshooting.</p>
<p>Especially if the latch contention problem is ongoing, looking into system wide stats (like v$latch.sleep columns) is not the best idea in busy systems. This may sometimes lead you to fixing the wrong problem.</p>
<p>This is because sometimes the latch contention is not caused by some system wide inefficiency but rather by one or few sessions.</p>
<p>The right approach would be to <em>measure</em> the following things:</p>
<ol>
<li>Which latch (and exact child latches) the problem session is waiting for (query v$session_wait, ASH, sql_trace)</li>
<li>Why is the problem session trying to get that latch so often (query v$sesstat counters for problem session and check execution plan if only single/few statements executed)</li>
<li>Why this latch is busy: who&#8217;s holding it (query v$latchholder, LatchProf)</li>
<li>Why is that someone holding the latch so much (query v$sesstat counters for that session or run LatchProfX)</li>
</ol>
<p>Riyaj used this approach and successfully found out the troublemaker causing heavy library cache latch contention. He used my <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOC8wNy8yMy9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtOC1ldmVuLW1vcmUtZGV0YWlsZWQtbGF0Y2gtdHJvdWJsZXNob290aW5nLXVzaW5nLWxhdGNocHJvZngv">LatchProfX</a> tool for part of the diagnosis (and I&#8217;m very happy to see that my advanced oracle troubleshooting tools find real world use in hands of other troubleshooters/tuners too!).</p>
<p>Here&#8217;s an excerpt from latchprofx output (from a little test case I put together), showing which session is holding which library cache latch how much, hold mode (shared/exclusive) and also the actual reason (function name) why the latch is held:</p>
<pre><code>SQL&gt; @latchprofx sid,name,laddr,<strong>ksllwnam,ksllwlbl,hmode</strong> 159 % 100000

-- LatchProfX 1.08 by Tanel Poder ( http://www.tanelpoder.com )

 SID NAME           LADDR            KSLLWNAM  KSLLWLBL  HMODE        Held  Gets  Held %  Held ms Avg hold ms
---- -------------- ---------------- --------- --------- ---------- ------ ----- ------- -------- -----------
 159 library cache  00000003A93513E0 <strong>kglic</strong>     child     exclusive   10044  9314   10.04  301.320        .032
 159 library cache  00000003A9351340 kglic     child     exclusive    9895  8458    9.90  296.850        .035
 159 library cache  00000003A9351660 kglic     child     exclusive    9761  8440    9.76  292.830        .035
 159 library cache  00000003A9351700 kglic     child     exclusive    9737  8924    9.74  292.110        .033
 159 library cache  00000003A9351480 kglic     child     exclusive    8999  7765    9.00  269.970        .035
 159 library cache  00000003A93515C0 kglic     child     exclusive    8553  7832    8.55  256.590        .033
 159 library cache  00000003A9351520 kglic     child     exclusive    6852  6828    6.85  205.560        .030

7 rows selected.

</code></pre>
<p>kglic means Kernel Generic Library cache Iterate Chain (AFAIK), it&#8217;s the function which is executed when you access most X$KGL tables. And this is where Riyaj got the evidence that the latch contention comes from inefficient scanning of library cache, caused by a session running queries against V$SQL_PLAN views with a bad plan.</p>
<p>So, check out <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL29yYWludGVybmFscy53b3JkcHJlc3MuY29tLzIwMDkvMDMvMTMvZHluYW1pY19wbGFuX3RhYmxlLXhrcWxmeHBsLWFuZC1leHRyZW1lLWxpYnJhcnktY2FjaGUtbGF0Y2gtY29udGVudGlvbi8=">Riyaj&#8217;s articele</a>, my <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOC8wNy8yMy9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtOC1ldmVuLW1vcmUtZGV0YWlsZWQtbGF0Y2gtdHJvdWJsZXNob290aW5nLXVzaW5nLWxhdGNocHJvZngv">LatchProfX script</a>, my Latch &amp; Mutex contention troubleshooting <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9MYXRjaF9Mb2NrX0FuZF9NdXRleF9Db250ZW50aW9uX1Ryb3VibGVzaG9vdGluZy5wZGY=">conference slides</a> and if you want more, then my <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vc2VtaW5hci8=">Advanced Oracle Troubleshooting Seminar</a>. I recently added Dallas, Denver, Salt Lake City and Hong-Kong to the list of cities where you can attend it this year!</p>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=263" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2009/03/20/another-latchprofx-use-case/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Advanced Oracle Troubleshooting @OOW 2008 presentation slides and scripts</title>
		<link>http://blog.tanelpoder.com/2008/09/26/advanced-oracle-troubleshooting-oow-2008-presentation-slides-and-scripts/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=advanced-oracle-troubleshooting-oow-2008-presentation-slides-and-scripts</link>
		<comments>http://blog.tanelpoder.com/2008/09/26/advanced-oracle-troubleshooting-oow-2008-presentation-slides-and-scripts/#comments</comments>
		<pubDate>Thu, 25 Sep 2008 22:41:23 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Cool stuff]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Oracle 11g]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.tanelpoder.com/2008/09/26/advanced-oracle-troubleshooting-oow-2008-presentation-slides-and-scripts/</guid>
		<description><![CDATA[My today morning&#8217;s Advanced Oracle Troubleshooting presentation was quite a success. The 400 people room was fully pre-booked and over 300 people showed up. This is a pretty good result considering that a) my presentation was an 9 am one on the last day of conference and b) there was a big party last night [...]]]></description>
			<content:encoded><![CDATA[<p>My today morning&#8217;s Advanced Oracle Troubleshooting presentation was quite a success. The 400 people room was fully pre-booked and over 300 people showed up. This is a pretty good result considering that a) my presentation was an 9 am one on the last day of conference and b) there was a big party last night :)</p>
<p>Thanks to everyone who attended!</p>
<p>I have uploaded the slides here: <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9BZHZhbmNlZF9PcmFjbGVfVHJvdWJsZXNob290aW5nLnBkZg==">http://www.tanelpoder.com/files/Advanced_Oracle_Troubleshooting.pdf</a></p>
<p>PerfSheet (the Excel based automatic data visualization tool) is uploaded here: <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9QZXJmU2hlZXQuemlw">http://www.tanelpoder.com/files/PerfSheet.zip</a></p>
<p>The TPT scripts (scripts like snapper, latchprofx, ostackprof etc I demoed and a lot more) are uploaded here: <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9UUFRfcHVibGljLnppcA==">http://www.tanelpoder.com/files/TPT_public.zip</a></p>
<p>Few links to my past blog entries about how to use Snapper, LatchProfX and os_explain:</p>
<ul>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOC8wNy8yMy9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtOC1ldmVuLW1vcmUtZGV0YWlsZWQtbGF0Y2gtdHJvdWJsZXNob290aW5nLXVzaW5nLWxhdGNocHJvZngv">Advanced Oracle Troubleshooting Guide, part 8: LatchProfX</a></li>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwNy8xMi8wNi9vcmFjbGUtc2Vzc2lvbi1zbmFwcGVyLXYxMDYtcmVsZWFzZWQv">Oracle Session Snapper v1.06 released</a></li>
<li><a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwOC8wNi8xNS9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtNi11bmRlcnN0YW5kaW5nLW9yYWNsZS1leGVjdXRpb24tcGxhbnMtd2l0aC1vc19leHBsYWluLw==">Advanced Oracle Troubleshooting Guide, part 6: os_explain</a></li>
</ul>
<p>I haven&#8217;t blogged about usage of OStackProf yet (as the tool&#8217;s very fresh), I will do so in coming week.</p>
<p>Feel free to send any further questions to tanel@tanelpoder.com</p>
<p>Also, if you liked my presentation, you will sure like my 2-day seminar on the same topic. I will go even deeper in there ;-)</p>
<p>Check out this page: <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vc2VtaW5hci8=">http://blog.tanelpoder.com/seminar/</a>. I&#8217;ve arranged my seminar in 18 cities already for 2008 and early 2009, if your city is not in the list, let me know and I&#8217;ll see what I can do :)</p>
<p>Thanks once more for everyone attending my OOW presentation and also my blog readers for continued support!</p>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=141" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2008/09/26/advanced-oracle-troubleshooting-oow-2008-presentation-slides-and-scripts/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Oracle hidden costs revealed, Part2 &#8211; Using DTrace to find why writes in SYSTEM tablespace are slower than in others</title>
		<link>http://blog.tanelpoder.com/2008/09/02/oracle-hidden-costs-revealed-part2-using-dtrace-to-find-why-writes-in-system-tablespace-are-slower-than-in-others/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=oracle-hidden-costs-revealed-part2-using-dtrace-to-find-why-writes-in-system-tablespace-are-slower-than-in-others</link>
		<comments>http://blog.tanelpoder.com/2008/09/02/oracle-hidden-costs-revealed-part2-using-dtrace-to-find-why-writes-in-system-tablespace-are-slower-than-in-others/#comments</comments>
		<pubDate>Mon, 01 Sep 2008 17:16:50 +0000</pubDate>
		<dc:creator>Tanel Poder</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Administration]]></category>
		<category><![CDATA[DTrace]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://tanelpoder.wordpress.com/2008/09/02/oracle-hidden-costs-revealed-part2-using-dtrace-to-find-why-writes-in-system-tablespace-are-slower-than-in-others/</guid>
		<description><![CDATA[I have written two posts in one, about a performance issue with writes in system tablespace and introduction of a little DTrace stack sampling script. Have you noticed that DML on tables residing in SYSTEM tablespace is slower than tables in other tablespaces? Here&#8217;s an example, I&#8217;ll create two similar tables, one in USERS tablespace, [...]]]></description>
			<content:encoded><![CDATA[<p><em>I have written two posts in one, about a performance issue with writes in system tablespace and introduction of a little DTrace stack sampling script.<br />
</em></p>
<p>Have you noticed that DML on tables residing in SYSTEM tablespace is slower than tables in other tablespaces?</p>
<p>Here&#8217;s an example, I&#8217;ll create two similar tables, one in USERS tablespace, other in SYSTEM, and inset into the first one (Oracle 10.2.0.3 on Solaris x64):</p>
<pre><code>SQL&gt; create table <strong>t1</strong>(a int) tablespace <strong>USERS</strong>;

Table created.

SQL&gt; create table <strong>t2</strong>(a int) tablespace <strong>SYSTEM</strong>;

Table created.

SQL&gt; exec for i in 1..100000 loop insert into <strong>t1</strong> values(i); end loop;

PL/SQL procedure successfully completed.

Elapsed: 00:00:<strong>03.09</strong>

</code></pre>
<p>Insert into table in USERS tablespace took 3 seconds.</p>
<p>Ok, let&#8217;s commit and flush dirty buffers that they wouldn&#8217;t get on the way of next insert.</p>
<pre><code>SQL&gt; commit;

Commit complete.

Elapsed: 00:00:00.02

SQL&gt; alter system checkpoint; <em>-- checkpointing to flush dirty buffers from previous inserts</em>

System altered.

Elapsed: 00:00:01.34
SQL&gt;

</code></pre>
<p>And now to the insert into the SYSTEM tablespace table:</p>
<pre><code>SQL&gt; exec for i in 1..100000 loop insert into <strong>t2</strong> values(i); end loop;

PL/SQL procedure successfully completed.

Elapsed: 00:00:<strong>08.98</strong>
SQL&gt;

</code></pre>
<p>What?! The same insert took 3 times longer, almost 9 seconds?</p>
<p>Fine! Let&#8217;s troubleshoot it!</p>
<p><span id="more-132"></span></p>
<p><!-- more --></p>
<p>First (as usual) I start from my Snapper for getting a performance snapshot of the session. I ran both inserts again and used Snapper from another session to monitor the insert activity (for brevity I didn&#8217;t monitor all stat counters, but used the sinclude filter to leave only various &#8220;gets&#8221; in from V$SESSTAT).</p>
<p>This is the insert to USERS tablespace table:</p>
<pre><code>SQL&gt; @snapper out,gather=tsw,<strong>sinclude=gets</strong> 10 1 148

-- Session Snapper v1.07 by Tanel Poder ( http://www.tanelpoder.com )

---------------------------------------------------------------------------------------------------------------------------------------------
HEAD,     SID, SNAPSHOT START   ,  SECONDS, TYPE, STATISTIC                               ,         DELTA,  DELTA/SEC,     HDELTA, HDELTA/SEC
---------------------------------------------------------------------------------------------------------------------------------------------
DATA,     148, 20080901 17:20:53,       10, STAT, db block gets                           ,        103448,      10345,    103.45k,     10.34k
DATA,     148, 20080901 17:20:53,       10, STAT, db block gets from cache                ,        103448,      10345,    103.45k,     10.34k
DATA,     148, 20080901 17:20:53,       10, STAT, consistent gets                         ,           495,         49,        495,       49.5
DATA,     148, 20080901 17:20:53,       10, STAT, consistent gets from cache              ,           495,         49,        495,       49.5
DATA,     148, 20080901 17:20:53,       10, STAT, consistent gets - examination           ,           219,         22,        219,       21.9
DATA,     148, 20080901 17:20:53,       10, STAT, no work - consistent read gets          ,            89,          9,         89,        8.9
DATA,     148, 20080901 17:20:53,       10, STAT, rollbacks only - consistent read gets   ,            18,          2,         18,        1.8
DATA,     148, 20080901 17:20:53,       10, STAT, cluster key scan block gets             ,            65,          6,         65,        6.5
DATA,     148, 20080901 17:20:53,       10, TIME, hard parse elapsed time                 ,          4090,        409,     4.09ms,      409us
DATA,     148, 20080901 17:20:53,       10, TIME, PL/SQL compilation elapsed time         ,          3139,        314,     3.14ms,    313.9us
DATA,     148, 20080901 17:20:53,       10, TIME, parse time elapsed                      ,          5131,        513,     5.13ms,    513.1us
DATA,     148, 20080901 17:20:53,       10, TIME, PL/SQL execution elapsed time           ,        174288,      17429,   174.29ms,    17.43ms
<strong>DATA,     148, 20080901 17:20:53,       10, TIME, DB CPU                                  ,       3108026,     310803,      3.11s,    310.8ms
</strong>DATA,     148, 20080901 17:20:53,       10, TIME, sql execute elapsed time                ,       3104607,     310461,       3.1s,   310.46ms
DATA,     148, 20080901 17:20:53,       10, TIME, hard parse (sharing criteria) elapsed ti,          4110,        411,     4.11ms,      411us
<strong>DATA,     148, 20080901 17:20:53,       10, TIME, DB time                                 ,       3108026,     310803,      3.11s,    310.8ms
</strong>DATA,     148, 20080901 17:20:53,       10, WAIT, log file switch completion              ,         71104,       7110,     71.1ms,     7.11ms
DATA,     148, 20080901 17:20:53,       10, WAIT, SQL*Net message to client               ,             3,          0,        3us,       .3us
DATA,     148, 20080901 17:20:53,       10, WAIT, SQL*Net message from client             ,       6684030,     668403,      6.68s,    668.4ms
--  End of snap 1

PL/SQL procedure successfully completed.

</code></pre>
<p>You see, the DB CPU equals to DB time (in other words all the time spent servicing my request in database was spent on CPU, thus there were no waits, blocks involved). And the insert still took roughly 3 seconds.</p>
<p>This is the insert to SYSTEM tablespace table:</p>
<pre><code>SQL&gt; @snapper out,gather=tsw,sinclude=gets 10 1 148

-- Session Snapper v1.07 by Tanel Poder ( http://www.tanelpoder.com )

---------------------------------------------------------------------------------------------------------------------------------------------
HEAD,     SID, SNAPSHOT START   ,  SECONDS, TYPE, STATISTIC                               ,         DELTA,  DELTA/SEC,     HDELTA, HDELTA/SEC
---------------------------------------------------------------------------------------------------------------------------------------------
DATA,     148, 20080901 17:21:23,       11, STAT, db block gets                           ,        <strong>102616</strong>,       9329,    102.62k,      9.33k
DATA,     148, 20080901 17:21:23,       11, STAT, db block gets from cache                ,        102616,       9329,    102.62k,      9.33k
DATA,     148, 20080901 17:21:23,       11, STAT, consistent gets                         ,           <strong>384</strong>,         35,        384,      34.91
DATA,     148, 20080901 17:21:23,       11, STAT, consistent gets from cache              ,           384,         35,        384,      34.91
DATA,     148, 20080901 17:21:23,       11, STAT, consistent gets - examination           ,           218,         20,        218,      19.82
DATA,     148, 20080901 17:21:23,       11, STAT, no work - consistent read gets          ,            94,          9,         94,       8.55
DATA,     148, 20080901 17:21:23,       11, STAT, rollbacks only - consistent read gets   ,            17,          2,         17,       1.55
DATA,     148, 20080901 17:21:23,       11, STAT, cluster key scan block gets             ,            65,          6,         65,       5.91
DATA,     148, 20080901 17:21:23,       11, TIME, hard parse elapsed time                 ,          2350,        214,     2.35ms,   213.64us
DATA,     148, 20080901 17:21:23,       11, TIME, PL/SQL compilation elapsed time         ,          1751,        159,     1.75ms,   159.18us
DATA,     148, 20080901 17:21:23,       11, TIME, parse time elapsed                      ,          3359,        305,     3.36ms,   305.36us
DATA,     148, 20080901 17:21:23,       11, TIME, PL/SQL execution elapsed time           ,        172153,      15650,   172.15ms,    15.65ms
<strong>DATA,     148, 20080901 17:21:23,       11, TIME, DB CPU                                  ,       8751163,     795560,      8.75s,   795.56ms
</strong><strong>DATA,     148, 20080901 17:21:23,       11, TIME, sql execute elapsed time                ,       8749139,     795376,      8.75s,   795.38ms
</strong>DATA,     148, 20080901 17:21:23,       11, TIME, hard parse (sharing criteria) elapsed ti,          2367,        215,     2.37ms,   215.18us
<strong>DATA,     148, 20080901 17:21:23,       11, TIME, DB time                                 ,       8751163,     795560,      8.75s,   795.56ms
</strong>DATA,     148, 20080901 17:21:23,       11, WAIT, SQL*Net message to client               ,             3,          0,        3us,      .27us
DATA,     148, 20080901 17:21:23,       11, WAIT, SQL*Net message from client             ,      -1442849,    -131168,     -1.44s,  -131.17ms
--  End of snap 1

PL/SQL procedure successfully completed.

</code></pre>
<p>You see the DB Time is still equal to DB CPU, thus all response was spent on CPU. Moreover, the time model stats tell me that all the elapsed time was spent executing SQL statements (which makes sense as I&#8217;m running INSERT SQL in a tight loop).</p>
<p>So, for some reason the INSERT SQL statements take almost 3 times longer when inserting to SYSTEM tablespace.</p>
<p>In cases where there is no waiting going on (things like Oracle Wait Interface / WaitProf would not help here), but only difference in CPU time, I normally look into V$SESSTAT counter differences first. But as seen from above, there is no major difference in counters such logical IOs (db block gets, consistent gets) to justify 3x increase in CPU usage (also there was no major difference in other counters which I have excluded for brevity).</p>
<p>So, the obvious next step is to sample the process stack during execution of both statements and see where the difference is. In my previous <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL2Jsb2cudGFuZWxwb2Rlci5jb20vMjAwNy8wOS8wNi9hZHZhbmNlZC1vcmFjbGUtdHJvdWJsZXNob290aW5nLWd1aWRlLXBhcnQtMy1tb3JlLWFkdmVudHVyZXMtaW4tcHJvY2Vzcy1zdGFjay8=" target=\"_blank\">posts</a> I&#8217;ve used pstack for getting a few stack samples out of a process and a pstack loop could be used here as well.</p>
<p>But for demo purposes (and convenience) I will use my <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL3VuaXgvZHN0YWNrcHJvZi5zaA==">DTrace stack sampling script</a> today.</p>
<p>It&#8217;s dead easy to sample user process and kernel stacks using DTrace in Solaris 10 (and MacOS X and FreeBSD too). However the DTrace stack aggregator also takes account of the current program counter offset inside functions when distinguishing between stacks (this means that CPU program counter location <em>main() + 0&#215;10</em> will be different from <em>main() + 0&#215;12</em>). But I don&#8217;t want such granularity, I want to see where the execution was at function level, not instruction level.</p>
<p>So I wrote my script to strip off the PC function offsets from DTrace output and order the stacks the way that most frequently sampled stack &#8220;branch&#8221; is displayed last, in bottom of the output. So, whatever stack and function you see outputted last, is probably the &#8220;troublemaker&#8221;. I still recommend scrolling up and reviewing few other stacks too.</p>
<p>So, I ran the insert to USERS tablespace table again and ran my <a href="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?url=aHR0cDovL3d3dy50YW5lbHBvZGVyLmNvbS9maWxlcy9zY3JpcHRzL3VuaXgvZHN0YWNrcHJvZi5zaA==">dstackprof</a> script on the server process with SPID 859.</p>
<p>This is from the &#8220;well behaving&#8221; process, so normally you should just compare this stack profile to the &#8220;bad behaving&#8221; process stack profile and see if you can spot any major differences in these. I will explain few of the functions on the go.</p>
<pre><code><strong># ./dstackprof.sh 859
</strong>
DStackProf v1.02 by Tanel Poder ( http://www.tanelpoder.com )
Sampling pid 859 for 5 seconds with stack depth of 100 frames...

147 samples with stack below
__________________
kcbchg1_main
kcbchg1                 &lt;- buffer change function in buffer cache layer
ktuchg                    &lt;- undo layer generate change
ktbchg2                  &lt;- block (header) change
kdtchg                    &lt;- data layer generate change
kdtwrp
kdtInsRow              &lt;- data layer row insert function
insrowFastPath
insdrvFastPath
inscovexe
insExecStmtExecIniEngine
insexe                   &lt;- OPI insert execution entry point
opiexe                   &lt;- OPI execute (executing my recursive INSERT INTO statement)
opipls
opiodr
rpidrus                  &lt;- all the RPI functions belong to recursive program interface (which allows making recursive calls in db)
skgmstack
rpidru                    &lt;- these RPI functions (and few OPI ones above) are what PL/SQL context switches physically are
rpiswu2
rpidrv
psddr0                  &lt;- most of the functions starting with "p" are PL/SQL execution ones
psdnal
pevm_EXECC
pfrinstr_EXECC
pfrrun_no_tool
pfrrun
plsql_run
peicnt                    &lt;- PL/SQL execution engine entry point
kkxexe                  &lt;- cursor to PL/SQL execution interface
opiexe                   &lt;- OPI call execution dispatcher
kpoal8
opiodr
ttcpip
opitsk
opiino
opiodr
opidrv
sou2o
opimai_real
main
0xe54ffc

193 samples with stack below
__________________
libc.so.1`times         &lt;-- during these 193 samples the execution was in opiexe (note that
opiexe                   -- it's a recursive OPI call execution (as there's the RPI layer below in the
opipls                    -- stack and there's another opiexe below. The lower opiexe is for executing
opiodr                   -- my PL/SQL anonymous block and upper is for the recursive INSERT statement
rpidrus
skgmstack
rpidru
rpiswu2
rpidrv
psddr0
psdnal
pevm_EXECC
pfrinstr_EXECC
pfrrun_no_tool
pfrrun
plsql_run
peicnt
kkxexe
opiexe
kpoal8
opiodr
ttcpip
opitsk
opiino
opiodr
opidrv
sou2o
opimai_real
main
0xe54ffc

3147 Total samples captured

</code></pre>
<p>Anyway, the two stacks which I have copied here don&#8217;t show anything obvious. And also, see that there were total 3147 samples taken when the process was on CPU. But the execution point was in the &#8220;top&#8221; stack during only 193 samples, which is around 6% of all samples.</p>
<p>As the above stacks come from a well-behaving process, let&#8217;s leave their deeper analysis and just compare them to the bad-behaving process. So, I ran the SYSTEM tablespace insert again (you still remember, we were troubleshooting why inserts into system tablespace were slower, right ;)</p>
<p>The top stack in profile was following:</p>
<pre><code># ./dstackprof.sh <strong>859</strong>

DStackProf v1.02 by Tanel Poder ( http://www.tanelpoder.com )
Sampling pid 859 for 5 seconds with stack depth of 100 frames...

<em>[...some output snipped...]
</em>

917 samples with stack below
__________________
libc.so.1`qsort
kd4_entries_overlap
kdb4chk
kd4chk
kcbchk
<strong>kco_blkchk</strong>    <em>&lt;-- what the heck is this? the name suggests that it's block checking of some kind</em>
kcoapl
kcbapl
kcrfw_redo_gen             &lt;-- also note this function, looks like the block change function (kcbchg) has called
kcbchg1_main                   &lt;-- a function to generate redo on its behalf
kcbchg1
ktuchg
ktbchg2
kdtchg
kdtwrp
kdtInsRow
insrowFastPath
insdrvFastPath
inscovexe
insExecStmtExecIniEngine
insexe
opiexe
opipls
opiodr
rpidrus
skgmstack
rpidru
rpiswu2
rpidrv
psddr0
psdnal
pevm_EXECC
pfrinstr_EXECC
pfrrun_no_tool
pfrrun
plsql_run
peicnt
kkxexe
opiexe
kpoal8
opiodr
ttcpip
opitsk
opiino
opiodr
opidrv
sou2o
opimai_real
main
0xe54ffc

6202 Total samples captured

</code></pre>
<p>As most of the top stacks from the bad behaving process were spending their time somewhere in kco_blkchk() function, it rang a bell &#8211; database block checking!</p>
<p>However it was weird as the <em>db_block_checking</em> parameter was set false in my instance:</p>
<pre><code>SQL&gt; show parameter db_block_checking

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
db_block_checking                    string      FALSE
SQL&gt;

</code></pre>
<p>After a little digging, I found this parameter:</p>
<pre><code>NAME                          VALUE     DESCRIPTION
----------------------------- --------- -------------------------------------------------------------
_db_always_check_system_ts    TRUE      Always perform block check and checksum for System tablespace

</code></pre>
<p>This parameter makes Oracle (since version 8.1.6) do block structure checking in SYSTEM tablespace even if db_block_checking itself is set to FALSE. Considering the criticality of data dictionary objects, it&#8217;s a good idea.</p>
<p>So, now you might say: <em>so what that there&#8217;s a performance issue, you shouldn&#8217;t keep your application tables in SYSTEM tablespace anyway!</em></p>
<p>Well, I have three words for you: <strong>AUD$, FGA_LOG$, DEF$_AQCALL</strong> :)</p>
<p>If you rely on auditing functionality heavily and especially if you use advanced replication, having the crucial tables (and associated LOB segments) in SYSTEM tablespace can mean quite a hit to your CPUs and system throughput. In which cases you probably would want to explore the option of moving these tables out of system tablespace.</p>
<p><strong>Warning</strong>: before moving anything, consult Metalink or Oracle Support as only few strictly defined ways for moving these object are supported. See notes 731908.1 for Audit and 1037317.6 for Advanced Replication.</p>
<p>So, the key point of this article is that I could have started guessing or setting various events or undocumented parameters for probing Oracle here and there, for figuring out what&#8217;s going on. But I used a simple systematic approach instead. When Oracle Wait Interface and V$SESSTAT counters didn&#8217;t help, looked into process stack next. And this gave me the vital clue (kdo_blkchk) that there&#8217;s block checking going on. And if I hadn&#8217;t figured out the cause myself, I would have had some solid evidence for support.</p>
<p>Note that you don&#8217;t have to use DTrace for this kind of stack profiling (even though it is very convenient). On non-Solaris 10 OS&#8217;es I would use whatever safe stack dump mechanism is available (pstack for example) and post-process the results. I probably wouldn&#8217;t take thousands of samples though, but probably 100 or even less.</p>
<p>Also, if you want to get a complete overview of the process execution profile, you would need to do much more complex aggregations, as you&#8217;d need to aggregate &#8220;child&#8221; branches of stack samples with their &#8220;parents&#8221; (currently the samples of a backtrace &#8220;a-&gt;b-&gt;c&#8221; are not accounted in &#8220;a-&gt;b&#8221; for example). But for my case (just checking if there&#8217;s any major difference between execution profiles) this approach was good enough.</p>
<p>By the way (if you haven&#8217;t given up yet :), check the differences in LatchProfX output. The top is from well-behaving process and bottom one is the bad one:</p>
<pre><code>SQL&gt; @latchprofx sid,name,ksllwnam,ksllwlbl 128 % 1000000

-- LatchProfX 1.06 by Tanel Poder ( http://www.tanelpoder.com )

       SID NAME                                KSLLWNAM                                 KSLLWLBL                   Held       Gets  Held %     Held ms Avg hold ms
---------- ----------------------------------- ---------------------------------------- -------------------- ---------- ---------- ------- ----------- -----------
<strong>       128 redo copy                           kcrfw_redo_gen: nowait                   latch                     49399      48937    4.94     674.790        .014
</strong>       128 cache buffers chains                kcbchg1: kslbegin: bufs not pinned                                  4574       2320     .46      62.481        .027
       128 cache buffers chains                kcbchg1: kslbegin: call CR func                                     2101       1445     .21      28.700        .020
       128 cache buffers chains                kcbgcur_2                                                           1323       1060     .13      18.072        .017
       128 cache buffers chains                kcbnew: new latch again                                              269        243     .03       3.675        .015
       128 simulator lru latch                 kcbs_simulate: simulate set              setid                       170        170     .02       2.322        .014
       128 redo allocation                     kcrfw_redo_gen: redo allocation 1        strand #                     68         13     .01        .929        .071
       128 cache buffers chains                kcbrls_2                                                              29         29     .00        .396        .014
       128 cache buffers chains                kcbgtcr: fast path                                                    25         25     .00        .342        .014
       128 cache buffers chains                kcbgcur: fast path (shr)                                               8          8     .00        .109        .014
       128 cache buffers lru chain             kcbzgws_1                                                              7          7     .00        .096        .014

SQL&gt; @latchprofx sid,name,ksllwnam,ksllwlbl 128 % 1000000

-- LatchProfX 1.06 by Tanel Poder ( http://www.tanelpoder.com )

       SID NAME                                KSLLWNAM                                 KSLLWLBL                   Held       Gets  Held %     Held ms Avg hold ms
---------- ----------------------------------- ---------------------------------------- -------------------- ---------- ---------- ------- ----------- -----------
<strong>       128 redo copy                           kcrfw_redo_gen: nowait                   latch                    356465      79839   35.65    6088.422        .076
</strong>       128 cache buffers chains                kcbchg1: kslbegin: bufs not pinned                                  4704       2233     .47      80.344        .036
       128 cache buffers chains                kcbchg1: kslbegin: call CR func                                     3542       1997     .35      60.497        .030
       128 cache buffers chains                kcbgcur_2                                                           1929       1448     .19      32.947        .023
       128 redo allocation                     kcrfw_redo_gen: redo allocation 1        strand #                   1018         34     .10      17.387        .511
       128 shared pool simulator               kglsim_unpin_simhp: fast path                                        713        713     .07      12.178        .017
       128 simulator lru latch                 kcbs_simulate: simulate set              setid                       266        266     .03       4.543        .017
       128 cache buffers chains                kcbnew: new latch again                                              182        171     .02       3.109        .018
       128 cache buffers chains                kcbgtcr: fast path (cr pin)                                           29         29     .00        .495        .017
       128 cache buffers chains                kcbgtcr: fast path                                                    15         15     .00        .256        .017

</code></pre>
<p>See, how the redo copy latches were held for much longer (both total time and average time of individual hold) when the block checking was enabled. This indicates that in addition to keeping the buffer pinned, also a redo copy latch is held during block checking (if not using an IMU transaction). If you use high-throughput advanced replication, think that CPU usage is too high for work done and can&#8217;t get rid of the redo copy latch contention, this may be a reason.</p>
 <img src="http://blog.tanelpoder.com/wp-content/plugins/wordpress-feed-statistics/feed-statistics.php?view=1&post_id=132" width="1" height="1" style="display: none;" />]]></content:encoded>
			<wfw:commentRss>http://blog.tanelpoder.com/2008/09/02/oracle-hidden-costs-revealed-part2-using-dtrace-to-find-why-writes-in-system-tablespace-are-slower-than-in-others/feed/</wfw:commentRss>
		<slash:comments>29</slash:comments>
		</item>
	</channel>
</rss>

