Search Results

Keyword: ‘latchprof latchprofx’

New versions of LatchProf and LatchProfX for latch contention troubleshooting and tuning

February 15th, 2010

The LatchProf and LatchProfX scripts allow you to be more systematic with latch contention troubleshooting and tuning. No more guesswork is needed as these scripts give you exact session IDs and in this version also SQLIDs of the troublemaking applications.

You can download the new versions here:

Example output (with SQLID info) is below:

SQL> @latchprof name,sid,sqlid % % 100000

-- LatchProf 1.21 by Tanel Poder ( http://www.tanelpoder.com )

NAME                                       SID SQLID               Held       Gets  Held %     Held ms Avg hold ms
----------------------------------- ---------- ------------- ---------- ---------- ------- ----------- -----------
cache buffers chains                       133 3jbwa65aqmkvm        349        348     .35      14.169        .041
simulator lru latch                        133 3jbwa65aqmkvm         51         51     .05       2.071        .041
row cache objects                          133 3jbwa65aqmkvm          5          5     .01        .203        .041
cache buffers chains                        24                        5          5     .01        .203        .041
cache buffers chains                       149 3jbwa65aqmkvm          3          3     .00        .122        .041
resmgr group change latch                   33 147a57cxq3w5y          2          2     .00        .081        .041
cache buffers chains                         9 5raw2bzx227wp          2          1     .00        .081        .081
In memory undo latch                       149 f3y38zthh270n          2          1     .00        .081        .081
active checkpoint queue latch                5                        2          1     .00        .081        .081
cache buffers chains                       149 75621g9y3xmvd          2          2     .00        .081        .041
cache buffers chains                         9 gvgdv2v90wfa7          2          2     .00        .081        .041
cache buffers chains                        33 75621g9y3xmvd          2          2     .00        .081        .041
checkpoint queue latch                       5                        1          1     .00        .041        .041
ksuosstats global area                       8                        1          1     .00        .041        .041
cache buffers lru chain                    133 3jbwa65aqmkvm          1          1     .00        .041        .041
multiblock read objects                    155 75ju2gn3s8009          1          1     .00        .041        .041
resmgr group change latch                    9 0w2qpuc6u2zsp          1          1     .00        .041        .041
resmgr group change latch                   33 apgb2g9q2zjh1          1          1     .00        .041        .041
resmgr group change latch                  133 apgb2g9q2zjh1          1          1     .00        .041        .041
space background task latch                 17                        1          1     .00        .041        .041
cache buffers chains                       149 5raw2bzx227wp          1          1     .00        .041        .041
cache buffers chains                        33 5raw2bzx227wp          1          1     .00        .041        .041
cache buffers chains                        33 05s4vdwsf5802          1          1     .00        .041        .041
cache buffers chains                        31 0yas01u2p9ch4          1          1     .00        .041        .041
cache buffers chains                         9 41zu158rqf4kf          1          1     .00        .041        .041
In memory undo latch                        33 0bzhqhhj9mpaa          1          1     .00        .041        .041
In memory undo latch                        31 gvgdv2v90wfa7          1          1     .00        .041        .041
In memory undo latch                         9 gvgdv2v90wfa7          1          1     .00        .041        .041
simulator lru latch                        149 3jbwa65aqmkvm          1          1     .00        .041        .041
row cache objects                          133 5yq51dtyc6qf2          1          1     .00        .041        .041
SQL memory manager workarea list la        133 3jbwa65aqmkvm          1          1     .00        .041        .041
enqueues                                   141                        1          1     .00        .041        .041
row cache objects                          132                        1          1     .00        .041        .041

33 rows selected.

LatchProf scripts allow you to easily identify which session and SQLID (or sqlhash in 9i) cause the latch(es) to be held the most.

Let’s check what’s the most “latch-holding” SQL reported by LatchProf:

SQL> @sqlid 3jbwa65aqmkvm

HASH_VALUE  CH# SQL_TEXT
---------- ---- ------------------------------------------------------------------------------------------------------------------------------------------------------
1432996723    0 SELECT O.ORDER_ID, LINE_ITEM_ID, PRODUCT_ID, UNIT_PRICE, QUANTITY, ORDER_MODE, ORDER_STATUS, ORDER_TOTAL, SALES_REP_ID, PROMOTION_ID, C.CUSTOMER_ID,
 CUST_FIRST_NAME, CUST_LAST_NAME, CREDIT_LIMIT, CUST_EMAIL, ORDER_DATE FROM ORDERS O , ORDER_ITEMS OI, CUSTOMERS C WHERE O.ORDER_ID = OI.ORDER_ID AND
 O.CUSTOMER_ID = C.CUSTOMER_ID AND O.ORDER_STATUS <= 4

If you want to read more about the capabilities of LatchProf and LatchProfX, go here:

Note that the latest version (v1.21) also fixes a problem with Oracle 11.2 where the script execution plan order was wrong, causing the sampling to not happen in correct order. I added a NO_TRANSFORM_DISTINCT_AGG hint to disable a new transformation happening in 11.2 to make the scripts behave correctly…

  • Share/Bookmark

Tanel Poder Administration, Cool stuff, Oracle, Performance, Troubleshooting

KGL simulator, shared pool simulator and buffer cache simulator – what are these?

September 13th, 2009

Note: I accidentially published this note while I was writing it, so you may have seen a partial version of it. It’s complete now.

If you have queried v$sgastat on recent Oracle versions (by which I mean 9i and above) you probably have seen allocations for some sort of simulators in Oracle instance. Here’s an example:

SQL> select * from v$sgastat where lower(name) like '%sim%' order by name;

POOL         NAME                            BYTES
------------ -------------------------- ----------
shared pool  kglsim alloc latch area          1700
shared pool  kglsim alloc latches               68
shared pool  kglsim count of pinned he        9248
shared pool  kglsim free heap list             204
shared pool  kglsim free obj list              204
shared pool  kglsim hash table                4104
shared pool  kglsim hash table bkts        2097152
shared pool  kglsim heap                    635536
shared pool  kglsim latch area                1700
shared pool  kglsim latches                     68
shared pool  kglsim main lru count           87040
shared pool  kglsim main lru size           174080
shared pool  kglsim object batch            909440
shared pool  kglsim pin list arr               816
shared pool  kglsim recovery area             2112
shared pool  kglsim sga                      22092
shared pool  kglsim size of pinned mem       18496
shared pool  ksim client list                   84
shared pool  log_simultaneous_copies           480
shared pool  sim cache nbufs                   640
shared pool  sim cache sizes                   640
shared pool  sim kghx free lists                 4
shared pool  sim lru segments                  640
shared pool  sim segment hits                 1280
shared pool  sim segment num bufs              640
shared pool  sim state object                   48
shared pool  sim trace buf                    5140
shared pool  sim trace buf context             120
shared pool  sim_knlasg                       1200
shared pool  simulator hash buckets          16512
shared pool  simulator latch/bucket st        4608

31 rows selected.

See, a bunch of “kgl sim” and then just “sim” allocations.

… or sometimes you can see latch contention on following latches:

SQL> select name from v$latch where name like '%sim%';

NAME
-------------------------------------------------------
ksim membership request latch
simulator lru latch
simulator hash latch
sim partition latch
shared pool simulator
shared pool sim alloc

6 rows selected.

Again, there seems to be some “simulation” work going on in Oracle instance.

So what are these simulators about?

Read more…

  • Share/Bookmark

Tanel Poder Administration, Internals, Oracle, Performance

latch: cache buffers chains latch contention – a better way for finding the hot block

August 27th, 2009

Here’s a treat for Oracle performance professionals and geeks who are looking for more systematic ways for cache buffers chains (CBC) latch contention troubleshooting. Cache buffers chains latches are taken when a process wants to walk through a cache buffer hash chain, looking if the block with required DBA (data block address) is in buffer cache. If the block happens to be in cache, then in most cases it has to be pinned first before use and unpinned after use, to make sure no-one else can perform an incompatible operation on that block at the same time. The modification of pin structures (pinning/unpinning) is also protected by CBC latches.

CBC latch contention can happen for multiple reasons, but one reason is that there is some really hot block in a SMP system with high number of CPUs (or CMT system with high number of threads like Sun T-series servers). Sometimes there happen to be multiple moderately hot blocks “under” the same CBC latch, which can result in latch contention again.

Traditionally DBAs used to look up the child latch address from V$SESSION_WAIT, sql_trace output or ASH and then look up all buffers protected by that latch from X$BH, using HLADDR column (HLADDR stands for Hash Latch Address). I also have a script for that, bhla.sql (Buffer Headers by Latch Address), which reports me all blocks currently in buffer cache, “under” that particular latch and the corresponding data block addresses and object names:

  • Share/Bookmark

Tanel Poder Cool stuff, Internals, Oracle, Performance, Tools, Troubleshooting

Done with my seminar in Netherlands – here’s some feedback and couple beta new versions of my scripts

April 5th, 2009

The Miracle Masterclass I delivered last week in Netherlands was awesome, I had lots of attendees and lots of good questions and feedback. As usual, I ran out of time in the end though. I will make this up with some articles about the stuff I didn’t have time to explain :)

I used Snapper v2.0 beta during my demos there and also an improved version of LatchProfX which can give some additional context information about reasons why latches were taken. As they are beta, not final versions, I have uploaded them into a temporary location:

Update: These scripts are out of beta now so here are their permanent locations:

Also, the Sesspack (session level statspack) v0.05 files are in my TPT script set, under tools/sesspack_v0.05 directory in the zip file:

Toine van Beckhoven attended the seminar as well and wrote a review of it, if you’re still hesitating if you should attend as well, read his review first :)

  • Share/Bookmark

Tanel Poder Cool stuff, Oracle, Performance, Training, Troubleshooting

Another LatchProfX use case

March 20th, 2009

Riyaj Shamsudeen wrote an excellent article about systematic latch contention troubleshooting.

Especially if the latch contention problem is ongoing, looking into system wide stats (like v$latch.sleep columns) is not the best idea in busy systems. This may sometimes lead you to fixing the wrong problem.

This is because sometimes the latch contention is not caused by some system wide inefficiency but rather by one or few sessions.

The right approach would be to measure the following things:

  1. Which latch (and exact child latches) the problem session is waiting for (query v$session_wait, ASH, sql_trace)
  2. Why is the problem session trying to get that latch so often (query v$sesstat counters for problem session and check execution plan if only single/few statements executed)
  3. Why this latch is busy: who’s holding it (query v$latchholder, LatchProf)
  4. Why is that someone holding the latch so much (query v$sesstat counters for that session or run LatchProfX)

Riyaj used this approach and successfully found out the troublemaker causing heavy library cache latch contention. He used my LatchProfX tool for part of the diagnosis (and I’m very happy to see that my advanced oracle troubleshooting tools find real world use in hands of other troubleshooters/tuners too!).

Here’s an excerpt from latchprofx output (from a little test case I put together), showing which session is holding which library cache latch how much, hold mode (shared/exclusive) and also the actual reason (function name) why the latch is held:

SQL> @latchprofx sid,name,laddr,ksllwnam,ksllwlbl,hmode 159 % 100000

-- LatchProfX 1.08 by Tanel Poder ( http://www.tanelpoder.com )

 SID NAME           LADDR            KSLLWNAM  KSLLWLBL  HMODE        Held  Gets  Held %  Held ms Avg hold ms
---- -------------- ---------------- --------- --------- ---------- ------ ----- ------- -------- -----------
 159 library cache  00000003A93513E0 kglic     child     exclusive   10044  9314   10.04  301.320        .032
 159 library cache  00000003A9351340 kglic     child     exclusive    9895  8458    9.90  296.850        .035
 159 library cache  00000003A9351660 kglic     child     exclusive    9761  8440    9.76  292.830        .035
 159 library cache  00000003A9351700 kglic     child     exclusive    9737  8924    9.74  292.110        .033
 159 library cache  00000003A9351480 kglic     child     exclusive    8999  7765    9.00  269.970        .035
 159 library cache  00000003A93515C0 kglic     child     exclusive    8553  7832    8.55  256.590        .033
 159 library cache  00000003A9351520 kglic     child     exclusive    6852  6828    6.85  205.560        .030

7 rows selected.

kglic means Kernel Generic Library cache Iterate Chain (AFAIK), it’s the function which is executed when you access most X$KGL tables. And this is where Riyaj got the evidence that the latch contention comes from inefficient scanning of library cache, caused by a session running queries against V$SQL_PLAN views with a bad plan.

So, check out Riyaj’s articele, my LatchProfX script, my Latch & Mutex contention troubleshooting conference slides and if you want more, then my Advanced Oracle Troubleshooting Seminar. I recently added Dallas, Denver, Salt Lake City and Hong-Kong to the list of cities where you can attend it this year!

  • Share/Bookmark

Tanel Poder Internals, Oracle, Performance, Tools, Troubleshooting

Advanced Oracle Troubleshooting @OOW 2008 presentation slides and scripts

September 26th, 2008

My today morning’s Advanced Oracle Troubleshooting presentation was quite a success. The 400 people room was fully pre-booked and over 300 people showed up. This is a pretty good result considering that a) my presentation was an 9 am one on the last day of conference and b) there was a big party last night :)

Thanks to everyone who attended!

I have uploaded the slides here: http://www.tanelpoder.com/files/Advanced_Oracle_Troubleshooting.pdf

PerfSheet (the Excel based automatic data visualization tool) is uploaded here: http://www.tanelpoder.com/files/PerfSheet.zip

The TPT scripts (scripts like snapper, latchprofx, ostackprof etc I demoed and a lot more) are uploaded here: http://www.tanelpoder.com/files/TPT_public.zip

Few links to my past blog entries about how to use Snapper, LatchProfX and os_explain:

I haven’t blogged about usage of OStackProf yet (as the tool’s very fresh), I will do so in coming week.

Feel free to send any further questions to tanel@tanelpoder.com

Also, if you liked my presentation, you will sure like my 2-day seminar on the same topic. I will go even deeper in there ;-)

Check out this page: http://blog.tanelpoder.com/seminar/. I’ve arranged my seminar in 18 cities already for 2008 and early 2009, if your city is not in the list, let me know and I’ll see what I can do :)

Thanks once more for everyone attending my OOW presentation and also my blog readers for continued support!

  • Share/Bookmark

Tanel Poder Cool stuff, Internals, Oracle, Oracle 11g, Performance, Troubleshooting

Oracle hidden costs revealed, Part2 – Using DTrace to find why writes in SYSTEM tablespace are slower than in others

September 2nd, 2008

I have written two posts in one, about a performance issue with writes in system tablespace and introduction of a little DTrace stack sampling script.

Have you noticed that DML on tables residing in SYSTEM tablespace is slower than tables in other tablespaces?

Here’s an example, I’ll create two similar tables, one in USERS tablespace, other in SYSTEM, and inset into the first one (Oracle 10.2.0.3 on Solaris x64):

SQL> create table t1(a int) tablespace USERS;

Table created.

SQL> create table t2(a int) tablespace SYSTEM;

Table created.

SQL> exec for i in 1..100000 loop insert into t1 values(i); end loop;

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.09

Insert into table in USERS tablespace took 3 seconds.

Ok, let’s commit and flush dirty buffers that they wouldn’t get on the way of next insert.

SQL> commit;

Commit complete.

Elapsed: 00:00:00.02

SQL> alter system checkpoint; -- checkpointing to flush dirty buffers from previous inserts

System altered.

Elapsed: 00:00:01.34
SQL>

And now to the insert into the SYSTEM tablespace table:

SQL> exec for i in 1..100000 loop insert into t2 values(i); end loop;

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.98
SQL>

What?! The same insert took 3 times longer, almost 9 seconds?

Fine! Let’s troubleshoot it!

Read more…

  • Share/Bookmark

Tanel Poder Administration, DTrace, Internals, Oracle, Performance, Tools

Flexible sampling of any V$ or X$ view with sample.sql

August 26th, 2008

In recent past I’ve blogged few scripts which use specially crafted ordered nested loop for sampling contents of V$ and X$ views fast, with plain SQL.

If you haven’t read them yet, here are the links:

I wrote the above scripts having special purposes in mind (e.g. profile session waits or latching activity).

Now I introduce a simple but powerful sqlplus script for ad-hoc sampling of any V$ view. It’s called…. (drumroll) …. sample.sql :)

When you look into it, the script is actually very simple. It’s just using power of sqlplus substitution variables, I can pass the sampled column and table names and sampling conditions in to the script dynamically.

The basic syntax is:

@sample column_name table_name filter_condition num_samples

For example, let say I have a session which executes lots of SQL statements in a loop and I want to have a quick overview of what’s the TOP SQL statement for a session right now.

Read more…

  • Share/Bookmark

Tanel Poder Internals, Oracle, Performance, Troubleshooting

Advanced Oracle Troubleshooting Guide, Part 8: Even more detailed latch troubleshooting using LatchProfX

July 23rd, 2008

In my last AOT post I published my LatchProf script which is able to sample detailed latchholder data from V$LATCHHOLDER.

Latchprof allows you to drill down into your latching problems at session level (which V$LATCH, V$LATCH_PARENT and V$LATCH_CHILDREN can’t do). It allows you to get valuable details about individual sessions who are holding a latch the most, therefore likely contributing to the latch contention problem the most.

However after you have discovered the troublemaking session, then what next? One way forward is looking into V$SESSTAT counters using Snapper tool. Depending on what latch is the problematic one, you would look for different stats like various buffer get stats for cache buffers chains latches and parsing/executing stats when looking into library cache latches. However if those stats look “normal”, is there any other way do drill down further?

Yeah, there is and lets look into it!

Read more…

  • Share/Bookmark

Tanel Poder Cool stuff, Internals, Oracle, Performance, Tools, Troubleshooting

Advanced Oracle Troubleshooting Guide, Part 7: Sampling latch holder statistics using LatchProf

July 9th, 2008

I have been too busy since getting back from vacation, thus no posts for a while. But I hope the waiting was worthwhile as I present you LatchProf, a tool for digging in to latch contention problems – using plain SQL and sqlplus!

As, I’m still busy, I make it short.

LatchProf is a script similar to WaitProf, only it samples latch holder statistics from V$LATCHHOLDER. As V$LATCHHOLDER contains a SID column (with session ID of a latch holder) it becomes possible to find who is hitting a latch the most (a way to prove that crappy monitoring tools which constantly scan through V$SQL DO cause library cache latch contention themselves).

Read more…

  • Share/Bookmark

Tanel Poder Cool stuff, Internals, Oracle, Tools, Troubleshooting