Advanced Oracle Troubleshooting Guide, Part 9 – Process stack profiling from sqlplus using OStackProf

I have mentioned ORADEBUG SHORT_STACK command in my blog posts before – it’s an easy way to get and see target processes stack backtrace directly in sqlplus. No need to log on to the Unix/Windows server and use OS tools for extracting the stack.

I have also written few tools which allow you to post-process stack traces taken using OS tools (like pstack) for better readability or performance profiling. For example os_explain and DStackProf – the DTrace stack profiler and function call aggregator.

Now I will introduce OStackProf which combines the ORADEBUG SHORT_STACK with a client side post-processing script for easy stack profiling directly from SQLPLUS – no need to log on to the server host at all!

I have demonstrated this script at conferences for couple of months now with a promise to blog about it “soon”, but it’s only now that I finally have a chance to write a blog entry about it (I have couple of hours left on a flight to help a client in London). Sorry to keep you waiting ;-)

So, this is what you see when you run oradebug short_stack on a process:

SQL> oradebug setospid 32200
Oracle pid: 2, Unix process pid: 32200, image: oracle@linux03 (PMON)
SQL>
SQL> oradebug short_stack
<-ksedsts()+275<-ksdxfstk()+22<-ksdxcb()+1599<-sspuser()+102<-__kernel_vsyscall()+2<-ntevpque()+89<-ntevqone()+34<-nsevwait()+10098<-ksnwait()+72<-ksliwat()+7249<-kslwaitctx()+135<-ksuclnwt()+249<-ksucln()+509<-ksbrdp()+1258<-opirip()+548<-opidrv()+500<-sou2o()+71<-opimai_real()+238<-ssthrdmain()+142<-main()+116<-__libc_start_main()+220<-_start()+33
SQL>

This is a stack trace, the leftmost function is where the execution currently was at the moment of taking the stack backtrace, the one to the right from it is the caller of the left function and so on. All the way to the right you see the “bottom” functions of a stack, like C main() function and also _start() which is Linux OS process loader helper function.

Note the two bold functions. Oradebug short_stack works the way that whenever it needs to let target process know about a debugger request, it sends a SIGUSR2 signal to it. sspuser() is the signal handler for SIGUSR2 and its task is to see what debug function to call. So, the sspuser() function and anything to the left from it is actually the codepath for processing the oradebug request, so we can ignore that for troubleshooting purposes (this also evidences that oradebug short_stack and dump errorstack as matter of fact do stray Oracle from its normal execution path – OS tools don’t do that, they just suspend the process and read what’s needed from process stack frames and mapped executable symbol sections).

So, the real “business” function the target was in, was __kernel_vsyscall() which in Linux means the process was doing some sort of syscall. I’m not going further in explaining the interpretation of function names here as I’ve done it in my previous AOT posts (and will continue so in the future).

I will continue on stack profiling topics. Taking only single stack backtrace can be helpful in cases when the target process is completely stuck, but if you want to diagnose the cause for just bad performance, then you need to take multiple stack backtraces, aggregate them and see into which execution “branch” of the codepath do the most stack samples fall in (as DStackProf does for example). Of course you would do this only after you’ve exhausted the step 1 and 2 in normal Oracle troubleshooting process (1. Look into wait interface data for the session, 2. Look into v$sesstat counters for the session).

Manually aggregating the stack traces is time consuming, error prone and did I mention boring, so now I finally introduce OStackProf v1.00!

OStackProf v1.00 can use an Oracle server version on any platform where the oradebug short_stack works (see Issues section in the end for some warnings) but the first version only works on Windows clients. The reason is that I mostly do my consulting and troubleshooting from a Windows workstation / sqlplus. Therefore I have written the stack aggregator in VBScript as this doesn’t require any additional scripting language installations at my client sites. Quick and easy. I will write a Unix version of this aggregator when I have more time (and need ;)

OStackProf is executed the following way:

@ostackprof SID INTERVAL #SAMPLES

Where SID is the target Session ID, INTERVAL is the number of seconds to sleep between samples (you can also specify zero or fractions like 0.1) and #SAMPLES is the number of samples to take.

I normally run it this way:

SQL> @ostackprof 136 0 100

-- oStackProf v1.00 - EXPERIMENTAL script by Tanel Poder ( http://www.tanelpoder.com )

WARNING! This script can crash the target process on Oracle 9.2 on Windows
and maybe other versions as well. Test this script out thorouhgly in your
dev environment first!
Hit CTRL+C to cancel, ENTER to continue...

Sampling...

Below is the stack prefix common to all samples:
------------------------------------------------------------------------
Frame->function()
------------------------------------------------------------------------
# 24 ->__libc_start_main()
# 23  ->main()
# 22   ->ssthrdmain()
# 21    ->opimai_real()
# 20     ->sou2o()
# 19      ->opidrv()
# 18       ->opiodr()
# 17        ->opiino()
# 16         ->opitsk()
# 15          ->ttcpip()
# 14           ->opiodr()
# 13            ->kpoal8()
# 12             ->opifch2()
# 11              ->qergsFetch()
# 10               ->qercoFetch()
#  9                ->qerjotFetch()
#  8                 ->qerjotFetch()
#  7                  ->qertbFetch()
#  6                   ->kdsttgr()
#  5                    ->kdstf0100101000km()
#  4                     ->__PGOSF476_qerjotRowProc()
#  3                      ->qertbFetch()
#  ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------
    40 ->kdsttgr()->kdstf0100101000km()->kaf4reasrp0km()->->
    34 ->kdsttgr()->kdstf0100101000km()->->
     5 ->kdsttgr()->kdstf0100101000km()->kdst_fetch.()->ktrget2()->kcbgtcr()->->
     3 ->kdsttgr()->kdstf0100101000km()->kdst_fetch.()->ktrget2()->ktrgcm()->->
     3 ->kdsttgr()->kdstf0100101000km()->expeal()->evareo()->lmebco()->->
     3 ->kdsttgr()->kdstf0100101000km()->expeal()->evareo()->->
     2 ->kdsttgr()->kdstf0100101000km()->kdst_fetch.()->kcbrls()->kcbzfs()->->
     2 ->kdsttgr()->kdstf0100101000km()->expeal()->lmebco()->->
     1 ->kdsttgr()->kdstf0100101000km()->qeaeCn1Serial()->->
     1 ->kdsttgr()->kdstf0100101000km()->qeaeCn1Serial()->->
     1 ->kdsttgr()->kdstf0100101000km()->kdst_fetch.()->ktrget2()->kcbgtcr()->kcbs_simulate()->kcbsacc()->->
     1 ->kdsttgr()->kdstf0100101000km()->kdst_fetch.()->ktrget2()->->
     1 ->kdsttgr()->kdstf0100101000km()->kdst_fetch.()->->
     1 ->kdsttgr()->kdstf0100101000km()->expeal()->->
     1 ->kdsttgr()->kaf4reasrp0km()->->
     1 ->kdsttgr()->__PGOSF470_qercoRop()->->

SQL>

The output comes in two sections, first is the “bottom” part of stack trace common to all samples taken. Therefore we can immediately see that during 100% of the samples Oracle was executing qertbFetch() function (or some of its children as the second output section shows).

And for the bottom section you see a profile of child function calls all executed by the function qertbFetch() which is the “deepest” one in the upper stack section. You see that 40 samples of 100 (roughly 40% of time) the execution was in following branch of function call tree: “->kdsttgr()->kdstf0100101000km()->kaf4reasrp0km()” and during 34% of the time the execution happened to be in
branch “->kdsttgr()->kdstf0100101000km()”. It doesn’t make much sense looking into the lower lines as the top 2 account for 74% of the samples.

Nevertheless, you can visually identify that vast majority of execution time during the sampling was spent in kdstf0100101000km() functions which was called by kdsttgr() (which meaks Kernel Data Scan Table Get Row). The command that server process was executing was my lotslios.sql script which does lots of full table scans using nested loops. So that explains the numerous qertbFetch ( Query Execution Rowsource Fetch ) functions and kdst* functions which do table scans).

Here’s another example:

SQL> @ostackprof 136 0 100

-- oStackProf v1.00 - EXPERIMENTAL script by Tanel Poder ( http://www.tanelpoder.com )

WARNING! This script can crash the target process on Oracle 9.2 on Windows
and maybe other versions as well. Test this script out thorouhgly in your
dev environment first!
Hit CTRL+C to cancel, ENTER to continue...

Sampling...

Below is the stack prefix common to all samples:
------------------------------------------------------------------------
Frame->function()
------------------------------------------------------------------------
# 56 ->__libc_start_main()
# 55  ->main()
# 54   ->ssthrdmain()
# 53    ->opimai_real()
# 52     ->sou2o()
# 51      ->opidrv()
# 50       ->opiodr()
# 49        ->opiino()
# 48         ->opitsk()
# 47          ->ttcpip()
# 46           ->opiodr()
# 45            ->kpoal8()
# 44             ->opiexe()
# 43              ->kkxexe()
# 42               ->peicnt.()
# 41                ->plsql_run()
# 40                 ->pfrrun()
# 39                  ->pfrrun_no_tool()
# 38                   ->pfrinstr_BFTCHC()
# 37                    ->pevm_BFTCHC()
# 36                     ->psdnal()
# 35                      ->psddr0()
# 34                       ->rpidrv.()
# 33                        ->rpiswu2.()
# 32                         ->rpidru()
# 31                          ->skgmstack()
# 30                           ->__PGOSF126_rpidrus()
# 29                            ->opiodr()
# 28                             ->opipls()
# 27                              ->opiosq0()
# 26                               ->kksParseCursor()
# 25                                ->kkspbd0()
# 24                                 ->kksfbc()
# 23                                  ->kxsGetRuntimeLock()
# 22                                   ->kksLoadChild()
# 21                                    ->rpiswu2.()
# 20                                     ->kksFullTypeCheck()
# 19                                      ->opitca()
# 18                                       ->apadrv()
# 17                                        ->apaqbd.()
# 16                                         ->apaqbdListReverse()
# 15                                          ->apaqbdDescendents()
# 14                                           ->apakkoqb.()
# 13                                            ->kkoqbc.()
# 12                                             ->kkooqb()
#  ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------
     8 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->->
     4 ->kkocnp()->kkojnp()->->
     3 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kkofmx()->->
     3 ->kkocnp()->kkojnp()->kkejnc()->->
     3 ->kkocnp()->kkojnp()->kkeisj()->kkescl()->kkewat.()->slmtnpow()->pow()->->
     2 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kkofmx()->appget()->appopd()->qksfroChild.()->->
     2 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kkofmx()->appget()->appopd()->->
     2 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkepsl.()->kkeapr.()->kkewat.()->slmtnpow()->pow()->->
     2 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkeapr.()->kkewat.()->slmtnpow()->pow()->->
     2 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeGetMinNotNullSel()->ceil()->->
     2 ->kkocnp()->kkojnp()->kkotap.()->kkoitbp()->appget()->->
     2 ->kkocnp()->kkojnp()->kkejnc()->kkesrc.()->kkeSortCosts()->->
     2 ->kkocnp()->kkojnp()->kkeisj()->kkescl()->kkewat.()->slmtnpow()->__ieee754_pow()->->
     1 ->kkognp.()->kkogfp.()->kkoojt.()->->
     1 ->kkocnp()->kkojnp()->qkkFreeKey()->kghfre()->kghfrmrg()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkotsc()->kketac()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkoocl()->appget()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kkofmx()->appopd()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kkofmx()->appget()->applog()->applog()->appopd()->appopd()->appqbc()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kkofmx()->appget()->applog()->applog()->applog()->appopd()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kkofmx()->appget()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkesScaleIO()->__isnan()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkepsl.()->kkestRecColUsage.()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkepsl.()->kkeOperCost.()->kkeOperCost1()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkepsl.()->kkeapr.()->kkewat.()->slmtnpow()->pow()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeMarkIdxPreds()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkofsq.()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkestBuildMCCtxFromKeys()->kkestBuildMCCtx()->kkestAddMCPredToCtx()->kkestGetMCOpns()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkepsl.()->kkofsq.()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkepsl.()->kkestRecColUsage.()->ksl_get_shared_latch()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkepsl.()->kkestRecColUsage.()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkepsl.()->kkeEqJoinSel.()->kkeGetNDVForJoinSel()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkepsl.()->kkeapr.()->kkewat.()->__errno_location()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkepsl.()->kkeapr.()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkepsl.()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->kkeAdjIndexSel.()->qkshtGetOptEstimate()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->ceil()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeIdxSelectivity()->0AC591FB->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeGetMinNotNullSel()->kkeapr.()->kkewat.()->slmtnpow()->pow()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->kketac()->kkeidc()->kkeGetMinNotNullSel()->kkeapr.()->kkewat.()->slmtnpow()->__ieee754_pow()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->appget()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkonxc.()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkoitbp()->appget()->appopd()->appopd()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkoitbp()->appget()->applog()->appopd()->appopd()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkoiff()->kkonxc.()->kkoixc()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkoiff()->kkonxc.()->kkofmx()->->
     1 ->kkocnp()->kkojnp()->kkotap.()->kkecostb()->->
     1 ->kkocnp()->kkojnp()->kkoSwapOk.()->qkshtQBInfOrg()->qkshtTokenToEnum()->->
     1 ->kkocnp()->kkojnp()->kkosmsel()->qkkAllocFkk.()->->
     1 ->kkocnp()->kkojnp()->kkoPWJfull()->kkoqcp.()->kkoppar.()->qksfroGetPara.()->kkfdPaPrm.()->->
     1 ->kkocnp()->kkojnp()->kkoppar.()->qksfroGetPara.()->kkfdPaPrm.()->->
     1 ->kkocnp()->kkojnp()->kkogjro()->qkkFreeKey()->kghfre()->kghfrmrg()->->
     1 ->kkocnp()->kkojnp()->kkofhsmjk()->appopd()->->
     1 ->kkocnp()->kkojnp()->kkofhsmjk()->appget()->appopd()->->
     1 ->kkocnp()->kkojnp()->kkofhsmjk()->appget()->->
     1 ->kkocnp()->kkojnp()->kkejnc()->qerhjComputeFanoutAndBPS.()->ktatminextsz()->kqrprl()->kqreqd()->kglFreeSO()->->
     1 ->kkocnp()->kkojnp()->kkejnc()->qerhjComputeFanoutAndBPS.()->ktatminextsz()->kqrpre1()->kqrLockAndPinPo()->kglGetSO()->->
     1 ->kkocnp()->kkojnp()->kkejnc()->qerhjComputeFanoutAndBPS.()->->
     1 ->kkocnp()->kkojnp()->kkejnc()->kkesrc.()->ceil()->->
     1 ->kkocnp()->kkojnp()->kkejnc()->floor()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkestGetJoinMCSel()->kkestProductSel()->qksopOptASJLeftOpn()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkestGetJoinMCSel()->kkestProductSel()->qksopOptASJLeftOpn()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkestGetJoinMCSel()->kkestProductSel()->kkepsl.()->kkeEqJoinSel.()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkestGetJoinMCSel()->kkestProductSel()->kkepsl.()->kkeapr.()->kkewat.()->slmtnpow()->__ieee754_pow()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkestBuildJoinMCCtx()->kkestGetJoinColGroupHtEl()->kgghteFind()->kgghteGetEle()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkestBuildJoinMCCtx()->kkestGetJoinColGroupHtEl()->kgghteFind()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkestBuildJoinMCCtx()->kkestGetJoinColGroupHtEl()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkepsl.()->ksl_get_shared_latch()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkepsl.()->kkeEqJoinSel.()->kkeast()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkepsl.()->kkeEqJoinSel.()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkepsl.()->kkeapr.()->kkewat.()->slmtnpow()->pow()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkepsl.()->kkeapr.()->kkewat.()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkeAdjJoinPrefCard()->qkshtGetOptEstimate()->->
     1 ->kkocnp()->kkojnp()->kkejcd()->kkeAdjJoinPrefCard()->kghstack_alloc()->->
     1 ->kkocnp()->kkojnp()->kkeisj()->kkescl()->kkewat.()->->
     1 ->kkocnp()->kkojnp()->kkeisj()->->

Again from bottom of the upper section we see that all stack samples had a common stack backtrace from function kkooqb() all the way to the beginning of stack (the main() function).

Metalink note 175982.1 tells us that kko means Kernel Kompile Optimize and I guess kkooqb means kernel compile optimize Optimize Query Block. So there’s some query optimization (part of hard parsing process) going on. I was running a PL/SQL loop constantly hard parsing new SQL for that test case.

Note how this loop under kkooqb() looks to be “looser” or “not that tight loop” as there’s a large number of functions each having only 1 or few samples recorded. That’s just interesting information as the top section has identified the mail function call stack where all the process execution was all the time anyway. So, the top section is always the first part to look into and if that leaves the answer too open (e.g. if it shows opiexe() indicating that we are executing something) then it makes sens to look into bottom section profile to see if there are some functions where most of the time was spent.

Note that you should really use this tool only when the standard instrumentation (wait interface and v$sesstat performance counters) has failed to identify the issue. That’s the reason why the title of this blog entry is called “Advanced” Oracle Troubleshooting. Of course for learning purposes you can run this any time you want as long as you don’t do it in any production database.

As this tool is experimental and not a real fully developed and tested product, it has some issues (or rather ORADEBUG SHORT_STACK has).

Issues:

Sometimes you don’t see any functions at all or just a single, such __kernel_vsyscall(). This is when Oracle happened to be in a system all during all the samples and in some platforms, the short_stack command doesn’t return the full userland stack backtrace when the process is in a system call.

SQL> @ostackprof 136 0 10

-- oStackProf v1.00 - EXPERIMENTAL script by Tanel Poder ( http://www.tanelpoder.com )

WARNING! This script can crash the target process on Oracle 9.2 on Windows
and maybe other versions as well. Test this script out thorouhgly in your
dev environment first!
Hit CTRL+C to cancel, ENTER to continue...

Sampling...

Below is the stack prefix common to all samples:
------------------------------------------------------------------------
Frame->function()
------------------------------------------------------------------------
#  0 ->__kernel_vsyscall()
#  ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------
    10 ->->

The other related issue (below) is that if you have even a single “truncated” stack trace as above in all taken stacks, then obviously there will be no common subset of functions calls seen in all the samples (as all the other functions start with something like main() but there’s one __kernel_vsyscall()). Thus all the stacks are fully printed in bottom section of output and the output will be wide ( I hope your browser can handle it ;)

SQL> @ostackprof 132 0 10

-- oStackProf v1.00 - EXPERIMENTAL script by Tanel Poder ( http://www.tanelpoder.com )

WARNING! This script can crash the target process on Oracle 9.2 on Windows
and maybe other versions as well. Test this script out thorouhgly in your
dev environment first!
Hit CTRL+C to cancel, ENTER to continue...

Sampling...

Below is the stack prefix common to all samples:
------------------------------------------------------------------------
Frame->function()
------------------------------------------------------------------------
#  ...(see call profile below)
#
# -#--------------------------------------------------------------------
# - Num.Samples -> in call stack()
# ----------------------------------------------------------------------
     2 ->__kernel_vsyscall()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kpoal8()->opiexe()->kkxexe()->peicnt.()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_EXIM()->pevm_EXIM()->psdnal()->psddr0()->rpidrv.()->rpiswu2.()->rpidru()->skgmstack()->__PGOSF126_rpidrus()->opiodr()->opipls()->opiosq0()->kksParseCursor()->kkspsc0()->kksfbc()->kxsGetRuntimeLock()->kksLoadChild()->rpiswu2.()->kksParseChildCursor()->opiprs()->opiParse()->prsdrv()->prscmd()->qcpicmd()->qcpiqry()->qcpiqex()->qcpiqbk()->qcpisel()->qcpihnt()->qcplgnt()->qcplgte()->qcplgte_multibyte()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kpoal8()->opiexe()->kkxexe()->peicnt.()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_EXIM()->pevm_EXIM()->psdnal()->psddr0()->rpidrv.()->rpiswu2.()->rpidru()->skgmstack()->__PGOSF126_rpidrus()->opiodr()->opipls()->opiosq0()->kksParseCursor()->kkspsc0()->kksfbc()->kxsGetRuntimeLock()->kksLoadChild()->rpiswu2.()->kksFullTypeCheck()->opitca()->qkadrv2()->qkeAnalyzeAllExprs()->qkeaan()->qkerfex()->qkerf0ex()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kpoal8()->opiexe()->kkxexe()->peicnt.()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_EXIM()->pevm_EXIM()->psdnal()->psddr0()->rpidrv.()->rpiswu2.()->rpidru()->skgmstack()->__PGOSF126_rpidrus()->opiodr()->opipls()->opiosq0()->kksParseCursor()->kkspsc0()->kksfbc()->kxsGetRuntimeLock()->kksLoadChild()->rpiswu2.()->kksFullTypeCheck()->opitca()->qkadrv2()->qkadrv2Post()->qkaEnableWideTableSelect()->qkaRowShippingAnnotation()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kpoal8()->opiexe()->kkxexe()->peicnt.()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_EXIM()->pevm_EXIM()->psdnal()->psddr0()->rpidrv.()->rpiswu2.()->rpidru()->skgmstack()->__PGOSF126_rpidrus()->opiodr()->opipls()->opiosq0()->kksParseCursor()->kkspsc0()->kksfbc()->kxsGetRuntimeLock()->kksLoadChild()->rpiswu2.()->kksFullTypeCheck()->opitca()->opiDeferredSem()->opiSem()->kkqgbpdrv()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kpoal8()->opiexe()->kkxexe()->peicnt.()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_EXIM()->pevm_EXIM()->psdnal()->psddr0()->rpidrv.()->rpiswu2.()->rpidru()->skgmstack()->__PGOSF126_rpidrus()->opiodr()->opipls()->opiosq0()->kksParseCursor()->kkspsc0()->kksfbc()->kxsGetRuntimeLock()->kksLoadChild()->rpiswu2.()->kksFullTypeCheck()->opitca()->apadrv()->apaqbd.()->apaqbdListReverse()->apaqbdDescendents()->apakkoqb.()->kkoqbc.()->kkooqb()->kkoiqb()->qksshMemAllocPerm()->kghalp()->0AC59043->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kpoal8()->opiexe()->kkxexe()->peicnt.()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_EXIM()->pevm_EXIM()->psdnal()->psddr0()->rpidrv.()->rpiswu2.()->rpidru()->skgmstack()->__PGOSF126_rpidrus()->opiodr()->opipls()->opiosq0()->kksParseCursor()->kkspsc0()->kksfbc()->kxsGetRuntimeLock()->kksLoadChild()->qksceCopyEnv.()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kpoal8()->opiexe()->kkxexe()->peicnt.()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_EXIM()->pevm_EXIM()->psdnal()->psddr0()->rpidrv.()->rpiswu2.()->rpidru()->skgmstack()->__PGOSF126_rpidrus()->opiodr()->opipls()->opiosq0()->kksParseCursor()->kkspsc0()->kksfbc()->kxsGetRuntimeLock()->kksLoadChild()->qksceCopyEnv.()->->
     1 ->__libc_start_main()->main()->ssthrdmain()->opimai_real()->sou2o()->opidrv()->opiodr()->opiino()->opitsk()->ttcpip()->opiodr()->kpoal8()->opiexe()->kkxexe()->peicnt.()->plsql_run()->pfrrun()->pfrrun_no_tool()->pfrinstr_EXIM()->pevm_EXIM()->psdnal()->psddr0()->rpidrv.()->rpiswu2.()->rpidru()->skgmstack()->__PGOSF126_rpidrus()->opiodr()->opipls()->opiosq0()->kksParseCursor()->kkspsc0()->kksfbc()->kghalo()->kghprmalo()->kghfnd()->kghgex()->kghalo()->kghfnd()->kghfrunp()->->

SQL>

Anyway, this tool has made my life easier as I don’t always need to log on to OS and run some stack processing script when just testing something out. Also the oStackProf takes SID as parameter (and not SPID) so it saves a little manual lookup time.

Few notes:

1) This script requires the sqlplus client to be on Windows as it uses a VB script for post-processing the stack traces. Why VB and Windows? That’s the configuration most of my customers have for their workstations. And VB execution capability is installed there by default whereas everything else, including Perl, Python and AWK would require an additional download.

Why not do this on Unix as some of these tools are installed there by default? Again Windows is more convenient and sometimes it’s too time-consuming or impossible to get access to DB server at OS level (especially if you want to download stuff there).

2) This script is experimental! Read the notes in script header, on some platforms the oradebug short_stack crashes target process under investigation.

The installation:

1) Download ostackprof.sql from http://www.tanelpoder.com/files/scripts/ostackprof.sql
2) Download stack_helper.vbs from http://www.tanelpoder.com/files/scripts/stack_helper.vbs
3) Set your SQLPATH environment variable to the directory you downloaded the scripts
4) Run sqlplus
5) Run @ostackprof SID 0 10

I remind once more, this script is experimental and don’t use it on your production DB.

I still have an article in pipeline on how to interpret the stack trace right… I hope to complete it in a month or so as I’m extremely busy right now (I’ll be delivering my Advanced Oracle Troubleshooting seminar in 7 cities in November so if you want to see more of oStackProf, then sign up ;)

Happy stack sampling!

Note that this year’s only Advanced Oracle Troubleshooting class takes place in the end of April/May 2014, so sign up now if you plan to attend this year!

This entry was posted in Cool stuff, Oracle and tagged , , . Bookmark the permalink.

12 Responses to Advanced Oracle Troubleshooting Guide, Part 9 – Process stack profiling from sqlplus using OStackProf

  1. kurt says:

    I like the vbscript choice ;-) Many people seem to think using vbscript is below their standards and prefer the ‘see how clever I am writing this piece of unreadable perl script’.
    I always choose vbscript as my weapon on windows in favor of perl – for the reason you mention ;-)

  2. Thanks for Great Idea…

  3. tanelp says:

    @Kurt

    Yeah, VBScript is “the right tool for the right problem” in this case. No matter how clever the perl/python/awk script is, you won’t get around the fact that you likely need to install additional software on windows workstations to get it running.

    @Surachart
    Thanks!

  4. Thanks a lot Tanel – enjoyed your 3-day session at Düsseldorf/Germany a lot;

    Karl

  5. Tanel Poder says:

    Thanks Karl! Yeah, it was more of a 3 or day 4 session packed into two days ;-)

  6. coskan says:

    After UKOUG masterclass, I am proud to say that you are %100 genious. According to me, yours was the best technical presentation in Server Technologies category and I can easily say it was the longest end session, applause period I heard :)

    The way you add to many dimensions to my future troubleshooting experiences was amazing.
    Thank you for sharing your knowledge with the community.

    I hope I will be at your seminar in London, I really dont want to miss it.

  7. Pingback: Simple and stupid example on troubleshooting using callstack « Dion Cho – Oracle Performance Storyteller

  8. Pingback: Pure SQL version of ostakprof(from Tanel Poder) « Dion Cho – Oracle Performance Storyteller

  9. Jaimin says:

    Hi Tanel,

    While running ostackprof script, I am getting following error.

    SQL> @ostackprof 141 0 10

    – oStackProf v1.00 – EXPERIMENTAL script by Tanel Poder ( http://www.tanelpoder
    .com )

    WARNING! This script can crash the target process on Oracle 9.2 on Windows
    and maybe other versions/platforms as well. Test this script out thorouhgly
    in your dev environment first!
    Hit CTRL+C to cancel, ENTER to continue…

    Sampling…

    Below is the stack prefix common to all samples:
    ————————————————————————
    Frame->function()
    ————————————————————————
    D:\Oracle_10g_Instant_Client\Oracle_10g_Instant_Client\stack_helper.vbs(107, 9)
    Microsoft VBScript runtime error: Invalid procedure call or argument

    Please help me to resolve above mentioned error.

    Thanks
    Jaimin

  10. zhangqiaoc says:

    You should modify “SET TERMOUT OFF FEEDBACK OFF” to “SET TERMOUT OFF FEEDBACK OFF VERIFY OFF” in ostackprof.sql,line 46

  11. Hi Tanel,

    FYI, i stumbled upon a new MOS Note warning of an issue on HP with oradebug short_stack.

    Process Hangs After Issuing Oradebug Short_Stack on HP Platforms [ID 1271173.1]

    Obviously, there are patches for the issue.

    Regards,
    Martin

  12. Pingback: Advanced Oracle Troubleshooting Guide – Part 10: Index unique scan doing multiblock reads?! | Tanel Poder's blog: IT & Mobile for Geeks and Pros

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>