Read OS environment variables using DBMS_SYSTEM.GET_ENV()

Check out this article by Vikram Das about how to read OS environment variables using PL/SQL.

I did not know that!

I’m an Oracle ACE Director now :)

Many people have asked me that how come I’m not an Oracle ACE yet. From this week I am an Oracle ACE Director.

Here’s a link to my profile.

Its pretty big honor to be recognized by Oracle Corp at such high level.

I started working with Oracle software in 1997 and got really interested in that stuff after realizing how powerful the database was and how complex some internal details could be. Lots of opportunities to learn and learn I did!

I have been an Oracle Certified Master DBA for almost seven years now ( I took the exam in 2002 ). This is the highest certificate of technical skill Oracle gives out.

I’m also an OakTable Network member since 2004. In my eyes this is the highest recognition by industry peers an Oracle specialist can get. I’m very honored to be part of the team, not only because of the kick-ass knowledge everybody there has, but every member I’ve personally met is actually a cool person!

So, I think Oracle ACE status nicely completes the previous two “credentials” I had (and worked hard for!!!), OCM is the official technical skill recognition by Oracle, OakTable membership is the industry peer recognition and now the ACE director is Oracle’s recognition for my contributions so far.

I’d like to say thanks to Porus Havewala who recommended me for the ACE director program. Check out his blog if you’re interested in Enterprise Manager & such, he’s a Grid Control expert ( http://enterprise-manager.blogspot.com )

Ok, enough of ego-tripping for this year, back to work. I have a mutex contention troubleshooting article to finish ;-)

Oracle 11g: Reading alert log via SQL

Oracle has done some major improvements in the diagnosability infrastructure in version 11g. Here’s one little detail.

Before Oracle 11g it is possible to access the alert log via SQL using an external table or a pipelined function which in turn uses utl_file.
After reading the text you need to parse it to extract the information you need from there.

Starting from 11g Oracle does all this work for you. There is a fixed table X$DBGALERTEXT, when you query it, Oracle reads the log.xml from alert directory (which contains all the data what alert.log does), parses it and returns the details back as rows:

SQL> select message_text from X$DBGALERTEXT where rownum <= 20;

MESSAGE_TEXT
-----------------------------------------------------------------------------------------------------------------
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Shared memory segment for instance monitoring created
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_10 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
IMODE=BR
ILAT =18
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.7.0.
Using parameter settings in client-side pfile /u01/app/oracle/admin/LIN11G/pfile/init.ora on machine linux03
System parameters with non-default values:
  processes                = 150
  memory_target            = 404M
  control_files            = "/u01/oradata/LIN11G/control01.ctl"
  control_files            = "/u01/oradata/LIN11G/control02.ctl"
  control_files            = "/u01/oradata/LIN11G/control03.ctl"
  db_block_size            = 8192

20 rows selected.

This is the text representation, but you can get individual details from other columns as listed below:

[Read more...]

Another LatchProfX use case

Riyaj Shamsudeen wrote an excellent article about systematic latch contention troubleshooting.

Especially if the latch contention problem is ongoing, looking into system wide stats (like v$latch.sleep columns) is not the best idea in busy systems. This may sometimes lead you to fixing the wrong problem.

This is because sometimes the latch contention is not caused by some system wide inefficiency but rather by one or few sessions.

The right approach would be to measure the following things:

  1. Which latch (and exact child latches) the problem session is waiting for (query v$session_wait, ASH, sql_trace)
  2. Why is the problem session trying to get that latch so often (query v$sesstat counters for problem session and check execution plan if only single/few statements executed)
  3. Why this latch is busy: who’s holding it (query v$latchholder, LatchProf)
  4. Why is that someone holding the latch so much (query v$sesstat counters for that session or run LatchProfX)

Riyaj used this approach and successfully found out the troublemaker causing heavy library cache latch contention. He used my LatchProfX tool for part of the diagnosis (and I’m very happy to see that my advanced oracle troubleshooting tools find real world use in hands of other troubleshooters/tuners too!).

Here’s an excerpt from latchprofx output (from a little test case I put together), showing which session is holding which library cache latch how much, hold mode (shared/exclusive) and also the actual reason (function name) why the latch is held:

SQL> @latchprofx sid,name,laddr,ksllwnam,ksllwlbl,hmode 159 % 100000

-- LatchProfX 1.08 by Tanel Poder ( http://www.tanelpoder.com )

 SID NAME           LADDR            KSLLWNAM  KSLLWLBL  HMODE        Held  Gets  Held %  Held ms Avg hold ms
---- -------------- ---------------- --------- --------- ---------- ------ ----- ------- -------- -----------
 159 library cache  00000003A93513E0 kglic     child     exclusive   10044  9314   10.04  301.320        .032
 159 library cache  00000003A9351340 kglic     child     exclusive    9895  8458    9.90  296.850        .035
 159 library cache  00000003A9351660 kglic     child     exclusive    9761  8440    9.76  292.830        .035
 159 library cache  00000003A9351700 kglic     child     exclusive    9737  8924    9.74  292.110        .033
 159 library cache  00000003A9351480 kglic     child     exclusive    8999  7765    9.00  269.970        .035
 159 library cache  00000003A93515C0 kglic     child     exclusive    8553  7832    8.55  256.590        .033
 159 library cache  00000003A9351520 kglic     child     exclusive    6852  6828    6.85  205.560        .030

7 rows selected.

kglic means Kernel Generic Library cache Iterate Chain (AFAIK), it’s the function which is executed when you access most X$KGL tables. And this is where Riyaj got the evidence that the latch contention comes from inefficient scanning of library cache, caused by a session running queries against V$SQL_PLAN views with a bad plan.

So, check out Riyaj’s articele, my LatchProfX script, then my Advanced Oracle Troubleshooting Seminar :)

The real history of Oracle database revealed!

Anyone who’s looked into Oracle X$ tables, knows that their names are really complicated and quite unreadable (and non-pronouncable), such X$KZSRT, X$KCPXPL, X$KQFSZ and so on.

Few years ago at some conference someone came up with a thought that the reason why Oracle has so unreadable names for its X$ tables is that the leading edge database source code was actually stolen in the 80′s from a Soviet Union intelligence agency.

And evidence started appearing. For example, here’s an X$ table which clearly references USSR (in russian) below:

SQL> select name from v$fixed_table where upper(name) like '%CCCP%';

NAME
------------------------------
X$KCCCP

The above view has been there for many years (it’s hidden behind a layer of V$THREAD view so no-one would suspect anything).

And now, more evidence has showed up that even in the latest versions (11.1.0.7) Oracle just reuses code originally written by the Soviet intelligence agency decades ago:

SQL> select name from v$latch where upper(name) like '%KGB%';

NAME
----------------------------------------------------------------
kgb latch
kgb parent

So there’s apparently some KGB latch and KGB “parent” built in into every database from 11g.
I don’t know what exactly these do, but the code locations which make use of these latches make me worry the most:

SQL> select "WHERE" from v$latch_misses where parent_name = 'kgb latch';

WHERE
--------------------------------------------------------------------------
kgb_create_instance
kgb_destroy_instance
kgb_lock_instance

Scary….

Oracle, Timesten and PL/SQL support

I thought to post about another new interest of mine, TimestTen, as I’ve worked with it in past and I have become a fan of it, especially after Oracle bought the company.

Oracle has announced that TimesTen in-memory database will support PL/SQL in the upcoming release. That’s in 11gR2, where TimesTen is named the “in-memory database cache”.

I’m happy to see the deep level of integration Oracle is doing with it. It looks like both classic Oracle RDBMS and the TimesTen based code will have a (partially) shared PL/SQL code base.
There’s already a package called UTL_IDENT in Oracle 11.1.0.7 which stores couple variables used for conditional compilation depedent on the database product the packages are installed:

SQL> select dbms_metadata.get_ddl('PACKAGE', 'UTL_IDENT') from dual;

DBMS_METADATA.GET_DDL('PACKAGE','UTL_IDENT')
--------------------------------------------------------------------

  CREATE OR REPLACE PACKAGE "SYS"."UTL_IDENT" is

  /* A typical usage of these boolean constants is

         $if utl_ident.is_oracle_server $then
           code supported for Oracle Database
         $elsif utl_ident.is_timesten $then
           code supported for TimesTen Database
         $end
   */

  is_oracle_server     constant boolean := TRUE;
  is_oracle_client     constant boolean := FALSE;
  is_timesten          constant boolean := FALSE;

end utl_ident;
/

It looks that there will be at least some utility packages which share common code base with classic Oracle RDBMS and just use different code internally if needed, based on the UTL_IDENT variables.

This was a nice surprise as such approach promises to have full PL/SQL utilities in TimesTen cache and no discrepancy between the PL/SQL utility capabilities available in different platforms. So (I hope) there will be just one PL/SQL, not one PL/SQL version for Oracle and other for the cache.
Back before 9i days, it was quite annoying to hit the SQL interface discrepancies between PL/SQL and native SQL engines. So there’s hope that there are gonna be less porting issues between Oracle classic RDBMS and TimesTen cache (both for application developers and TimesTen coders as only low level interfaces need to be changed).

Well, at least in theory it’s like that… But I hope Oracle/TimesTen coders will do a good job with this as I’ve worked with some algorithmic trading apps doing extreme transaction processing and in this world you don’t even like wasted microseconds as this directly translates to lost trading opportunities and lost potential revenue. The legacy solutions for such apps are usually some in-memory data stores, often built in house and lots of C++ code around it for procedural work. The data is then loaded / dumped to some disk-based RDBMS asynchronously (as the multi-millisecond response times of a classic RDBMS take ages!!!)

I’m waiting to get my hands on the 11gR2 (as everyone who’s reading this blog I suspect :), but especially the integrated in-memory database cache engine as it promises to make the life easier for a lot of algorithmic trading systems developers (and also the ability adapt quicker to ever-changing markets and growing complexity of products for investment banks and trading houses).

The full power of Oracle’s diagnostic events, part 1: Syntax for KSD debug event handling

There’s a recent thread in Oracle-L about deadlocks and a recommendation to dump various instance information when the deadlock happens. A deadlock trace dumps some useful things automatically, but sometimes you want more, especially in RAC environment.

So is it possible to make Oracle dump additional things when the deadlock event happens? Yes it is and it’s doable with Oracle diagnostic event handling infrastructure.

First I’ll take a step back and explain, what this command below means:

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

Of course you know what it does, it enables extended SQL trace. But why such cumbersome syntax?

This syntax actually reveals some of the power of KSD diagnostic event syntax (KSD=kernel service debug):

10046

The first word in event string (the 10046) specifies the when some action should be taken.
And everything after that (trace name context forever, level 12) is that action.

trace

The first word in action (trace) specifies what type of action Oracle needs to take if that event (10046) occurs. By “event occurs” I mean that let say an Oracle parsing function calls ksdpec() function with 10046 as parameter, which in turn may recursively call some action set for that event and then returns that event level (12) back to the caller. Its up to the caller to act on that returned value, in 10046 case some ksd* tracing function is called then.

In event syntax, “trace” is most generic action, which is used for tracing, behavior changing and Oracle dumps. In this post I will concentrate on the “trace” action as it’s most common one to use.

name

The “name” specifies that the name of what to dump/trace will follow. The “name” is always present when “trace” option is used (as far as I know).

context

Now the next keyword (context) is the one where you can define whether you want Oracle to dump something when the event is hit or just do context specific trace. If you replace the “context” with “errorstack” for example, you wouldn’t get SQL trace output, but rather an Oracle errorstack dump whenever event 10046 is hit.

You can use “oradebug dumplist” to find all the possible dump commands what you can set as actions for an event.

forever

The next keyword (forever) is actually an option to the action, not an action keyword itself. Forever means that keep invoking the action when the 10046 event is hit, forever (or until explicitly disabled). If we don’t specify forever, then the action would be invoked only once and the event will disable itself after that.

level 12

The “level 12″ is also just another option to the action, specifying that the value for that event handler in given session should be 12. This means that whenever some Oracle function is checking whether that event is set, they will be returned value 12, the calling function interprets the value and acts appropriately (traces both binds and waits in our case).

As both the “forever” and “level 12″ are just options for the same action, separated by commas, we can replace their order like that:

[Read more...]