Check out this article by Vikram Das about how to read OS environment variables using PL/SQL.
I did not know that!
Many people have asked me how come I’m not an Oracle ACE yet. Well, as of this week I am an Oracle ACE Director.
Here’s a link to my profile.
It’s a pretty big honor to be recognized by Oracle Corp at such a high level.
I started working with Oracle software in 1997 and got really interested in that stuff after realizing how powerful the database was and how complex some internal details could be. Lots of opportunities to learn and learn I did!
I have been an Oracle Certified Master DBA for almost seven years now (I took the exam in 2002). This is the highest certificate of technical skill that Oracle gives out.
I’ve also been an OakTable Network member since 2004. In my eyes this is the highest recognition by industry peers that an Oracle specialist can get. I’m very honored to be part of the team, not only because of the kick-ass knowledge everybody there has, but also because every member I’ve personally met is actually a cool person!
So, I think the Oracle ACE status nicely completes the two “credentials” I already had (and worked hard for!): OCM is the official technical skill recognition by Oracle, OakTable membership is the industry peer recognition, and now the ACE Director title is Oracle’s recognition of my contributions so far.
I’d like to say thanks to Porus Havewala, who recommended me for the ACE Director program. Check out his blog if you’re interested in Enterprise Manager and such, he’s a Grid Control expert ( http://enterprise-manager.blogspot.com ).
Ok, enough of ego-tripping for this year, back to work. I have a mutex contention troubleshooting article to finish ;-)
Oracle has done some major improvements in the diagnosability infrastructure in version 11g. Here’s one little detail.
Before Oracle 11g it was possible to access the alert log via SQL using an external table, or a pipelined function which in turn used utl_file.
After reading the text you still had to parse it yourself to extract the information you needed.
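For example, the pre-11g external table approach could look something like this (a rough sketch: the directory path, instance name and field-parsing parameters are assumptions, adjust them for your environment):

```sql
-- Sketch only: directory path and alert log file name are assumptions;
-- point the directory at your BACKGROUND_DUMP_DEST and use your instance name.
create directory alert_dir as '/u01/app/oracle/admin/LIN11G/bdump';

create table alert_log (
  text_line varchar2(4000)
)
organization external (
  type oracle_loader
  default directory alert_dir
  access parameters (
    records delimited by newline
    nobadfile nologfile nodiscardfile
    fields missing field values are null
    ( text_line char(4000) )
  )
  location ('alert_LIN11G.log')
)
reject limit unlimited;

-- Then grep for errors the SQL way:
select text_line from alert_log where text_line like 'ORA-%';
```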
Starting from 11g, Oracle does all this work for you. There is a fixed table, X$DBGALERTEXT: when you query it, Oracle reads log.xml from the alert directory (which contains all the data that alert.log does), parses it and returns the details back as rows:
SQL> select message_text from X$DBGALERTEXT where rownum <= 20;

MESSAGE_TEXT
-----------------------------------------------------------------------------------------------------------------
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Shared memory segment for instance monitoring created
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_10 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
IMODE=BR
ILAT =18
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.6.0.
Using parameter settings in client-side pfile /u01/app/oracle/admin/LIN11G/pfile/init.ora on machine linux03
System parameters with non-default values:
  processes                = 150
  memory_target            = 404M
  control_files            = "/u01/oradata/LIN11G/control01.ctl"
  control_files            = "/u01/oradata/LIN11G/control02.ctl"
  control_files            = "/u01/oradata/LIN11G/control03.ctl"
  db_block_size            = 8192

20 rows selected.
This is the text representation, but you can also get the individual details (timestamp, component, severity and so on) from the other columns of this fixed table.
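For instance, a quick sketch (column names as seen on my 11.1 test instance; X$ table structures are undocumented and version-dependent, so check yours first):

```sql
-- Recent error messages from the alert log, with timestamps, via plain SQL.
-- Column names observed in 11.1; verify against your version before relying on them.
select originating_timestamp, message_type, message_level, message_text
from   x$dbgalertext
where  message_text like '%ORA-%'
order  by originating_timestamp;
```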
Riyaj Shamsudeen wrote an excellent article about systematic latch contention troubleshooting.
Especially if the latch contention problem is ongoing, looking at system-wide stats (like the v$latch sleep columns) is not the best idea on busy systems. This may sometimes lead you to fixing the wrong problem, because sometimes the latch contention is caused not by some system-wide inefficiency but rather by one or a few sessions.
The right approach would be to measure the following things:
Riyaj used this approach and successfully found the troublemaker causing heavy library cache latch contention. He used my LatchProfX tool for part of the diagnosis (and I’m very happy to see my advanced Oracle troubleshooting tools finding real-world use in the hands of other troubleshooters and tuners too!).
Here’s an excerpt from the LatchProfX output (from a little test case I put together), showing which session is holding which library cache latch, how much of the time, in which mode (shared/exclusive) and also the actual reason (the function name) why the latch is held:
SQL> @latchprofx sid,name,laddr,ksllwnam,ksllwlbl,hmode 159 % 100000

-- LatchProfX 1.08 by Tanel Poder ( http://www.tanelpoder.com )

 SID NAME           LADDR            KSLLWNAM  KSLLWLBL  HMODE       Held  Gets  Held %  Held ms  Avg hold ms
---- -------------- ---------------- --------- --------- ---------- ----- ----- ------- -------- -----------
 159 library cache  00000003A93513E0 kglic     child     exclusive  10044  9314   10.04  301.320         .032
 159 library cache  00000003A9351340 kglic     child     exclusive   9895  8458    9.90  296.850         .035
 159 library cache  00000003A9351660 kglic     child     exclusive   9761  8440    9.76  292.830         .035
 159 library cache  00000003A9351700 kglic     child     exclusive   9737  8924    9.74  292.110         .033
 159 library cache  00000003A9351480 kglic     child     exclusive   8999  7765    9.00  269.970         .035
 159 library cache  00000003A93515C0 kglic     child     exclusive   8553  7832    8.55  256.590         .033
 159 library cache  00000003A9351520 kglic     child     exclusive   6852  6828    6.85  205.560         .030

7 rows selected.
kglic stands for Kernel Generic Library cache Iterate Chain (AFAIK); it’s the function which is executed when you access most X$KGL tables. And this is where Riyaj got the evidence that the latch contention came from inefficient scanning of the library cache, caused by a session running queries against V$SQL_PLAN views with a bad plan.
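If you don’t have LatchProfX at hand, the same underlying idea, sampling who holds latches instead of reading system-wide counters, can be crudely approximated by repeatedly querying v$latchholder (a sketch only; a real profiler samples in a tight loop and aggregates the results, which is what LatchProfX does):

```sql
-- One sample of the current latch holders. Run this repeatedly and
-- aggregate by sid/name to see which session holds which latch most often.
select sid, laddr, name, gets
from   v$latchholder;
```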
Anyone who’s looked into Oracle X$ tables knows that their names are really complicated and quite unreadable (and unpronounceable), such as X$KZSRT, X$KCPXPL, X$KQFSZ and so on.
A few years ago at some conference, someone came up with the theory that the reason Oracle has such unreadable names for its X$ tables is that the leading-edge database source code was actually stolen in the ’80s from a Soviet Union intelligence agency.
And evidence started appearing. For example, here’s an X$ table which clearly references the USSR (in Russian):
SQL> select name from v$fixed_table where upper(name) like '%CCCP%';

NAME
------------------------------
X$KCCCP
The above table has been there for many years (it’s hidden behind a layer of the V$THREAD view, so that no-one would suspect anything).
And now more evidence has shown up that even in the latest versions (11.1.0.7) Oracle just reuses code originally written by the Soviet intelligence agency decades ago:
SQL> select name from v$latch where upper(name) like '%KGB%';

NAME
----------------------------------------------------------------
kgb latch
kgb parent
So there’s apparently some KGB latch and a KGB “parent” latch built into every database from 11g onwards.
I don’t know what exactly these do, but the code locations which make use of these latches worry me the most:
SQL> select "WHERE" from v$latch_misses where parent_name = 'kgb latch';

WHERE
--------------------------------------------------------------------------
kgb_create_instance
kgb_destroy_instance
kgb_lock_instance
I thought I’d post about another interest of mine, TimesTen, as I’ve worked with it in the past and have become a fan of it, especially after Oracle bought the company.
Oracle has announced that TimesTen in-memory database will support PL/SQL in the upcoming release. That’s in 11gR2, where TimesTen is named the “in-memory database cache”.
I’m happy to see the deep level of integration Oracle is doing with it. It looks like both classic Oracle RDBMS and the TimesTen based code will have a (partially) shared PL/SQL code base.
There’s already a package called UTL_IDENT in Oracle 11.1.0.7 which stores a couple of variables used for conditional compilation, dependent on which database product the packages are installed in:
SQL> select dbms_metadata.get_ddl('PACKAGE', 'UTL_IDENT') from dual;

DBMS_METADATA.GET_DDL('PACKAGE','UTL_IDENT')
--------------------------------------------------------------------

  CREATE OR REPLACE PACKAGE "SYS"."UTL_IDENT" is

  /* A typical usage of these boolean constants is
       $if utl_ident.is_oracle_server $then
         code supported for Oracle Database
       $elsif utl_ident.is_timesten $then
         code supported for TimesTen Database
       $end
  */

  is_oracle_server constant boolean := TRUE;
  is_oracle_client constant boolean := FALSE;
  is_timesten      constant boolean := FALSE;

end utl_ident;
/
It looks like there will be at least some utility packages which share a common code base with the classic Oracle RDBMS and just use different code internally where needed, based on the UTL_IDENT variables.
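To illustrate how such shared code could branch at compile time, here’s a hypothetical procedure using PL/SQL conditional compilation against the UTL_IDENT constants (the procedure name and messages are made up for illustration, this is not actual Oracle code):

```sql
-- Hypothetical example: compile-time branching on the database product.
create or replace procedure where_am_i is
begin
  $if utl_ident.is_oracle_server $then
    dbms_output.put_line('Running inside the classic Oracle RDBMS');
  $elsif utl_ident.is_timesten $then
    dbms_output.put_line('Running inside TimesTen');
  $else
    dbms_output.put_line('Running somewhere else');
  $end
end;
/
```

The dead branches are eliminated at compile time, so the same source compiles cleanly on either product.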
This was a nice surprise, as such an approach promises full PL/SQL utilities in the TimesTen cache and no discrepancy between the PL/SQL utility capabilities available on the different platforms. So (I hope) there will be just one PL/SQL, not one version for Oracle and another for the cache.
Back in the pre-9i days it was quite annoying to hit the SQL interface discrepancies between the PL/SQL and native SQL engines. So there’s hope that there will be fewer porting issues between the classic Oracle RDBMS and the TimesTen cache (both for application developers and TimesTen coders, as only the low-level interfaces need to be changed).
Well, at least in theory it’s like that… But I hope the Oracle/TimesTen coders will do a good job with this. I’ve worked with some algorithmic trading apps doing extreme transaction processing, and in that world you don’t like wasting even microseconds, as this translates directly into lost trading opportunities and lost potential revenue. The legacy solutions for such apps are usually in-memory data stores, often built in-house, with lots of C++ code around them for procedural work. The data is then loaded/dumped to some disk-based RDBMS asynchronously (as the multi-millisecond response times of a classic RDBMS take ages!).
I’m waiting to get my hands on 11gR2 (as is everyone reading this blog, I suspect :), but especially on the integrated in-memory database cache engine, as it promises to make life easier for a lot of algorithmic trading systems developers (and also to help investment banks and trading houses adapt quicker to ever-changing markets and the growing complexity of products).
There’s a recent thread on Oracle-L about deadlocks and a recommendation to dump various instance information when a deadlock happens. A deadlock trace dumps some useful things automatically, but sometimes you want more, especially in a RAC environment.
So, is it possible to make Oracle dump additional things when the deadlock event happens? Yes it is, and it’s doable with Oracle’s diagnostic event handling infrastructure.
First I’ll take a step back and explain what the command below means:
SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.
Of course you know what it does: it enables extended SQL trace. But why such a cumbersome syntax?
This syntax actually reveals some of the power of KSD diagnostic event syntax (KSD=kernel service debug):
The first word in the event string (the 10046) specifies when some action should be taken.
And everything after that (trace name context forever, level 12) is that action.
The first word in the action (trace) specifies what type of action Oracle needs to take when that event (10046) occurs. By “event occurs” I mean that, let’s say, an Oracle parsing function calls the ksdpec() function with 10046 as a parameter, which in turn may recursively call whatever action is set for that event and then returns the event level (12) back to the caller. It’s up to the caller to act on that returned value; in the 10046 case some ksd* tracing function is called then.
In the event syntax, “trace” is the most generic action; it is used for tracing, behavior changing and Oracle dumps. In this post I will concentrate on the “trace” action as it’s the most common one to use.
The “name” keyword specifies that the name of what to dump/trace will follow. The “name” is always present when the “trace” action is used (as far as I know).
Now the next keyword (context) is the one where you define whether you want Oracle to dump something when the event is hit, or just do context-specific tracing. If you replaced “context” with “errorstack”, for example, you wouldn’t get SQL trace output, but rather an Oracle errorstack dump whenever event 10046 is hit.
You can use “oradebug dumplist” to see all the possible dump commands that you can set as actions for an event.
The next keyword (forever) is actually an option to the action, not an action keyword itself. Forever means: keep invoking the action whenever the 10046 event is hit, forever (or until explicitly disabled). If we didn’t specify forever, the action would be invoked only once and the event would disable itself after that.
The “level 12” part is also just another option to the action, specifying that the value for that event handler in the given session should be 12. This means that whenever some Oracle function checks whether that event is set, it will be returned the value 12; the calling function interprets the value and acts accordingly (in our case, tracing both binds and waits).
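By the way, you can read back the level currently set for an event in your own session with dbms_system.read_ev (a quick sketch; dbms_system is an undocumented package and execute rights on it normally have to be granted by SYS):

```sql
-- Check what level event 10046 is currently set to in this session.
-- dbms_system is undocumented; use at your own risk.
declare
  lev binary_integer;
begin
  sys.dbms_system.read_ev(10046, lev);
  dbms_output.put_line('event 10046 is set at level ' || lev);
end;
/
```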
As both “forever” and “level 12” are just options for the same action, separated by commas, we can swap their order like this:

SQL> alter session set events '10046 trace name context level 12, forever';
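And to come back to the deadlock question which started this post: with the same syntax you can attach a dump action to the deadlock event itself, so extra diagnostics get written out whenever ORA-00060 is raised. A sketch (pick the dump types and levels appropriate for your case; systemstate dumps in particular can be very expensive on large instances):

```sql
-- Dump an errorstack (with process state) whenever this session
-- hits a deadlock (event 60 = ORA-00060).
alter session set events '60 trace name errorstack level 3';

-- Or, heavier: a systemstate dump on deadlock.
-- Can be very expensive on busy instances; use with care.
alter session set events '60 trace name systemstate level 266';
```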