There’s a recent thread in Oracle-L about deadlocks and a recommendation to dump various instance information when the deadlock happens. A deadlock trace dumps some useful things automatically, but sometimes you want more, especially in RAC environment.
So is it possible to make Oracle dump additional things when the deadlock event happens? Yes it is and it’s doable with Oracle diagnostic event handling infrastructure.
First I’ll take a step back and explain, what this command below means:
SQL> alter session set events '10046 trace name context forever, level 12'; Session altered.
Of course you know what it does, it enables extended SQL trace. But why such cumbersome syntax?
This syntax actually reveals some of the power of KSD diagnostic event syntax (KSD=kernel service debug):
The first word in event string (the 10046) specifies the when some action should be taken.
And everything after that (trace name context forever, level 12) is that action.
The first word in action (trace) specifies what type of action Oracle needs to take if that event (10046) occurs. By “event occurs” I mean that let say an Oracle parsing function calls ksdpec() function with 10046 as parameter, which in turn may recursively call some action set for that event and then returns that event level (12) back to the caller. Its up to the caller to act on that returned value, in 10046 case some ksd* tracing function is called then.
In event syntax, “trace” is most generic action, which is used for tracing, behavior changing and Oracle dumps. In this post I will concentrate on the “trace” action as it’s most common one to use.
The “name” specifies that the name of what to dump/trace will follow. The “name” is always present when “trace” option is used (as far as I know).
Now the next keyword (context) is the one where you can define whether you want Oracle to dump something when the event is hit or just do context specific trace. If you replace the “context” with “errorstack” for example, you wouldn’t get SQL trace output, but rather an Oracle errorstack dump whenever event 10046 is hit.
You can use “oradebug dumplist” to find all the possible dump commands what you can set as actions for an event.
The next keyword (forever) is actually an option to the action, not an action keyword itself. Forever means that keep invoking the action when the 10046 event is hit, forever (or until explicitly disabled). If we don’t specify forever, then the action would be invoked only once and the event will disable itself after that.
The “level 12″ is also just another option to the action, specifying that the value for that event handler in given session should be 12. This means that whenever some Oracle function is checking whether that event is set, they will be returned value 12, the calling function interprets the value and acts appropriately (traces both binds and waits in our case).
As both the “forever” and “level 12″ are just options for the same action, separated by commas, we can replace their order like that:
SQL> alter session set events '10046 trace name context level 12, forever'; Session altered.
The outcome is exactly the same, tracing is enabled at level 12, for forever.
Also we can leave both options out to make Oracle use default values ( invoke the action only once and tracing level is set to 1 ):
SQL> alter session set events '10046 trace name context'; Session altered.
There are other options available in addition to just “forever”. If we want the action to be invoked let say 10000 times only and then automatically disable that event, we can use the LIFETIME option:
SQL> alter session set events '10046 trace name context level 12, lifetime 10000'; Session altered.
And if we want to ignore that event for some occurrences and start acting only after that, we can also specify AFTER x OCCURRENCES option to tell Oracle not to act on first x occurrences of that event and only activate itself after that:
SQL> alter session set events '10046 trace name context level 12, lifetime 10000, after 5000 occurrences'; Session altered.
You see I have combined 3 options to the action, level, lifetime and after 5000 occurrences. They can be specified in any order and need to be separated by commas (only if there are multiple options to an action, the first option must not be preceded by comma).
So far I spoke about 10046 event which is used for tracing, but there are other uses for diagnostic events.
Every ORA- error code in Oracle checks for an event with the same number. In fact Oracle diagnostic events and ORA- errors share the same range of numbers from 0 to 65535 for their codes (you can see most of them in $ORACLE_HOME/rdbms/mesg/oraus.msg file)
So, for example ORA-60 which is “deadlock detected” error will also check for event number 60 and if it is set, it will invoke its action(s) before sending the deadlock exception up the application stack.
For example, I could invoke hanganalyze at level 4 when ORA-60 is hit in my session:
SQL> alter session set events '60 trace name hanganalyze level 4'; Session altered.
As mentioned previously, you can run “oradebug dumplist” to see all actions available in your Oracle version (of course be careful when setting any undocumented events as some can even corrupt your database).
Sometimes we want more than just one type of dump when an error occurs. So it is possible to invoke multiple trace actions for a specific event. You need to separate these actions with semicolons.
SQL> alter session set events '60 trace name hanganalyze_global ; name systemstate level 266'; Session altered.
In the example above I tell Oracle to invoke a RAC global hang analyze and a systemstate dump with short stacks of processes ( level 256 + 10 = 266 ) whenever an ORA-60 deadlock error is raised.
It is possible to specify different options such level, event lifetime etc at each action level, for example we might not want to dump systemstate multiple times if the deadlock keeps reoccurring:
SQL> alter session set events '60 trace name hanganalyze_global level 4, forever; - > name systemstate level 266, lifetime 1; - > name processstate level 2, forever' 2 / Session altered.
So far I’ve defined the events upon which to take action using their numbers. Event 60 (which corresponds to ORA-00060 error message) is the “deadlock detected while waiting for resource” error message.
10046 on the other hand is not an actual error, but just a condition which is checked in few locations in Oracle kernel code (such OPI parse, execute, fetch and wait interface functions). Whenever these functions run, they check whether an event number 10046 is enabled for current process or session and if yes, then take appropriate action (which normally means tracing).
Note that it is possible to see most ORA- error messages and tracing/debug event numbers in $ORACLE_HOME/rdbms/admin/oraus.msg file on Unix platforms. Never set an event without well understanding what this event is doing and what do you want to achieve with it (and ideally you should have blessing from Oracle support/Metalink for using it).
A few numeric events have also text aliases. For example, instead of using ORA-60 as deadlock event definition, you could run the following statement:
SQL> alter session set events 'deadlock trace name hanganalyze_global'; Session altered.
Few more useful text aliases are:
- PARSE_SQL_STATEMENT – corresponds to event 10035 which allows to write parse failures to alert.log file
- SORT_END – corresponds to event 10033, “sort statistics (SOR*)”
- SORT_RUN – corresponds to event 10032, “sort run information (SRD*/SRS*)”
So, armed with the knowledge for setting events, we can define fairly complex conditions:
For example we can use the following statement to follow some memory allocation of a sort operation. We take level 29 heapdumps (PGA+UGA+Call heaps) every sort pass (sort_run event) and also in the end of a sort operation (sort_end event). And also we want to take a workareatab_dump when sort ends but only during the first sort pass (thus the lifetime 1 condition).
SQL> alter session set events ' - > sort_end trace name workareatab_dump level 3, forever; - > name heapdump level 29, forever: - > sort_run trace name workareatab_dump level 3, lifetime 1; - > name heapdump level 29, forever' 2 / Session altered.
Note that the lifetime and after x occurrences conditions are accounted at event level in session or process state, not statement execution level.
So if event’s lifetime expires during some statement execution, then that event will be disabled until manually initialized with another alter session command.
There is also a special named event – IMMEDIATE. This means that don’t wait for some event to happen in the future, but execute the required action immediately, right now.
SQL> alter session set events 'immediate trace name processstate level 10'; Session altered.
This would invoke a process state dump immediately. It’s the same as running ORADEBUG DUMP PROCESSSTATE 10 command.
Few more interesting bits:
Event syntax can understand numbers in hex as well if you prefix them with “0x”:
SQL> alter session set events '0x273E trace name context forever, level 0xC'; Session altered.
Also, in 11g you can run following command too:
SQL> alter session set events '10046 off'; Session altered.
The above clears any set actions for that event.
Before 11g you can either explicitly set all event handlers to off for an event or you can use the “all” qualifier:
SQL> alter session set events '60 trace name all off'; Session altered.
This would clear all tracing actions for that event in session. However crash and debug actions would still remain (more about these later).
That should be enough for part 1, the syntax of KSD debug events. In next part I will demo couple examples where the advanced event syntax is useful. Also, there’s a reason why I keep mentioning KSD diagnostic events, there’s another diagnostic event handling mechanism in Oracle as well and Oracle has introduced pretty big improvements in 11g, but more about this in the future ;-)