Oracle Troubleshooting: How to read Oracle ERRORSTACK output?!

I have written the first article to the troubleshooting section of my new website tech.E2SN.com:

It’s about a very valuable Oracle troubleshooting tool -> ERRORSTACK trace.

I cover 4 frequently asked questions there:

  1. Reading the current executing SQL statement text from errorstack trace
  2. Reading the current executing PL/SQL package and PL/SQL source code line number from errorstack trace
  3. Reading the current bind variable values from errostack trace
  4. Identifying how much private (UGA) memory a cursor is using

You can read it here:

By the way, if you like my new website, feel free to link to it !!! ;-)

Recursive sessions…

I have published a new article to tech.e2sn.com about recursive sessions and ORA-00018: maximum number of sessions exceeded error message:

http://tech.e2sn.com/oracle/oracle-internals-and-architecture/recursive-sessions-and-ora-00018-maximum-number-of-sessions-exceeded

Note that I’m working on setting up RSS feed for tech.e2sn too, coming soon :)

Sometimes things are easy (Part 1): How to fix wrapped execution plan text?

What you see below is a common problem. Someone sends you (or posts to a forum) a wide execution plan, which is unreadable because of wrapped lines. For example, this one below:

--------------------------------------------------------------------------------
-------------------

| Id  | Operation                   | Name                    | E-Rows |  OMem |
 1Mem | Used-Mem |

--------------------------------------------------------------------------------
-------------------

|   0 | SELECT STATEMENT            |                         |        |       |
 |          |

|   1 |  SORT AGGREGATE             |                         |      1 |       |
 |          |

|*  2 |   HASH JOIN                 |                         |     13 |  1102K|
 1102K|  355K (0)|

|*  3 |    HASH JOIN                |                         |     13 |   988K|
 988K|  367K (0)|

|*  4 |     HASH JOIN               |                         |     13 |   921K|
 921K|  621K (0)|

|*  5 |      HASH JOIN OUTER        |                         |     13 |   836K|
 836K| 1224K (0)|

|*  6 |       HASH JOIN             |                         |     13 |   821K|
 821K|  501K (0)|

|*  7 |        HASH JOIN            |                         |     13 |  1102K|
 1102K|  501K (0)|

|   8 |         MERGE JOIN CARTESIAN|                         |      1 |       |
 |          |

|*  9 |          TABLE ACCESS FULL  | PROFILE$                |      1 |       |
 |          |

|  10 |          BUFFER SORT        |                         |      1 | 73728 |
 73728 |          |

|* 11 |           TABLE ACCESS FULL | PROFILE$                |      1 |       |
 |          |

|* 12 |         TABLE ACCESS FULL   | USER$                   |     36 |       |
 |          |

|  13 |        TABLE ACCESS FULL    | PROFNAME$               |      1 |       |
 |          |

|* 14 |       TABLE ACCESS FULL     | RESOURCE_GROUP_MAPPING$ |      1 |       |
 |          |

|  15 |      TABLE ACCESS FULL      | TS$                     |      7 |       |
 |          |

|  16 |     TABLE ACCESS FULL       | TS$                     |      7 |       |
 |          |

|  17 |    TABLE ACCESS FULL        | USER_ASTATUS_MAP        |      9 |       |
 |          |

--------------------------------------------------------------------------------
-------------------

So now you either try to manually edit and fix the execution plan text so you could read it or ask the developer to send the execution plan again. Both approaches take time.

Well, sometimes things are easy – in this particular case I saved the above into a file called /tmp/x and ran the following command:

$ cat /tmp/x | awk '{ printf "%s", $0 ; if (NR % 3 == 0) print } END { print }'
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                    | E-Rows |  OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                         |        |       | |          |
|   1 |  SORT AGGREGATE             |                         |      1 |       | |          |
|*  2 |   HASH JOIN                 |                         |     13 |  1102K| 1102K|  355K (0)|
|*  3 |    HASH JOIN                |                         |     13 |   988K| 988K|  367K (0)|
|*  4 |     HASH JOIN               |                         |     13 |   921K| 921K|  621K (0)|
|*  5 |      HASH JOIN OUTER        |                         |     13 |   836K| 836K| 1224K (0)|
|*  6 |       HASH JOIN             |                         |     13 |   821K| 821K|  501K (0)|
|*  7 |        HASH JOIN            |                         |     13 |  1102K| 1102K|  501K (0)|
|   8 |         MERGE JOIN CARTESIAN|                         |      1 |       | |          |
|*  9 |          TABLE ACCESS FULL  | PROFILE$                |      1 |       | |          |
|  10 |          BUFFER SORT        |                         |      1 | 73728 | 73728 |          |
|* 11 |           TABLE ACCESS FULL | PROFILE$                |      1 |       | |          |
|* 12 |         TABLE ACCESS FULL   | USER$                   |     36 |       | |          |
|  13 |        TABLE ACCESS FULL    | PROFNAME$               |      1 |       | |          |
|* 14 |       TABLE ACCESS FULL     | RESOURCE_GROUP_MAPPING$ |      1 |       | |          |
|  15 |      TABLE ACCESS FULL      | TS$                     |      7 |       | |          |
|  16 |     TABLE ACCESS FULL       | TS$                     |      7 |       | |          |
|  17 |    TABLE ACCESS FULL        | USER_ASTATUS_MAP        |      9 |       | |          |
---------------------------------------------------------------------------------------------------

All I did here was that I stripped out line feeds from all lines except every 3rd line (which is the real end of the original line).

Note that if your linesize is very wide (and trimspool/trimout settings are ON) then this script would need some adjustment…

I’m sure this trivial approach doesn’t work in all situations, but with this article I wanted to illustrate that sometimes things which seem hard can be made much easier with a little scripting knowledge. If you are thinking which technology you should learn next – then better check out a Perl, Python or some shell+AWK book :)

By the way, if you want real flexibility displaying your execution plans (from library cache), then check this out:

http://blog.tanelpoder.com/2009/05/26/scripts-for-showing-execution-plans-via-plain-sql-and-also-in-oracle-9i/

Explain Plan For command may show you the wrong execution plan – Part 1

In Oracle-L mailing list a question was asked about under which conditions can the explain plan report a wrong execution plan (not the one which was actually used when a problem happened).

I replied this, but thought to show an example test case of this problem too:

1) The optimizer statistics the EXPLAIN PLAN ends up using are different
from the statistics the other session ended up using

2) Explain plan does not use bind variable peeking thus will not optimize
for current bind variable values

3) Explain plan treats all bind variables as VARCHAR2, thus you ma have
implicit datatype conversion happening during the plan execution, (meaning
to_char,to_number functions are added around variables/columns) and this for
example may make optimizer to ignore some indexes if you get unlucky.

…Of course explain plan doesn’t really
execute the plan so the implicit datatype conversion you see is in the
explained plan only, but if you actually execute the statement (with correct
bind datatypes) then there’s no implicit datatype conversion. And that’s
where the difference comes from…

And here comes an example of condition number 3 above. Lets use a little bit of bad design out there and put numeric values into varchar2 columns:

SQL> create table t (id varchar2(10), name varchar2(100));

Table created.

SQL> insert into t select to_char(object_id), object_name from dba_objects;

51449 rows created.

Now we add a little index for lookup performance and gather stats:

SQL> create index i on t(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'T',cascade=>true);

PL/SQL procedure successfully completed.

Now lets define a bind variable of NUMBER type and set a value for it:

SQL> var x number
SQL>
SQL> exec :x:=99999

PL/SQL procedure successfully completed.

Now lets use “explain plan for” to estimate the execution plan:

SQL> explain plan for
 2  select sum(length(name)) from t where id >  :x;

Explained.

SQL> select * from table(dbms_xplan.display) ;

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
Plan hash value: 3694077449

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |    29 |    56   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |      |     1 |    29 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T    |  2572 | 74588 |    56   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I    |   463 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 3 - access("ID">:X)

15 rows selected.

Explain plan command nicely reports that we’d be using an index range scan, which would be a good thing to do given my test data and search condition.

Now lets actually run the statement and see the REAL execution plan actually used for the execution. I’ll use dbms_xplan.display_CURSOR for this. If you don’t pass SQL_ID/child into that function it will just report the last SQL statement executed in your current session. But the key difference between the dbms_xplan.DISPLAY and DISPLAY_CURSOR is that the latter goes to library cache and fetches the actual SQL plan used from there. The explain plan command just reparses the statement and estimates a plan, ignoring any bind variable values and assuming that all bind variables are of type varchar2:

SQL> select sum(length(name)) from t where id >  :x;

SUM(LENGTH(NAME))
-----------------

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------
SQL_ID  7zm570j6kj597, child number 0
-------------------------------------
select sum(length(name)) from t where id >  :x

Plan hash value: 2966233522

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    60 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    29 |            |          |
|*  2 |   TABLE ACCESS FULL| T    |  2572 | 74588 |    60   (5)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - filter(TO_NUMBER("ID")>:X)

19 rows selected.

Whatta? We actually used a full table scan!

Also check out a related article by Kerry Osborne

Detect chained and migrated rows in Oracle – Part 1

I received a question about migrated rows recently.

It was about how to detect migrated rows in a 200TB data warehouse, with huge tables – as the ANALYZE TABLE xyz LIST CHAINED ROWS INTO command can not be automatically parallelized at table level (as DBMS_STATS can be, but oh, DBMS_STATS doesn’t gather the migrated/chained row info). Therefore the analyze command would pretty much run forever before returning (and committing) the chained row info in the output table. Also as there are regular maintenance jobs running on these tables (I suspect partition maintentance for example), then it wouldn’t be nice to keep running ANALYZE on the whole table constantly.

So, is there any faster or better way for finding the amount of migrated rows?

Ihave two answers to this.

Answer 1:

As we are dealing with a huge 200+ TB data warehouse its tables/indexes are most likely partitioned. Thus you could use the ANALYZE TABLE xyz PARTITION (abc) LIST CHAINED ROWS command to analyze individual partitions, even in parallel (sqlplus sessions) if you like. This would allow you to focus only on the partitions of interest (the latest ones, with the heaviest activity perhaps).

SQL> create table CHAINED_ROWS (
2    owner_name         varchar2(30),
3    table_name         varchar2(30),
4    cluster_name       varchar2(30),
5    partition_name     varchar2(30),
6    subpartition_name  varchar2(30),
7    head_rowid         rowid,     -- actual chained row's head piece address in the segment
8    analyze_timestamp  date
9  );
Table created.
SQL>
SQL> analyze table tmp partition (sys_p501) list chained rows; -- the default table name used for output is "CHAINED_ROWS"
Table analyzed.
SQL> analyze table tmp partition (sys_p502) list chained rows;
Table analyzed.
SQL> select partition_name, count(*) from chained_rows group by partition_name;
PARTITION_NAME                   COUNT(*)
------------------------------ ----------
SYS_P502                              252
SYS_P501                             5602

SQL>

So, from above you see its possible to find out partition (or even sub-partition level row chaining).

However this above command lists you both CHAINED rows and MIGRATED rows (even though Oracle calls them all chained rows internally, as the chaining mechanism is the same for both cases).

Chained row is a row which is too large to fit into a block, so will always have to be split between multiple different blocks – with an exception of intra-block chaining which is used for rows with more than 255 columns. Migrated row on the other hand is a row which has been updated larger than it initially was – and if as a result it doesn’t fit into its original block, the row itself is moved to a new block, but the header (kind of a stub pointer) of the row remains in original location. This is needed so that any indexes on the table would still be able to find that row using original ROWIDs stored in them). If Oracle didn’t leave the row head piece in place then it would always go and update all indexes which have the ROWID of the migrating row in them.

Why should we care whether a row is a real chained row or just a migrated row?

It’s because if the row is chained, then any reorgs would not help you – if a row is too big to fit into a block, its too big to fit into a block no matter how many times you move around the table. (Note that if you have large tables full of rows longer than 8KB there’s likely something wrong with your design).

But migrated rows on the other hand are “chained” into another block due some update which made them not fit into existing block anymore. This happens when PCTFREE is set too low compared to real row growth factor and sometimes you may want to fix it by reorganizing the table/partition with ALTER TABLE/PARTITION MOVE or by backing the rows up, deleting them and reinserting them back to the table (that one makes sense when only a small amount of rows in a table are migrated).

If you are completely sure that you don’t have any rows longer than the free space in an empty block (thus all individual rows would fit into a block and would need to be split among multiple blocks) then you can conclude that all the rows reported were migrated due their growth.

Another option would be to query out all or a sample of these chained/migrated rows and actually measure how long they are if all columns are put together. This could be done using vsize() function (or also dump() and lengthb() in some cases). Of course the column and row header overhead would need to be accounted in as well.

So, this already gets pretty complex and there are more tiny details which we should take into account… thus I will introduce another way to look into the row migration/chaining thing:

Answer 2: (Alternatively called “should we care?”)

[Read more...]

What’s a good way to learn some Oracle internals every day?

Sometimes when an attendee describes me some totally weird problem during a seminar, I am immediately able to answer something like “Hey this looks like a bug related to this Oracle configuration and can be influenced by xyz”.

And then people ask me “How the hell do you know all this stuff?”

Well, I haven’t been bitten by all of these bugs myself, but I have been doing something for many years, almost every day… reading my email!

Oh, and additionally I have configured Metalink to send me daily updates about new/updated notes, forum articles and… bug descriptions!

The last part is very important. Bug descriptions tell you something about new bugs found (and old bugs rediscovered) and sometimes their details tell you an interesting piece or two about Oracle internals related to them.

[Read more...]

KGL simulator, shared pool simulator and buffer cache simulator – what are these?

If you have queried v$sgastat on recent Oracle versions (by which I mean 9i and above) you probably have seen allocations for some sort of simulators in Oracle instance. Here’s an example:

SQL> select * from v$sgastat where lower(name) like '%sim%' order by name;

POOL         NAME                            BYTES
------------ -------------------------- ----------
shared pool  kglsim alloc latch area          1700
shared pool  kglsim alloc latches               68
shared pool  kglsim count of pinned he        9248
shared pool  kglsim free heap list             204
shared pool  kglsim free obj list              204
shared pool  kglsim hash table                4104
shared pool  kglsim hash table bkts        2097152
shared pool  kglsim heap                    635536
shared pool  kglsim latch area                1700
shared pool  kglsim latches                     68
shared pool  kglsim main lru count           87040
shared pool  kglsim main lru size           174080
shared pool  kglsim object batch            909440
shared pool  kglsim pin list arr               816
shared pool  kglsim recovery area             2112
shared pool  kglsim sga                      22092
shared pool  kglsim size of pinned mem       18496
shared pool  ksim client list                   84
shared pool  log_simultaneous_copies           480
shared pool  sim cache nbufs                   640
shared pool  sim cache sizes                   640
shared pool  sim kghx free lists                 4
shared pool  sim lru segments                  640
shared pool  sim segment hits                 1280
shared pool  sim segment num bufs              640
shared pool  sim state object                   48
shared pool  sim trace buf                    5140
shared pool  sim trace buf context             120
shared pool  sim_knlasg                       1200
shared pool  simulator hash buckets          16512
shared pool  simulator latch/bucket st        4608

31 rows selected.

See, a bunch of “kgl sim” and then just “sim” allocations.

… or sometimes you can see latch contention on following latches:

SQL> select name from v$latch where name like '%sim%';

NAME
-------------------------------------------------------
ksim membership request latch
simulator lru latch
simulator hash latch
sim partition latch
shared pool simulator
shared pool sim alloc

6 rows selected.

Again, there seems to be some “simulation” work going on in Oracle instance.

So what are these simulators about?

[Read more...]

Alter system kill session and ORA-00031: session marked for kill

I’m sure you are all familiar with this situation:

SQL> alter system kill session '152,33';
alter system kill session '152,33'
*
ERROR at line 1:
ORA-00031: session marked for kill

The session trying to issue the kill will hang for 60 seconds and then return this “session marked for kill” message. And the target session does not get killed at all.

So why is that?

The issue is in what this alter system kill command is doing. It’s not actually killing the target session (like kill -9 would do for OS processes). It just sets a bit in the target sessions state object, which marks that the target session should end. But its entirely up the target session to check this bit and act on it!

So, intead of ALTER SYSTEM KILL SESSION, the command should look something like ALTER SYSTEM ASK SESSION TO COMMIT SUICIDE.

All the kill session command is doing is ASK the target session to clean up and exit – via setting that bit.

Now, normally the target sessions are nice and check that bit often enough in their code, act on it and die.

But sometimes when the target session happens to be busy looping in some tight loop (due a bug perhaps) or is hung, then it never gets to check that “please die” bit and never exits.

[Read more...]

When was a table last changed?

I frequently get a question about how to find out when was a table last modified. I’m talking about table data, not table structure, the latter would be detectable from dba_objects.last_ddl_time.

Unless you have some table level DML auditing already turned on, then as one option you could use LogMiner and “just” work through the redo/archivelogs in the range of interest with it. This could be very time consuming (especially if the last change was done a while back), so here’s another option for getting last table modification info very easily, however it comes with some restrictions.

http://www.tanelpoder.com/files/scripts/lastchanged.sql

(read instructions and limitations from the script header).

The idea is following:

  1. Oracle has an ORA_ROWSCN pseudocolumn which reports the last known change time for a row in a table. The “time” shows a commit SCN number of last transaction modifying the row, not a real timestamp though. It is important to note that unless the ROWDEPENDECIES are enabled, then the last SCN is known only at data block level, not row level, rowscn’s for all rows in a block would report whatever SCN is in the last change SCN in block header.
  2. SCN is ever-increasing internal “time” used by Oracle recovery and transaction layers and it is possible to map this to real time with reasonable accuracy using few Oracle’s tables which store SCN to wallclock time mappings. My script reports the time range in which the last change to datablock/row occurred so you’ll know how accurate it is.
    I use sys.smon_scn_time and v$log_history views for SCN to real time mapping. So my script reports two times, first one may be more accurate, second one has longer history on the other hand.You can use whatever other datasource for doing this mapping, as long as it has SCNs and corresponding timestamps in it. For example, if you have log_checkpoints_to_alert parameter set to true, you can grep the SCN/timestamp pairs out with command like this:

    cat alert_win10g.log | egrep -e "^Beginning.*checkpoint|[[:alpha:]]{3} [[:alpha:]]{3} [[:digit:]]{2} "
    
    

Here are few usage examples.

First I’ll check some data which was probably last changed a long time ago:

SQL> @lastchanged sys.obj$ name='DBMS_STANDARD'

-- LastChanged.sql v1.0 by Tanel Poder ( http://www.tanelpoder.com )

Running this query:

.   select MAX(ora_rowscn)
.   from sys.obj$
.   where name='DBMS_STANDARD';

DATA_SOURCE       LAST_CHANGED
----------------- -------------------------------------------------------
sys.smon_scn_time Before  2008-12-31 16:05:25 (earlier than 21 days ago)
v$log_history     Before  2008-10-27 03:58:16 (earlier than 86 days ago)

2 rows analyzed.

As both sys.smon_scn_time and v$log_history don’t have records dating back to that old time when the database was created (when DBMS_STANDARD and other object records in that datablock were created), they just show that the change happened before the oldest SCN to time mapping they have.

Here’s a check on a regular table, but we are only interested in rows/datablocks where the “sal” column is bigger than 1000:

[Read more...]

Identifying shared memory segment users using lsof

Lsof (list open files) is a really useful tool for troubleshooting open file decriptors which prevent a deleted file from being released or a shared memory segment from being removed.

Here’s a little situation on Linux where an Oracle shared memory segment was not released as someone was still using it.

$ ipcs -ma

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 393216     oracle    640        289406976  1          dest
0xbfb94e30 425985     oracle    640        289406976  18
0x3cf13430 557058     oracle    660        423624704  22

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0xe2260ff0 1409024    oracle    640        154
0x9df96b74 1671169    oracle    660        154

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

The bold line should have disappeared after instance shutdown, but it didn’t. From “natcch” (number of attached processes) column I see there is still some process using the shared memory segment. Thus the segment was not released and even ipcrm command did not remove it (just like with normal files if someone has them open).

So, I needed to identify which process was still using the memory segment. If that had been a normal existing file, I’d could have used /sbin/fuser command to see which process still holds it open, but this only works for existing files with existing directory entries.

However for deleted files, sockets and shared memory segments, you can use lsof command (it’s normally installed by default on Linux, but for Unixes you need to separately download and install).

The SHM ID of that segment was 393216 as ipcs -ma showed, so I simply run lsof to show all open file descriptors and grep for that SHM ID:

$ lsof | egrep "393216|COMMAND"
COMMAND     PID      USER   FD      TYPE     DEVICE       SIZE       NODE NAME
python    18811    oracle  DEL       REG        0,8                393216 /SYSVbfb94e30

See how the NODE column corresponds to SHM ID in ipcs output.

So I kill the PID 18811 which is still attached to the SHM segment:

$ kill 18811

$ ipcs -ma

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0xbfb94e30 425985     oracle    640        289406976  18
0x3cf13430 557058     oracle    660        423624704  25

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0xe2260ff0 1409024    oracle    640        154
0x9df96b74 1671169    oracle    660        154

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages


Now the shared memory segment is gone and its memory released.

Note that the lsof command is very useful for many other tasks as well. For example it allows you to list open sockets by network protocol, IP, port etc. For example you can determine to which client some server process is talking to, from OS level:

$ lsof -i:1521
COMMAND   PID   USER   FD   TYPE DEVICE SIZE NODE NAME
tnslsnr  6212 oracle   11u  IPv4  49486       TCP *:1521 (LISTEN)
tnslsnr  6212 oracle   13u  IPv4 276708       TCP linux03:1521->linux03:37277 (ESTABLISHED)
tnslsnr  6212 oracle   14u  IPv4 264894       TCP linux03:1521->linux03:41122 (ESTABLISHED)
oracle  22687 oracle   20u  IPv4 264893       TCP linux03:41122->linux03:1521 (ESTABLISHED)
oracle  25250 oracle   15u  IPv4 276707       TCP linux03:37277->linux03:1521 (ESTABLISHED)
oracle  25530 oracle   15u  IPv4 279910       TCP linux03:1521->192.168.247.1:nimsh (ESTABLISHED)

Unfortunately lsof is not installed by default in classic Unixes, but in some shops the sysadmins have chosen to install it. But even then, it may not work for regular users as lsof requires access to kernel memory structures through /dev/kmem or similar. If you can’t get access to lsof then there may be other tools available which can do some tricks lsof can do. For example on Solaris, there’s an useful command pfiles which can show open files of a process and since Solaris 9 ( I think ) it can also report the TCP connection endpoints of network sockets…