Archive

Archive for the ‘Unix/Linux’ Category

Sometimes things are easy (Part 1): How to fix wrapped execution plan text?

January 18th, 2010

What you see below is a common problem. Someone sends you (or posts to a forum) a wide execution plan, which is unreadable because of wrapped lines. For example, this one below:

--------------------------------------------------------------------------------
-------------------

| Id  | Operation                   | Name                    | E-Rows |  OMem |
 1Mem | Used-Mem |

--------------------------------------------------------------------------------
-------------------

|   0 | SELECT STATEMENT            |                         |        |       |
 |          |

|   1 |  SORT AGGREGATE             |                         |      1 |       |
 |          |

|*  2 |   HASH JOIN                 |                         |     13 |  1102K|
 1102K|  355K (0)|

|*  3 |    HASH JOIN                |                         |     13 |   988K|
 988K|  367K (0)|

|*  4 |     HASH JOIN               |                         |     13 |   921K|
 921K|  621K (0)|

|*  5 |      HASH JOIN OUTER        |                         |     13 |   836K|
 836K| 1224K (0)|

|*  6 |       HASH JOIN             |                         |     13 |   821K|
 821K|  501K (0)|

|*  7 |        HASH JOIN            |                         |     13 |  1102K|
 1102K|  501K (0)|

|   8 |         MERGE JOIN CARTESIAN|                         |      1 |       |
 |          |

|*  9 |          TABLE ACCESS FULL  | PROFILE$                |      1 |       |
 |          |

|  10 |          BUFFER SORT        |                         |      1 | 73728 |
 73728 |          |

|* 11 |           TABLE ACCESS FULL | PROFILE$                |      1 |       |
 |          |

|* 12 |         TABLE ACCESS FULL   | USER$                   |     36 |       |
 |          |

|  13 |        TABLE ACCESS FULL    | PROFNAME$               |      1 |       |
 |          |

|* 14 |       TABLE ACCESS FULL     | RESOURCE_GROUP_MAPPING$ |      1 |       |
 |          |

|  15 |      TABLE ACCESS FULL      | TS$                     |      7 |       |
 |          |

|  16 |     TABLE ACCESS FULL       | TS$                     |      7 |       |
 |          |

|  17 |    TABLE ACCESS FULL        | USER_ASTATUS_MAP        |      9 |       |
 |          |

--------------------------------------------------------------------------------
-------------------

So now you either try to manually edit and fix the execution plan text so you could read it or ask the developer to send the execution plan again. Both approaches take time.

Well, sometimes things are easy – in this particular case I saved the above into a file called /tmp/x and ran the following command:

$ cat /tmp/x | awk '{ printf "%s", $0 ; if (NR % 3 == 0) print } END { print }'
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                    | E-Rows |  OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                         |        |       | |          |
|   1 |  SORT AGGREGATE             |                         |      1 |       | |          |
|*  2 |   HASH JOIN                 |                         |     13 |  1102K| 1102K|  355K (0)|
|*  3 |    HASH JOIN                |                         |     13 |   988K| 988K|  367K (0)|
|*  4 |     HASH JOIN               |                         |     13 |   921K| 921K|  621K (0)|
|*  5 |      HASH JOIN OUTER        |                         |     13 |   836K| 836K| 1224K (0)|
|*  6 |       HASH JOIN             |                         |     13 |   821K| 821K|  501K (0)|
|*  7 |        HASH JOIN            |                         |     13 |  1102K| 1102K|  501K (0)|
|   8 |         MERGE JOIN CARTESIAN|                         |      1 |       | |          |
|*  9 |          TABLE ACCESS FULL  | PROFILE$                |      1 |       | |          |
|  10 |          BUFFER SORT        |                         |      1 | 73728 | 73728 |          |
|* 11 |           TABLE ACCESS FULL | PROFILE$                |      1 |       | |          |
|* 12 |         TABLE ACCESS FULL   | USER$                   |     36 |       | |          |
|  13 |        TABLE ACCESS FULL    | PROFNAME$               |      1 |       | |          |
|* 14 |       TABLE ACCESS FULL     | RESOURCE_GROUP_MAPPING$ |      1 |       | |          |
|  15 |      TABLE ACCESS FULL      | TS$                     |      7 |       | |          |
|  16 |     TABLE ACCESS FULL       | TS$                     |      7 |       | |          |
|  17 |    TABLE ACCESS FULL        | USER_ASTATUS_MAP        |      9 |       | |          |
---------------------------------------------------------------------------------------------------

All I did here was that I stripped out line feeds from all lines except every 3rd line (which is the real end of the original line).

Note that if your linesize is very wide (and trimspool/trimout settings are ON) then this script would need some adjustment…

I’m sure this trivial approach doesn’t work in all situations, but with this article I wanted to illustrate that sometimes things which seem hard can be made much easier with a little scripting knowledge. If you are thinking which technology you should learn next – then better check out a Perl, Python or some shell+AWK book :)

By the way, if you want real flexibility displaying your execution plans (from library cache), then check this out:

http://blog.tanelpoder.com/2009/05/26/scripts-for-showing-execution-plans-via-plain-sql-and-also-in-oracle-9i/

  • Share/Bookmark

Tanel Poder Administration, Oracle, Productivity, Tools, Unix/Linux

Identifying shared memory segment users using lsof

January 22nd, 2009

Lsof (list open files) is a really useful tool for troubleshooting open file decriptors which prevent a deleted file from being released or a shared memory segment from being removed.

Here’s a little situation on Linux where an Oracle shared memory segment was not released as someone was still using it.

$ ipcs -ma

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 393216     oracle    640        289406976  1          dest
0xbfb94e30 425985     oracle    640        289406976  18
0x3cf13430 557058     oracle    660        423624704  22

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0xe2260ff0 1409024    oracle    640        154
0x9df96b74 1671169    oracle    660        154

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

The bold line should have disappeared after instance shutdown, but it didn’t. From “natcch” (number of attached processes) column I see there is still some process using the shared memory segment. Thus the segment was not released and even ipcrm command did not remove it (just like with normal files if someone has them open).

So, I needed to identify which process was still using the memory segment. If that had been a normal existing file, I’d could have used /sbin/fuser command to see which process still holds it open, but this only works for existing files with existing directory entries.

However for deleted files, sockets and shared memory segments, you can use lsof command (it’s normally installed by default on Linux, but for Unixes you need to separately download and install).

The SHM ID of that segment was 393216 as ipcs -ma showed, so I simply run lsof to show all open file descriptors and grep for that SHM ID:

$ lsof | egrep "393216|COMMAND"
COMMAND     PID      USER   FD      TYPE     DEVICE       SIZE       NODE NAME
python    18811    oracle  DEL       REG        0,8                393216 /SYSVbfb94e30

See how the NODE column corresponds to SHM ID in ipcs output.

So I kill the PID 18811 which is still attached to the SHM segment:

$ kill 18811

$ ipcs -ma

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0xbfb94e30 425985     oracle    640        289406976  18
0x3cf13430 557058     oracle    660        423624704  25

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0xe2260ff0 1409024    oracle    640        154
0x9df96b74 1671169    oracle    660        154

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

Now the shared memory segment is gone and its memory released.

Note that the lsof command is very useful for many other tasks as well. For example it allows you to list open sockets by network protocol, IP, port etc. For example you can determine to which client some server process is talking to, from OS level:

$ lsof -i:1521
COMMAND   PID   USER   FD   TYPE DEVICE SIZE NODE NAME
tnslsnr  6212 oracle   11u  IPv4  49486       TCP *:1521 (LISTEN)
tnslsnr  6212 oracle   13u  IPv4 276708       TCP linux03:1521->linux03:37277 (ESTABLISHED)
tnslsnr  6212 oracle   14u  IPv4 264894       TCP linux03:1521->linux03:41122 (ESTABLISHED)
oracle  22687 oracle   20u  IPv4 264893       TCP linux03:41122->linux03:1521 (ESTABLISHED)
oracle  25250 oracle   15u  IPv4 276707       TCP linux03:37277->linux03:1521 (ESTABLISHED)
oracle  25530 oracle   15u  IPv4 279910       TCP linux03:1521->192.168.247.1:nimsh (ESTABLISHED)

Unfortunately lsof is not installed by default in classic Unixes, but in some shops the sysadmins have chosen to install it. But even then, it may not work for regular users as lsof requires access to kernel memory structures through /dev/kmem or similar. If you can’t get access to lsof then there may be other tools available which can do some tricks lsof can do. For example on Solaris, there’s an useful command pfiles which can show open files of a process and since Solaris 9 ( I think ) it can also report the TCP connection endpoints of network sockets…

  • Share/Bookmark

Tanel Poder Administration, Networking, Oracle, Tools, Troubleshooting, Unix/Linux

Reliable latch waits and a new blog

January 20th, 2009

Here’s a link to Alex Fatkulin’s blog if you haven’t seen it already: http://afatkulin.blogspot.com/

He has some good Oracle internals information in there, I also like his research style.

Alex just blogged about a finding (on Oracle 11g on Linux) that when Oracle process doesn’t get a latch after spinning, it goes to sleep using semop() system call, which never wakes up unless this semaphore is posted by another process. From past versions we remember that Oracle processes go to sleep for a short period of time, wake up, try to get the latch and sleep again for a longer period of time if unsuccessful (up to _max_exponential_sleep centiseconds). This kind of sleeping with timeout is done using semtimedop() syscall on Linux.

Also, for some latches the latch waiter posting was available. If a process failed to get a latch, it put a pointer to its state object into the waiter list for that latch and went to sleep for some centiseconds. If the latch holder released this latch, it scanned through a waiter list for that latch and posted the waiters, so that they would not have to sleep until the end of this x centisecond sleep. This used to be controllable using _latch_wait_posting parameter, but since 9i this parameter has been removed and most latches do have wait posting enabled by default.

With semaphores and posting, on most Unixes there have been problems in past with missed posts/wakeups, sometimes due bugs, sometimes just due the implementation of signals and semaphore operations in Unix kernel. So that’s why the past Oracle versions always have some kind of timeout for latch sleeps (and most enqueue sleeps and buffer busy waits as well, as a matter of fact). If a process manages to miss the wakeup call, it will wake up after some timeout anyway. Performance suffers, but at least the process won’t hang infinitely. And this was achieved using semtimedop() systemcall (on Linux).

So, how come Alex saw just semop() calls in his test?

The answer is that apparently the minimum required Linux kernel used for Oracle 10.2+ does support reliable posting using semaphores, so Oracle is taking use of this.

There is a new parameter called _enable_reliable_latch_waits in 10.2+ and (at least) on Linux it is true. When it’s true, Oracle trusts that all wakeup calls (through semaphores) are received by the sleeping processes, thus there’s no need to periodically wake up to check whether the semaphore has become “available”.

Here’s a little test case:
Read more…

  • Share/Bookmark

Tanel Poder Internals, Oracle, Performance, Unix/Linux

Oracle memory troubleshooting, Part 1: Heapdump Analyzer

January 2nd, 2009

When troubleshooting Oracle process memory issues like ORA-4030’s or just excessive memory usage, you may want to get a detailed breakdown of PGA, UGA and Call heaps to see which component in there is the largest one.

The same goes for shared pool memory issues and ORA-4031’s – sometimes you need to dump the shared pool heap metadata for understanding what kind of allocations take most of space in there.

The heap dumping can be done using a HEAPDUMP event, see http://www.juliandyke.com/Diagnostics/Dumps/Dumps.html for syntax.

NB! Note that when dumping SGA heaps (like shared, large, java and streams pools), your process holds shared pool latches for the entire dump duration so this should be used only as a last resort in busy production instances. Dumping a big shared pool could hang your instance for quite some time. Dumping private process heaps is safer as that way only the target process is affected.

The heapdump output file structure is actually very simple, all you need to look at is the HEAP DUMP header to see in which heap the following chunks of memory belong (as there may be multiple heaps dumped into a single tracefile).

HEAP DUMP heap name="sga heap(1,1)"  desc=04EA22D0
 extent sz=0xfc4 alt=108 het=32767 rec=9 flg=-125 opc=0
 parent=00000000 owner=00000000 nex=00000000 xsz=0x400000
EXTENT 0 addr=20800000
  Chunk 20800038 sz=   374904    free      "               "
  Chunk 2085b8b0 sz=      540    recreate  "KGL handles    "  latch=00000000
  Chunk 2085bacc sz=      540    recreate  "KGL handles    "  latch=00000000
  Chunk 2085bce8 sz=     1036    freeable  "parameter table"
  Chunk 2085c0f4 sz=     1036    freeable  "parameter table"
  Chunk 2085c500 sz=     1036    freeable  "parameter table"
  Chunk 2085c90c sz=     1036    freeable  "parameter table"
  Chunk 2085cd18 sz=     1036    freeable  "parameter table"
  Chunk 2085d124 sz=      228    recreate  "KGL handles    "  latch=00000000
  Chunk 2085d208 sz=      228    recreate  "KGL handles    "  latch=00000000
  Chunk 2085d2ec sz=      228    recreate  "KGL handles    "  latch=00000000
  Chunk 2085d3d0 sz=      228    recreate  "KGL handles    "  latch=00000000
  Chunk 2085d4b4 sz=      228    recreate  "KGL handles    "  latch=00000000
  Chunk 2085d598 sz=      540    recreate  "KQR PO         "  latch=2734AA00
  Chunk 2085d7b4 sz=      540    recreate  "KQR PO         "  latch=2734AA00
  Chunk 2085d9d0 sz=      228    recreate  "KGL handles    "  latch=00000000
...

The first list of chunks after HEAP DUMP (the list above) is the list of all chunks in the heap. There are more lists such as freelists and LRU lists in a regular heap, but lets ignore those for now, I’ll write more about heaps in an upcoming post.

After identifying heap name from HEAP DUMP line, you can see all individual chunks from the “Chunk” lines. The second column after Chunk shows the start address of a chunk, sz= means chunk size, the next column shows the type of a chunk (free, freeable, recreate, perm, R-free, R-freeable).

The next column is important one for troublehsooting, it shows the reason why a chunk was allocated (such KGL handles for library cache handles, KGR PO for dictionary cache parent objects etc). Every chunk in a heap has a fixed 16 byte area in the chunk header which stores the allocation reason (comment) of a chunk. Whenever a client layer (calling a kghal* chunk allocation function) allocates heap memory, it needs to pass in a comment up to 16 bytes and it’s stored in the newly allocated chunk header.

This is a trivial technique for troubleshooting memory leaks and other memory allocation problems. When having memory issues you can just dump all the heap’s chunks sizes and aggregate these by allocation reason/comment. That would show you the biggest heap occupier and give further hints where to look next.

As there can be lots of chunks in large heaps, aggregating the data manually would be time consuming (and boring). Here’s a little shell script which can summarize Oracle heapdump output tracefile contents for you:

Read more…

  • Share/Bookmark

Tanel Poder Internals, Oracle, Tools, Troubleshooting, Unix/Linux

Advanced Oracle Troubleshooting Guide, Part 6: Understanding Oracle execution plans with os_explain

June 15th, 2008

Get ready for some more adventures in Oracle process stack!

Before proceeding though, please read this post about safety of different stack sampling approaches.

I have had few non-trivial Oracle troubleshooting cases, related to query hangs and bad performance, where I’ve wanted to know where exactly in execution plan the current execution is.
Remember, Oracle is just another program executing instructions clustered in functions on your server, so stack sampling can help out here as well.

So, I was looking into the following stack trace taken from an Oracle 10.1 database on Solaris SPARC, running a SQL with this execution plan.

Read more…

  • Share/Bookmark

Tanel Poder Cool stuff, Internals, Oracle, Performance, SQL, Tools, Troubleshooting, Unix/Linux

Oracle hidden costs revealed, part 1 – Does a batch job run faster when executed locally?

February 5th, 2008

This series is about revealing some Oracle’s internal execution costs and inefficiencies. I will analyze few situations and special cases where you can experience a performance hit where you normally wouldn’t expect to.

The first topic is about a question I saw in a recent Oracle Forum thread.

The question goes like this: “Is there any benefit if I run long sql queries from the server (by using telnet,etc) or from the remote by sql client.”

In order to leave out the network transfer cost of resultset for simplicity, I will rephrase the question like that: “Do I get better performance when I execute my server-side batch jobs (which don’t return any data to client) locally from the database server versus a remote application server or workstation?”

The obvious answer would be “NO, it does not matter where from you execute your batch job, as Oracle is a client server database system. All execution is done locally regardless of the client’s location, thus the performance is the same”.

While this sounds plausible in theory, there is (at least) one practical issue which can affect Oracle server performance depending on the clients platform and client libaries version.

It is caused by regular in-band break checking in client server communication channel where out of band break signalling is not available. A test case is below:

Read more…

  • Share/Bookmark

Tanel Poder Internals, Performance, Troubleshooting, Unix/Linux

Systematic application troubleshooting in Unix

January 5th, 2008

How many times have you seen a following case, where a user or developer complains that their Oracle session is stuck or running very slowly and the person who starts investigating the issue does following:

  1. Checks the database for locks
  2. Checks free disk space
  3. Checks alert log
  4. Goes back to the client saying “we did a healthcheck and everything looks ok” and closes the case or asks the user/developer to contact application support team or tune their SQL

The point here is that what the heck do the database locks, alert log or disk space have to do with first round session troubleshooting, when Oracle provides just about everything you need in one simple view?

Yes, I am talking about sampling V$SESSION_WAIT here. Database locks, free space and potential errors in alert log may have something to do with your users problems, but not necessarily. As there are many more causes, like network issues etc which could affect your user (and the whole database), it doesn’t make sense to go through all those random “healthchecks” every time you receive a user phone call. Moreover, even if you identify that there is shortage of disk space or there are many database locks – so what? They may not have anything to do with the users problem.

The issue here is that still many people do not know about V$SESSION_WAIT which in most cases shows your problem immediately or at least points you to right direction (e.g. there’s no need to check for locks if your session is waiting on “log file switch (archiving needed)” wait – and vice versa). Even if “these people” have heard of V$SESSION_WAIT and may be able to drop this in during their job interview, they may not know how to use it in systematic troubleshooting context. Many hours of service downtime and user frustration would be saved if all DBAs knew this extremely simple concept of looking at V$SESSION_WAIT.

This blog entry is not about Oracle though, so I will leave this rant for a future blog post.

This post is about a similar problem in Unix world. Having been involved with resolving some serious production issues lately I have been surprised quite many times by the corporate Unix support people who seem to do behave in similar manner. For example, there is a user calling in saying that their scheduled Unix job, which normally takes 5 minutes, has been running for hours now. The “senior unix support analyst” will do following:

Read more…

  • Share/Bookmark

Tanel Poder Troubleshooting, Unix/Linux

Advanced Oracle Troubleshooting Guide, Part 3: More adventures in process stack

September 6th, 2007

…or rather thread stack as nowadays decent operating systems execute threads (or tasks as they’re called in Linux kernel).

Anyway, stack trace gives you the ultimate truth on what your program is doing, exactly right now. There are couple of but’s like stack corruptions and missing symbol information which may make the traces less useful for us, but for detailed hang & performance troubleshooting the stack traces are a goldmine.

So, I present another case study – how to diagnose a complete database hang when you can’t even log on to the database.

Read more…

  • Share/Bookmark

Tanel Poder Internals, Oracle, Performance, Tools, Troubleshooting, Unix/Linux

Operating systems are lazy allocating memory

August 28th, 2007

There was a discussion about whether Oracle really allocates all memory for SGA immediately on instance startup or not. And further, whether Oracle allocates memory beyond the SGA_TARET if SGA_MAX_SIZE is larger than it.
It’s worth reading this thread first: http://forums.oracle.com/forums/thread.jspa?threadID=535400&tstart=0

I will paste an edited version of my reply to here as well:

Read more…

  • Share/Bookmark

Tanel Poder Internals, Oracle, Unix/Linux, Windows

Advanced Oracle Troubleshooting Guide, Part 2: No magic is needed, systematic approach will do

August 27th, 2007

There are two ways for diagnosing problems:

  1. Checking for usual suspects and hoping to find a matching one
  2. Following a systematic approach

Checking for usual suspects and hoping to find a matching one

The first approach relies on previous experience (both in particular subject area/technology and about the context/environment the problem occurs). For example if a patient comes to doctor complaining about pain in chest, then for doctor (and also for the patient) it would definitely be beneficial to know more relevant info about the patient – the context. If the patient had just fell off a 10-foot ladder, then it’d be more suitable to look for broken ribs. On the other hand, if the patient has been a long-time smoker and was watching TV on a couch when the pain started, then perhaps it’d be more suitable to start with an EKG (note that I’m not an expert on how human body works so should anybody complain about any pain in their chest to you, send them to real doctor immediately!)

Anyway, if you’ve been administering a database full time for last 5 years, you will probably know where to look immediately when a specific problem occurs. Note that I wrote “where to look immediately” here, not “what to change immediately”. Using previous experience to identify root causes of problems is obviously a perfectly valid approach which may get you to the solution very fast (again and again) – but it can work well only if you do have lots of previous experience in solving problems for that technology and that particular environment (in other words, you know the context). The big risk here is that if a “new” problem expresses itself in similar symptoms like the “old-and-well-known-problem”, we could easily end up looking for and fixing the wrong issue. And if that doesn’t work, continuing to try out a solution which helped with another problem last year. And if that doesn’t help then the cycle continues, we dig up even more unlikely fixes which have been useful once in past and we apply them. And then we resort to googling and trying out whatever solutions anyone has suggested for problems others have experienced.

We have ended up in Desperate Switch Flipping state. Usually this leads to flipping even larger switches, starting from Oracle session/instance parameters and adding random SQL hints up to restarting servers, upgrading databases, operating systems, hardware – usually without any luck. This wastes time, doesn’t solve our problem and may cause even more trouble. This is not good. Checking out the usual suspects may help solving common recurring problems (hey, shouldn’t good specialists avoid recurring problems rather than fixing them again and again?) . However it is very important to draw the line between checking for usual suspects and falling into DSF state, as looks like happened to someone at Oracle-L.

Quoting:

“I ran Statspack reports at the highest level of detail until I was blue in the face. I ran traces. I set events. But I also am by nature intuitive and tend often to use intuition to solve a problem with facts to back up my intuitive conclusion. So after providing all of this stuff to Oracle Support, they were at a loss, well, they were very eager to look at corruption as a cause, because they didn’t have another solution.”

(Btw, I think the DSF state would be a suitable addition to pathological DBA problems list which Gaja once started with CTD)

The other way to diagnose problems is to follow a systematic approach, an appropriate methodology based on knowledge how computers work.

Read more…

  • Share/Bookmark

Tanel Poder Internals, Oracle, Performance, Troubleshooting, Unix/Linux