Sqlplus is my second home, part 2: Running SQL scripts from remote locations using HTTP

As you probably already know, the Session Snapper has been designed to be a very easy-to-use performance tool. It is especially useful in database environments where there are no decent performance tools pre-installed and available.

Snapper doesn’t require any setup, all you need is to log on to the database using sqlplus and download snapper.sql script to your computer.

Well, actually the second part is not required, as Oracle sqlplus allows you to run scripts from http and ftp locations!

C:>sqlplus "sys/oracle@ora92 as sysdba"

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Aug 30 23:00:10 2007

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

SQL> @http://blog.tanelpoder.com/files/scripts/i.sql

Tanel's sqlplus http test...

http://blog.tanelpoder.com/files/scripts/i.sql

USER                           SYSDATE
------------------------------ ---------
SYS                            30-AUG-07

NAME
---------
ORA92

INSTANCE_NAME    HOST_NAME
---------------- ----------------------------------------------------------------
ora92            WINDOWS01

You should be *very* sure that noone can change the scripts on the server without your knowing!!!

SQL>

Cool stuff or what? :)

Let’s see how this relates to everyday DBA life…

[Read more...]

Operating systems are lazy allocating memory

There was a discussion about whether Oracle really allocates all memory for SGA immediately on instance startup or not. And further, whether Oracle allocates memory beyond the SGA_TARET if SGA_MAX_SIZE is larger than it.
It’s worth reading this thread first: http://forums.oracle.com/forums/thread.jspa?threadID=535400&tstart=0

I will paste an edited version of my reply to here as well:

[Read more...]

Advanced Oracle Troubleshooting Guide, Part 2: No magic is needed, systematic approach will do

There are two ways for diagnosing problems:

  1. Checking for usual suspects and hoping to find a matching one
  2. Following a systematic approach

Checking for usual suspects and hoping to find a matching one

The first approach relies on previous experience (both in particular subject area/technology and about the context/environment the problem occurs). For example if a patient comes to doctor complaining about pain in chest, then for doctor (and also for the patient) it would definitely be beneficial to know more relevant info about the patient – the context. If the patient had just fell off a 10-foot ladder, then it’d be more suitable to look for broken ribs. On the other hand, if the patient has been a long-time smoker and was watching TV on a couch when the pain started, then perhaps it’d be more suitable to start with an EKG (note that I’m not an expert on how human body works so should anybody complain about any pain in their chest to you, send them to real doctor immediately!)

Anyway, if you’ve been administering a database full time for last 5 years, you will probably know where to look immediately when a specific problem occurs. Note that I wrote “where to look immediately” here, not “what to change immediately”. Using previous experience to identify root causes of problems is obviously a perfectly valid approach which may get you to the solution very fast (again and again) – but it can work well only if you do have lots of previous experience in solving problems for that technology and that particular environment (in other words, you know the context). The big risk here is that if a “new” problem expresses itself in similar symptoms like the “old-and-well-known-problem”, we could easily end up looking for and fixing the wrong issue. And if that doesn’t work, continuing to try out a solution which helped with another problem last year. And if that doesn’t help then the cycle continues, we dig up even more unlikely fixes which have been useful once in past and we apply them. And then we resort to googling and trying out whatever solutions anyone has suggested for problems others have experienced.

We have ended up in Desperate Switch Flipping state. Usually this leads to flipping even larger switches, starting from Oracle session/instance parameters and adding random SQL hints up to restarting servers, upgrading databases, operating systems, hardware – usually without any luck. This wastes time, doesn’t solve our problem and may cause even more trouble. This is not good. Checking out the usual suspects may help solving common recurring problems (hey, shouldn’t good specialists avoid recurring problems rather than fixing them again and again?) . However it is very important to draw the line between checking for usual suspects and falling into DSF state, as looks like happened to someone at Oracle-L.

Quoting:

“I ran Statspack reports at the highest level of detail until I was blue in the face. I ran traces. I set events. But I also am by nature intuitive and tend often to use intuition to solve a problem with facts to back up my intuitive conclusion. So after providing all of this stuff to Oracle Support, they were at a loss, well, they were very eager to look at corruption as a cause, because they didn’t have another solution.”

(Btw, I think the DSF state would be a suitable addition to pathological DBA problems list which Gaja once started with CTD)

The other way to diagnose problems is to follow a systematic approach, an appropriate methodology based on knowledge how computers work.

[Read more...]

Oracle Session Snapper, part 2: Getting most out of Snapper

The main design goal of Session Snapper was that it should not require any changes to be made into database.
And to achieve this goal, I was even willing to sacrifice some functionality.

So, for example there is no sorting capability in Snapper output. It would have been easy to create an SQL Type to database, use that as session statistics storage and query results out using an order by on statistics delta column – giving you (probably) most significant resource consumers first. But I didn’t do this as it would have violated the no-database-change-whatsoever design goal. (This problem could however be solved using manual sorting in PL/SQL code as done in Adrian Billington’s variation of runstats utility: http://www.oracle-developer.net/utilities.php )

The second design goal was that snapper should work with as little privileges as possible. So far it requires execute access only on DBMS_LOCK, DBMS_OUTPUT and read access on few V$ views.

However if you run the snapper using “out” option then it will use DBMS_OUTPUT.PUT_LINE for sending data back to the client. Unfortunately the client (SQLPlus in this case) calls DBMS_OUTPUT.GET_LINES to retrieve the output only when the previous database call has finished. Of course this makes sense, because otherwise some asynchronous call capability to Oracle server would be needed, which either has to interrupt the server process somehow for processing the GET_LINES, or another server process should be started for output feed.

Anyway, we don’t have such functionality in Oracle, so this means that if you run snapper in a loop, you will get the output only when the loop finishes and returns control to sqlplus. This is not good enough for having continuous real-time view of your session(s) performance.

Those who have looked into the header or sourcecode of my Session Snapper script have seen that if you use “trace” option and tail -f on that tracefile, you can get continuous real-time Snapper output to your screen (in addition to having that info saved in logfile). This functionality uses DBMS_SYSTEM.KSDWRT() procedure though. This means again that in order to get real-time output, you need to either grant execute priv on DBMS_SYSTEM to your monitoring user or run Snapper as SYS. Both those options violate the design goals of Snapper and may be unacceptable in real life production systems.

So, to combat that, I wrote a script which still gives us real-time Snapper output without need to access DBMS_SYSTEM.

[Read more...]

Oracle 11g internals part 1: Automatic Memory Management

This is my attempt for getting cheap popularity out of recent Oracle 11g release. This is not going to be another Oracle 11g new features list, I’ll be just posting any of my research findings here, in a semi-organized way.

The first post is is about Automatic Memory Management. AMM manages all SGA + PGA memory together, allowing it to shift memory from SGA to PGAs and vice versa. You only need to set a MEMORY_TARGET (and if you like, MEMORY_MAX_TARGET parameter).

You can read rest of the general details from documentation, I will talk about how this feature has been implemented on OSD / OS level (or at least how it looks to be implemented).

When I heard about MEMORY_TARGET , then the first question that came into my mind was that how can Oracle shift shared SGA memory to private PGA memory on Unix? This would mean somehow deallocating space from existing SGA shared memory segment and releasing it for PGA use. To my knowledge the traditional SysV SHM interface is not that flexible that it could downsize and release memory from a single shared memory segment. So I started checking out how Oracle had implemented this.

[Read more...]

Oracle Session Snapper – real-time session-level performance stats for DBAs

A post by Jonathan Lewis inspired me to finally complete my version of the Oracle session performance snapper script, which main characteristics are

  1. it reports Oracle session level performance counter and wait information in real time
  2. it does NOT require any database objects to be created

If you are a DBA or consultant working on ad-hoc performance issues, you will like it!

Are you familiar with following situation?

(Monday morning)

Customer: Hey, we need your help! We have serious performance issues in our production environment. It’s a database with x000 online users, several parallel data feeds, continuous batch jobs and reporting going on.
Users have started experiencing occasional bad performance and some batch jobs as well.

You: Ok, lets see it. Do you have any performance monitoring tools installed?

Customer: Yes, we have Xxxxx Xxxxxxxxx installed which shows us a nice green or red light on big screen depending whether there are any problems or not.

You: What color is it showing now?

Customer: Green

You: But you still have the performance problem you described earlier?

Customer: Yes

You: Do you have ANY performance monitoring tools installed?

Customer: Well, we also have a statspack snapshot taken every morning and evening to capture the business workload.

You: Um… Ok… you know forget about it. Can I create a small package to capture some useful performance info on the problematic sessions?

Customer: Yes, but we need to put the DDL scripts through the change review board which gathers every Thursday… but we can’t wait that long!

You: Can we enable tracing?

Customer: Enabling tracing is a change and all changes must go through review board. Also, we don’t really know which exact sessions are affected, it just happens for seemingly random ones… and we can’t just trace every session, can we?

You: Ok, give me a sqlplus window and Excel, we’ll figure something out.

…And now follows a tedious manual SQL execution and copy & paste exercise from various V$ views for getting some meaningful performance information out of Oracle.

Well, not anymore, because The Oracle Session Snapper is in town!

If you know vmstat for Unix, you know it reports you various system level statistic counter deltas over a period you choose.

Well, the Oracle Session Snapper output looks somewhat similar, but it reports you session level deltas of Oracle v$sesstat counters, wait events and starting from 10g also session time model statistics.

All info can be reported in real time, without a need for running and timing multiple SQL scripts and manual calculation of deltas.

And the key unique point of the Session Snapper is – it does not require creation of any database objects, thus no changes in the database at all! Everything is done from within a sqlplus script or anonymous PL/SQL block.
This means that you will be able to get quick session-level performance snapshots even in heavily change-controlled environments, where no object creation whatsoever is allowed without going through a long process.

You can read all the usage details from the script header, but here’s one example of its output:

Tanel@Sol01> @snapper out 1 3 475

-- Session Snapper v1.03 by Tanel Poder ( http://www.tanelpoder.com )

--------------------------------------------------------------------------------------------------------------------------------------------
--        SID, SNAPSHOT START   , SECONDS  , TYPE, STATISTIC                               ,         DELTA,      D/SEC,     HDELTA,   HD/SEC
--------------------------------------------------------------------------------------------------------------------------------------------
DATA,     475, 20070820 01:17:47,         1, STAT, session logical reads                   ,         88232,      88232,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, consistent gets                         ,         88233,      88233,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, consistent gets from cache              ,         88232,      88232,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, calls to get snapshot scn: kcmgss       ,           556,        556,        556       556
DATA,     475, 20070820 01:17:47,         1, STAT, no work - consistent read gets          ,         87677,      87677,     87.68k    87.68k
DATA,     475, 20070820 01:17:47,         1, STAT, table scans (short tables)              ,           139,        139,        139       139
DATA,     475, 20070820 01:17:47,         1, STAT, table scan rows gotten                  ,       7429598,    7429598,      7.43M     7.43M
DATA,     475, 20070820 01:17:47,         1, STAT, table scan blocks gotten                ,         87676,      87676,     87.68k    87.68k
DATA,     475, 20070820 01:17:47,         1, STAT, buffer is pinned count                  ,           138,        138,        138       138
--  End of snap 1
DATA,     475, 20070820 01:17:48,         1, STAT, session logical reads                   ,         87779,      87779,     87.78k    87.78k
DATA,     475, 20070820 01:17:48,         1, STAT, consistent gets                         ,         87772,      87772,     87.77k    87.77k
DATA,     475, 20070820 01:17:48,         1, STAT, consistent gets from cache              ,         87772,      87772,     87.77k    87.77k
DATA,     475, 20070820 01:17:48,         1, STAT, calls to get snapshot scn: kcmgss       ,           552,        552,        552       552
DATA,     475, 20070820 01:17:48,         1, STAT, no work - consistent read gets          ,         87210,      87210,     87.21k    87.21k
DATA,     475, 20070820 01:17:48,         1, STAT, table scans (short tables)              ,           138,        138,        138       138
DATA,     475, 20070820 01:17:48,         1, STAT, table scan rows gotten                  ,       7389897,    7389897,      7.39M     7.39M
DATA,     475, 20070820 01:17:48,         1, STAT, table scan blocks gotten                ,         87211,      87211,     87.21k    87.21k
DATA,     475, 20070820 01:17:48,         1, STAT, buffer is pinned count                  ,           136,        136,        136       136
--  End of snap 2
DATA,     475, 20070820 01:17:49,         1, STAT, session logical reads                   ,         87580,      87580,     87.58k    87.58k
DATA,     475, 20070820 01:17:49,         1, STAT, consistent gets                         ,         87587,      87587,     87.59k    87.59k
DATA,     475, 20070820 01:17:49,         1, STAT, consistent gets from cache              ,         87587,      87587,     87.59k    87.59k
DATA,     475, 20070820 01:17:49,         1, STAT, calls to get snapshot scn: kcmgss       ,           552,        552,        552       552
DATA,     475, 20070820 01:17:49,         1, STAT, no work - consistent read gets          ,         87046,      87046,     87.05k    87.05k
DATA,     475, 20070820 01:17:49,         1, STAT, table scans (short tables)              ,           138,        138,        138       138
DATA,     475, 20070820 01:17:49,         1, STAT, table scan rows gotten                  ,       7375781,    7375781,      7.38M     7.38M
DATA,     475, 20070820 01:17:49,         1, STAT, table scan blocks gotten                ,         87041,      87041,     87.04k    87.04k
DATA,     475, 20070820 01:17:49,         1, STAT, buffer is pinned count                  ,           137,        137,        137       137
--  End of snap 3

PL/SQL procedure successfully completed.

The output contains 3 x 1 second snapshot of session 475 doing heavy nested looping. Note that even though the CPU time used was not updated, the logical IO counts for that session had still increased.
So this tool can be very valuable diagnosing what’s going on when the session seems to be 100% on CPU doing something.

[Read more...]

Sqlplus is my second home, Part 1: HTMLizing your sqlplus output

I have not managed to post anything for a while, but I intend to make it up by starting this series of posts made specially for Oracle enthusiasts, geeks and maniacs among us. Here I plan to post the coolest Oracle stuff I’ve just found out and some of it may actually be useful to you!

Lets start. This post is about removing the last major problem with sqlplus in everyday database and application administration work.

I would say the commnd line sqlplus, combined with its script execution and Windows cmd.exe’s command history navigation capabilities, is a very powerful and fast tool for database administration and troubleshooting. This is of course if you use a set of database administration scripts, either downloaded from some reliable source or accumulated over the years of working with Oracle ( you do have such scripts, right? ;-)

Now to the main weakness of sqlplus: I may have scripts carefully formatted for my screen size, however when adding more columns, I run out of screen width. When working on an application data quality troubleshooting task, I need to run some quick ad-hoc queries. Or run a query which just returns lots of data.

What usually happens in such cases is illustrated very well with the output of following query:

[Read more...]