SQL*Net message to client wait isn’t really what it’s thought to be

In a recent Oracle Forum thread a question came up how to use SQL*Net message to client wait events for measuring network latency between server and client. The answer is that you can’t use it for network latency measurements at all, due how TCP stack works and how Oracle uses it.

I’ll paste my answer here too, for people who don’t follow Oracle Forums:

As I wrote in that reply, “SQL*Net message to client” does NOT measure network latency! It merely measures how long it took to put the response message into TCP send buffer on the server!

Once the response packet is put into TCP send buffer, Oracle server process continues on and starts waiting for “SQL*Net message FROM client” again. It’s up to TCP stack to deliver this packet from this point and Oracle server process has no way for measuring directly when did the packet arrive (it would have to start intercepting TCP ACK packets at kernel level for that).

This behaviour also explains, why the “SQL*Net message TO client” waits are usually unbelievably short – like 1 microsecond etc. The light in vacuum travels only 300 meters in one microsecond, the signal in a cable travels much less – yet when sending a single packet from London to Hong-Kong, we see few microsecond “response times”…

Example:

solaris02$ fgrep “SQL*Net message to client” sol10g_ora_837.trc | head
WAIT #8: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418611294
WAIT #7: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418644515
WAIT #5: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428803213
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428880945
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428927443
WAIT #5: nam=’SQL*Net message to client’ ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428973661
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429019250
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429066742
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429114761
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429162471

Note that you can see longer times spent waiting for “SQL*Net message to client” when sending large amounts of data. This happens when your TCP send buffer gets full, thus TCP stack can not accept further packets. Sending will be blocked until the remote site sends back further ACK packets which state up to which byte in the TCP transmission stream it has received the data.

So, if you’re sending loads of data over a slow link or misconfigured TCP connection, the “SQL*Net message to client” wait time can be used as a low-confidence indicator of your SQL*Net throughput (in conjuction with “bytes sent via SQL*Net to client”), but never a measure of network latency!

Note that this is not OS platform specific, this is just how TCP and SQL*Net work.

Oracle hidden costs revealed, part 1 – Does a batch job run faster when executed locally?

This series is about revealing some Oracle’s internal execution costs and inefficiencies. I will analyze few situations and special cases where you can experience a performance hit where you normally wouldn’t expect to.

The first topic is about a question I saw in a recent Oracle Forum thread.

The question goes like this: “Is there any benefit if I run long sql queries from the server (by using telnet,etc) or from the remote by sql client.”

In order to leave out the network transfer cost of resultset for simplicity, I will rephrase the question like that: “Do I get better performance when I execute my server-side batch jobs (which don’t return any data to client) locally from the database server versus a remote application server or workstation?”

The obvious answer would be “NO, it does not matter where from you execute your batch job, as Oracle is a client server database system. All execution is done locally regardless of the client’s location, thus the performance is the same”.

While this sounds plausible in theory, there is (at least) one practical issue which can affect Oracle server performance depending on the clients platform and client libaries version.

It is caused by regular in-band break checking in client server communication channel where out of band break signalling is not available. A test case is below:

[Read more...]

Excellent article on Oracle 11g PL/SQL function result cache

I have so far avoided writing such pointer blog posts which only refer you to another article, but I have to do it with this one. Adrian Billington has written an excellent article on performance of Oracle 11g PL/SQL function result cache. His article is a good example of a thorough, well organized and well written technical content. I really enjoyed reading it and thanks to his thoroughness, he has just saved me some precious time doing that research on my own.

Get the article here: http://www.oracle-developer.net/display.php?id=504

Systematic application troubleshooting in Unix

How many times have you seen a following case, where a user or developer complains that their Oracle session is stuck or running very slowly and the person who starts investigating the issue does following:

  1. Checks the database for locks
  2. Checks free disk space
  3. Checks alert log
  4. Goes back to the client saying “we did a healthcheck and everything looks ok” and closes the case or asks the user/developer to contact application support team or tune their SQL

The point here is that what the heck do the database locks, alert log or disk space have to do with first round session troubleshooting, when Oracle provides just about everything you need in one simple view?

Yes, I am talking about sampling V$SESSION_WAIT here. Database locks, free space and potential errors in alert log may have something to do with your users problems, but not necessarily. As there are many more causes, like network issues etc which could affect your user (and the whole database), it doesn’t make sense to go through all those random “healthchecks” every time you receive a user phone call. Moreover, even if you identify that there is shortage of disk space or there are many database locks – so what? They may not have anything to do with the users problem.

The issue here is that still many people do not know about V$SESSION_WAIT which in most cases shows your problem immediately or at least points you to right direction (e.g. there’s no need to check for locks if your session is waiting on “log file switch (archiving needed)” wait – and vice versa). Even if “these people” have heard of V$SESSION_WAIT and may be able to drop this in during their job interview, they may not know how to use it in systematic troubleshooting context. Many hours of service downtime and user frustration would be saved if all DBAs knew this extremely simple concept of looking at V$SESSION_WAIT.

This blog entry is not about Oracle though, so I will leave this rant for a future blog post.

This post is about a similar problem in Unix world. Having been involved with resolving some serious production issues lately I have been surprised quite many times by the corporate Unix support people who seem to do behave in similar manner. For example, there is a user calling in saying that their scheduled Unix job, which normally takes 5 minutes, has been running for hours now. The “senior unix support analyst” will do following:

[Read more...]

How to resolve SQL object and column names all the way to base tables and columns in Oracle?

If you have been involved in tuning SQL code which you have never seen before, you are probably familiar with the challenges of understanding what the code is trying to do. This can be especially time consuming when the SQL references lots of views, which reference views, which reference more views etc. So there may be a large information gap between the SQL statement (like select * from some_crazy_10_level_view) and the actual execution plan (referencing 10s of tables, with evidence of query transformations).

So unless you see something really obvious from the execution plan, you need to start mapping the SQL query and view texts back to the physical base tables which Oracle eventually has to access. This can be a tedious and boring (!) process.

The good news is that in Oracle 10.2+ there’s a hidden parameter that can do this mapping task for us.

Update: It looks like 9.2.0.8 patchset also has this parameter (so I guess later patchsets of 10.1.0.x have it as well now)

Let’s see an example:

I create a view on a view to illustrate the point:

SQL> create view myview as select * from all_users;

View created.

Now let’s set that parameter _dump_qbc_tree to 1 and run a query against the view:

SQL> alter session set "_dump_qbc_tree"=1;

Session altered.

SQL> select count(*) from myview;

  COUNT(*)
----------
        31

Now let’s look into the server process tracefile:

*** ACTION NAME:() 2007-09-16 12:19:57.500
*** MODULE NAME:(SQL*Plus) 2007-09-16 12:19:57.500
*** SERVICE NAME:(SYS$USERS) 2007-09-16 12:19:57.500
*** SESSION ID:(146.1984) 2007-09-16 12:19:57.500
QCSDMP: -------------------------------------------------------
QCSDMP:  SELECT: (qbc=2B8D1C28)
QCSDMP:    . (COUNT(*)) (opntyp=2 opndty=0)
QCSDMP:  FROM:
QCSDMP:    .MYVIEW
QCSDMP:      VQB:
QCSDMP:        SELECT: (qbc=2B8D163C)
QCSDMP:          .USERNAME
QCSDMP:        FROM:
QCSDMP:          .ALL_USERS
QCSDMP:            VQB:
QCSDMP:              SELECT: (qbc=2B8CAF78)
QCSDMP:                U.NAME (USERNAME)
QCSDMP:              FROM:
QCSDMP:                SYS.TS$ (TTS)
QCSDMP:                SYS.TS$ (DTS)
QCSDMP:                SYS.USER$ (U)

Here it is, the query text generated directly from parse tree, showing the base tables regardless that they had been hidden behind multiple views.

Also there’s few interesting things to note:

[Read more...]

Session-level statspack

Statspack is a useful tool for easily gathering and reporting some Oracle’s historical workload statistics. However it has its limitations and problems:

One of them that in past it used to record only Oracle statistics, measured from inside Oracle. This made Oracle and people using Oracle ignore other crucial statistics like OS workload below Oracle. For example statspack may show you heavy latch contention as the performance problem, while looking at vmstat output one would see that the server just has been heavily overloaded with numerous other jobs (like multiple backup, export and compress jobs overrunning their run-window) and the latch contention is just a symptom of CPU starvation.

This has been somewhat addressed by adding V$OSSTAT to Oracle 10g and statspack now gathers this info as well. Now we need people to start looking into those stats.

Another problem with statspack is which is described in Dan Fink’s blog, about cursor-level CPU accounting limitations before 10gR2.

However neither of those problems are the main limitations of statspack. The main problem is that statspack samples it’s statistic snapshots from system level views like V$SYSSTAT and V$SYSTEM_EVENT. If every single session in your database is performing exactly the same (kind) of operation, then system level aggregates might be quite OK to get an overview what’s going on in the database. But from the moment you start having different kind of sessions (e.g. OLTP vs batch vs data feed vs report) in your database, then system level aggregates used by statspack do heavily distort what the reality looks like for specific sessions.

There’s another gotcha – by default statspack excludes the “SQL*Net message from client” wait event from main report section, stating that it is not important to look into idle events. However by that we will throw away an important piece of information when diagnosing end-to-end performance, the end user experience. The bad thing is that even if we did include the SQL*Net wait event in our reports (by deleting the corresponding row from STATS$IDLE_EVENT), then we would still have no idea how many of those gazillions of system-wide SQL*Net client wait seconds do belong to our specific session or user.
Statspack doesn’t measure session-experience, it just gives you a system-wide aggregate, which cannot be translated back to individual session statistics (just as you can not convert a hash value back to original value – most of the information is just lost!).

How to get session-level detailed overview of database performance, with historical reporting capability then?

Oracle 10g addresses this partially with ASH and AWR and DBMS_MONITOR’s selective statistic sampling (Search for V$SERV_MOD_ACT_STATS).

While ASH and AWR are both awesome products, they are not usable in Standard Edition nor before 10g ( you may want to check out Kyle Hailey’s ASH simulator instead ) and for using them you need to pay a separate license fee.

So let me introduce my attempt to solve those problems ( NB! Free stuff!!! ;):

  • The poor-man’s version of Automatic Workload Repository: It’s called… um… Semi-Automatic Workload Repository :)
  • The clever-man’s addition to statspack, which takes session-level snapshots: In other words, Sesspack.

The idea is very simple, just take snapshots of V$SESSION_EVENT and some V$SESSTAT (and V$SESS_TIME_MODEL in 10g) statistics and store those in a repository just like statspack does. Snapshots are taken using sesspack.snap_xyz procedures. As a parameter to these procedures I can pass a specific SID, a group of SIDs, a Oracle db username a OS-username or whatever filtering fields can be found from V$SESSION.

Once you have taken your snapshots when running the workload, you can run a report which calculates statistic and wait event deltas between snapshots and that it.

A demo?

[Read more...]

A gotcha with parallel index builds, parallel degree and query plans

Reading the following article about PARALLEL hint by Jonathan Lewis made me remember a somewhat related gotcha with parallelism.

Often when creating (or rebuilding) an index on a large table, doing it with PARALLEL x option makes it go faster – usually in case when your IO subsystem is not the bottleneck and you have enough spare CPU capacity to throw in.

A small example below:


Tanel@Sol01> create table t1 as select * from all_objects;

Table created.

Tanel@Sol01> create index i1 on t1(object_id);

Index created.

Tanel@Sol01> exec dbms_stats.gather_table_stats(user, 'T1', cascade=>true, no_invalidate=>false);

PL/SQL procedure successfully completed.

Ok, for whatever reason I need to rebuild my index, and for speed I do it in parallel:


Tanel@Sol01> alter index i1 rebuild parallel 4;

Index altered.

Tanel@Sol01>
Tanel@Sol01>
Tanel@Sol01> select
  2     sum(object_id)
  3  from
  4     t1
  5  where
  6     object_id > 60000
  7  /

SUM(OBJECT_ID)
--------------
      13233374

Tanel@Sol01>
Tanel@Sol01> @x

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------
Plan hash value: 3900446664

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     5 |     8   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |      |     1 |     5 |            |          |
|*  2 |   INDEX RANGE SCAN| I1   |  2923 | 14615 |     8   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID">60000)

14 rows selected.

The execution plan shows a nice serial range scan for above query. Let’s run the same query with a different value for object_id:


Tanel@Sol01> select
  2     sum(object_id)
  3  from
  4     t1
  5  where
  6     object_id > 10000
  7  /

SUM(OBJECT_ID)
--------------
    1294174783

Tanel@Sol01>
Tanel@Sol01> @x

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------
Plan hash value: 2596547647

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          |     1 |     5 |    19   (0)| 00:00:02 |        |      |            |
|   1 |  SORT AGGREGATE           |          |     1 |     5 |            |          |        |      |            |
|   2 |   PX COORDINATOR          |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)    | :TQ10000 |     1 |     5 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE        |          |     1 |     5 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR    |          | 42937 |   209K|    19   (0)| 00:00:02 |  Q1,00 | PCWC |            |
|*  6 |       INDEX FAST FULL SCAN| I1       | 42937 |   209K|    19   (0)| 00:00:02 |  Q1,00 | PCWP |            |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("OBJECT_ID">10000)

18 rows selected.

What? Suddenly my query turned parallel !

I haven’t enabled parallelism for my table! How can Oracle go parallel without my consent?

Tanel@Sol01> select table_name, degree from user_tables  where table_name = 'T1';

TABLE_NAME                     DEGREE
------------------------------ ----------------------------------------
T1                                      1

The answer lies in the result of next query:

Tanel@Sol01> select index_name, degree from user_indexes where table_name = 'T1';

INDEX_NAME                     DEGREE
------------------------------ ----------------------------------------
I1                             4

Parallel index (re)build will persistently set the index parallel degree in data dictionary to the value used during build!

This enables the CBO to be free to consider also parallel query plans and in our second select case a parallel plan seemed to be the best.

Whether this parallel plan actually is the most efficient way to go is a separate question, however this kind of unplanned parallelism may destabilize your system performance, especially as it can kick in only for certain instantiations of your SQL statement. Note that even one parallel-enabled object in your execution plan can parallelize the whole query joining many tables (just as even one table with statistics in a join turns on CBO for the whole cursor).

Combined with bind variable peeking side-effects and way too high parallel_max_servers value (hey it’s just a max value, let’s set it to 500), this can bring your OLTP system to knees at very unexpected times.

So, as my database does not have parallelism planned into it, I will eliminate the troublemaker:

Tanel@Sol01> alter index i1 noparallel;

Index altered.

Tanel@Sol01>
Tanel@Sol01> select
  2     sum(object_id)
  3  from
  4     t1
  5  where
  6     object_id > 10000
  7  /

SUM(OBJECT_ID)
--------------
    1294174783

Tanel@Sol01>
Tanel@Sol01> @x

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------
Plan hash value: 129980005

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |     5 |    19   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE       |      |     1 |     5 |            |          |
|*  2 |   INDEX FAST FULL SCAN| I1   | 42937 |   209K|    19   (0)| 00:00:02 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID">10000)

14 rows selected.

So the key point is that unless your databases have planned and managed parallelism used in them, it’s worth to run the following query to identify potential troublemakers and disable their parallelism:

SELECT
	'INDEX' OBJECT_TYPE, OWNER, INDEX_NAME, TRIM(DEGREE)
FROM
	DBA_INDEXES
WHERE
	TRIM(DEGREE) > TO_CHAR(1)
UNION ALL
SELECT
	'TABLE', OWNER, TABLE_NAME, TRIM(DEGREE)
FROM
	DBA_TABLES
WHERE
	TRIM(DEGREE) > TO_CHAR(1)
/

On my test environment it returned the following rows:

OBJEC OWNER                          INDEX_NAME                     TRIM(DEGREE)
----- ------------------------------ ------------------------------ -------------
INDEX SYS                            UTL_RECOMP_SORT_IDX1           DEFAULT
TABLE TANEL                          T                              4

From here we see two addtional things:

  • Parallel operations also persist their degree to tables (using alter table move parallel x or CTAS for example)
  • There’s a parallel degree DEFAULT – which is used when you let the appropriate degree to be decided by optimizer

On the other hand, if you have planned for parallelism, then you probably want to keep the parallelism for tables and indexes consistent, e.g. enable it for all tables requiring parallelism and their indexes, not just for couple of indexes by accident.

Update:
Thanks to Adrian Billington for reminding me that also NOLOGGING flag will stick in data dictionary should you perform nologging operations on an object. You should review those too after rebuilds and reorgs.

My version of SQL string to table tokenizer

This one’s a short post on a fairly random topic as unfortunately I don’t have time today to come up with anything deeper :)

I needed to come up with a delimited string to table tokenizer for an Oracle development project. There are quite a few examples out there how to do that, including Adrian Billington’s www.oracle-developer.net and the http://technology.amis.nl/blog/?p=1631.

So far the simplest solution I had seen was using a bunch of INSTR’s, SUBSTR’s and DECODE’s in a CONNECT BY loop.
However as this application is using Oracle 10g, I could use regular expressions to make the SQL even shorter:

Tanel@Sol01> define separator=":"
Tanel@Sol01> define mystring="A:BCD::EFG:H:IJKL"
Tanel@Sol01>
Tanel@Sol01> SELECT
  2             LEVEL,
  3             REGEXP_REPLACE(
  4                     REGEXP_SUBSTR( '&mystring'||'&separator', '(.*?)&separator', 1, LEVEL )
  5                     , '&separator$'
  6                     , ''
  7             ) TOKEN
  8  FROM
  9     DUAL
 10  CONNECT BY
 11     REGEXP_INSTR( '&mystring'||'&separator', '(.*?)&separator', 1, LEVEL ) > 0
 12  ORDER BY
 13     LEVEL ASC
 14  /

LEVEL TOKEN
----- ----------
    1 A
    2 BCD
    3
    4 EFG
    5 H
    6 IJKL

6 rows selected.

For anyone not yet familiar with regular expressions, I definitely recommend to get to know at least the basics – they can save you hours (or even days) of work in some cases. And the cool thing is that regular expressions are fairly standard across the development platforms too (e.g. if you know Oracle regexp, then it’s very easy to handle Java, perl or C# regexp too).

And regexp is not only a developer thing, it can be very useful for everyday production DBAs as well. For example, every decent “grep” utility has regexp support built in – this makes search for complex conditions from log or config files easier for you.