SQL*Net message to client vs SQL*Net more data to client

After my last post about SQL*Net message to client wait event I had a follow-up question about what’s the difference between SQL*Net message to client and SQL*Net more data to client wait events. I’ll post the answer here:

The first session data unit (SDU) bufferful of return data is written to TCP socket buffer under “SQL*Net message to client” wait event.

If Oracle needs to write more data for a call than fits into the “first” SDU buffer, then further writes for that call are done under SQL*Net more data to client event.

So, whether and how much of the “SQL*Net more data to client” vs “SQL*Net message to client” waits you see depends on two things:

  • Amount of data returned to client per call
  • Oracle Net SDU size

A quick test with my Snapper performance tool illustrates this:

In one session run following:

SQL> set arraysize 1
SQL>
SQL> select rownum from dba_source;

    ROWNUM
----------
         1
         2
         3
         4
         5
         6
...many rows returned...

And I measure the waits in the other session:

SQL> @snapper out,gather=w 5 1 141

-- Session Snapper v1.06 by Tanel Poder ( http://www.tanelpoder.com )

---------------------------------------------------------------------------------------------------------------------------------------------
HEAD,     SID, SNAPSHOT START   ,  SECONDS, TYPE, STATISTIC                               ,         DELTA,      D/SEC,     HDELTA,     HD/SEC
---------------------------------------------------------------------------------------------------------------------------------------------
DATA,     141, 20080210 14:59:55,        5, WAIT, SQL*Net message to client               ,          6750,       1350,     6.75ms,     1.35ms
DATA,     141, 20080210 14:59:55,        5, WAIT, SQL*Net message from client             ,       4668258,     933652,      4.67s,   933.65ms
--  End of snap 1

No SQL*Net more data to client is evidenced above as due the arraysize = 1 setting Oracle fetches only 2 rows at a time. Therefore the returned resultset per fetch call always fits into the first SDU sized packet. Note that the reason why Oracle fetches 2 rows when arraysize is 1 is due an OCI optimization which tries to be clever and proactively detect end-of-resultset conditions when fetching a single row – that way there’s a chance to automatically cancel a cursor and release its pins without an explicit cursor cancel or close call. This should have positive effect reducing shared pool fragmentation in some conditions.

Anyway, now I set arraysize to 5000 and run the same SQL again:

[Read more...]

SQL*Net message to client wait isn’t really what it’s thought to be

In a recent Oracle Forum thread a question came up how to use SQL*Net message to client wait events for measuring network latency between server and client. The answer is that you can’t use it for network latency measurements at all, due how TCP stack works and how Oracle uses it.

I’ll paste my answer here too, for people who don’t follow Oracle Forums:

As I wrote in that reply, “SQL*Net message to client” does NOT measure network latency! It merely measures how long it took to put the response message into TCP send buffer on the server!

Once the response packet is put into TCP send buffer, Oracle server process continues on and starts waiting for “SQL*Net message FROM client” again. It’s up to TCP stack to deliver this packet from this point and Oracle server process has no way for measuring directly when did the packet arrive (it would have to start intercepting TCP ACK packets at kernel level for that).

This behaviour also explains, why the “SQL*Net message TO client” waits are usually unbelievably short – like 1 microsecond etc. The light in vacuum travels only 300 meters in one microsecond, the signal in a cable travels much less – yet when sending a single packet from London to Hong-Kong, we see few microsecond “response times”…

Example:

solaris02$ fgrep “SQL*Net message to client” sol10g_ora_837.trc | head
WAIT #8: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418611294
WAIT #7: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418644515
WAIT #5: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428803213
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428880945
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428927443
WAIT #5: nam=’SQL*Net message to client’ ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428973661
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429019250
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429066742
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429114761
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429162471

Note that you can see longer times spent waiting for “SQL*Net message to client” when sending large amounts of data. This happens when your TCP send buffer gets full, thus TCP stack can not accept further packets. Sending will be blocked until the remote site sends back further ACK packets which state up to which byte in the TCP transmission stream it has received the data.

So, if you’re sending loads of data over a slow link or misconfigured TCP connection, the “SQL*Net message to client” wait time can be used as a low-confidence indicator of your SQL*Net throughput (in conjuction with “bytes sent via SQL*Net to client”), but never a measure of network latency!

Note that this is not OS platform specific, this is just how TCP and SQL*Net work.

Oracle hidden costs revealed, part 1 – Does a batch job run faster when executed locally?

This series is about revealing some Oracle’s internal execution costs and inefficiencies. I will analyze few situations and special cases where you can experience a performance hit where you normally wouldn’t expect to.

The first topic is about a question I saw in a recent Oracle Forum thread.

The question goes like this: “Is there any benefit if I run long sql queries from the server (by using telnet,etc) or from the remote by sql client.”

In order to leave out the network transfer cost of resultset for simplicity, I will rephrase the question like that: “Do I get better performance when I execute my server-side batch jobs (which don’t return any data to client) locally from the database server versus a remote application server or workstation?”

The obvious answer would be “NO, it does not matter where from you execute your batch job, as Oracle is a client server database system. All execution is done locally regardless of the client’s location, thus the performance is the same”.

While this sounds plausible in theory, there is (at least) one practical issue which can affect Oracle server performance depending on the clients platform and client libaries version.

It is caused by regular in-band break checking in client server communication channel where out of band break signalling is not available. A test case is below:

[Read more...]

Excellent article on Oracle 11g PL/SQL function result cache

I have so far avoided writing such pointer blog posts which only refer you to another article, but I have to do it with this one. Adrian Billington has written an excellent article on performance of Oracle 11g PL/SQL function result cache. His article is a good example of a thorough, well organized and well written technical content. I really enjoyed reading it and thanks to his thoroughness, he has just saved me some precious time doing that research on my own.

Get the article here: http://www.oracle-developer.net/display.php?id=504