In a recent Oracle Forum thread a question came up how to use SQL*Net message to client wait events for measuring network latency between server and client. The answer is that you can’t use it for network latency measurements at all, due how TCP stack works and how Oracle uses it.
I’ll paste my answer here too, for people who don’t follow Oracle Forums:
As I wrote in that reply, “SQL*Net message to client” does NOT measure network latency! It merely measures how long it took to put the response message into TCP send buffer on the server!
Once the response packet is put into TCP send buffer, Oracle server process continues on and starts waiting for “SQL*Net message FROM client” again. It’s up to TCP stack to deliver this packet from this point and Oracle server process has no way for measuring directly when did the packet arrive (it would have to start intercepting TCP ACK packets at kernel level for that).
This behaviour also explains, why the “SQL*Net message TO client” waits are usually unbelievably short – like 1 microsecond etc. The light in vacuum travels only 300 meters in one microsecond, the signal in a cable travels much less – yet when sending a single packet from London to Hong-Kong, we see few microsecond “response times”…
Example:
solaris02$ fgrep “SQL*Net message to client” sol10g_ora_837.trc | head
WAIT #8: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418611294
WAIT #7: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418644515
WAIT #5: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428803213
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428880945
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428927443
WAIT #5: nam=’SQL*Net message to client’ ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428973661
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429019250
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429066742
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429114761
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429162471
Note that you can see longer times spent waiting for “SQL*Net message to client” when sending large amounts of data. This happens when your TCP send buffer gets full, thus TCP stack can not accept further packets. Sending will be blocked until the remote site sends back further ACK packets which state up to which byte in the TCP transmission stream it has received the data.
So, if you’re sending loads of data over a slow link or misconfigured TCP connection, the “SQL*Net message to client” wait time can be used as a low-confidence indicator of your SQL*Net throughput (in conjuction with “bytes sent via SQL*Net to client”), but never a measure of network latency!
Note that this is not OS platform specific, this is just how TCP and SQL*Net work.



Thanx for the explanation!
Comment by Brian Tkatch — February 7, 2008 @ 11:24 pm
What impact does the TCP.NODELAY option have on the sqlnet message to|from client response times?
Comment by Reto — February 8, 2008 @ 4:39 am
Disabling Nagle’s algorithm by setting TCP.NODELAY=YES should not make any difference to SQL*Net message TO client, at least not directly. Nagle’s algorithm affects when the TCP packet is sent from TCP buffer to the wire, but Oracle’s “SQL*Net message TO client” measures only the time spent putting the packet into TCP buffer.
Disabling Nagle’s algorithm does help as far as “SQL*Net message FROM client” is considered. Enabled Nagle’s algorithm in conjuction with delayed ACK algorithm (which both were designed to reduce network traffic due “send 1-character at a time” type applications like telnet) do increase the total application layer packet roundtrip time. They delay outward packet sending and ACK returning in hope of piggybacking multiple small packets into a full size TCP-segment).
There’s loads of info about Nagle’s algorithm and ACK delay in internet. One Oracle specific article is here:
http://www.miracleas.dk/tools/Miracle_4_nagle.pdf
Comment by tanelp — February 8, 2008 @ 4:13 pm
Hi Tanel,
GREAT NOte!
The “SQL*Net message to client” is not the exact same event as “SQL*Net more data to client”, right?
Can you explain the distinction?
Comment by Donald K. Burleson — February 9, 2008 @ 8:19 am
[...] Filed under: Internals, Networking, Oracle, Performance — tanelp @ 3:35 pm After my last post about SQL*Net message to client wait event I had a follow-up question about what’s the difference between SQL*Net message to client and [...]
Pingback by SQL*Net message to client vs SQL*Net more data to client « Tanel Poder’s blog: Core IT for geeks and pros — February 10, 2008 @ 3:36 pm
Don,
I just wrote another post regarding your question, I hope it answers it.
Comment by tanelp — February 10, 2008 @ 3:39 pm
Thanks Tanel for interesting articles/coments.
Question on “SQL*Net message from Client” :
My database server is located in Data Center1. I have applications which run from Datacenter1(DC1) and DataCenter2(DC2) which access the DB in DC1. DC1 and DC2 are physically apart (over 100 miles).
A query from App1 (run from DC1) takes 5 minutes. The same query from App2 (run from DC2) takes 9 minutes. I ran extended trace and did TKprof on the 2 queries and found that the Parse/Execute/Fetch times are almost same. I found the difference of 4 minutes from the wait events. Both the queries had the same # of “SQL*Net message from client”, but the query from DC2 had 240 extra seconds on the Time waited.
Since DC1 and DC2 are physically apart, ping from DC2 to DC1 has a lag of 2.8 msec. This probably could explain the longer wait times(sql*net message from client) in case 2.
Now, the question is – How can I reduce the run times for the query run from DC2? Is it possible to tune this?
Does TDU, MTU, TCP.NODELAY or any of these setting help my situation ?
Appreciate your response
-Srav
Comment by Srav — July 16, 2008 @ 11:31 am
Hi Srav,
Yep, good diagnosis. If the parse,exec,fetch times are the same, the extra time likely comes from something external to database like network. It looks like the reports are returning lots of data.. right?
Before looking into the Oracle parameters above I would first check two things:
1) Whether the application is using too low fetch buffer size (arraysize). This info you can get from sql trace file (the r= column in FETCH lines). If its too low (like 1, 10 or even 100) you should try to change the application settings to increase the arraysize.
2) Is there a network transport throughput bottleneck? This can be due overloaded link somewhere between two endpoints but often it’s just default TCP send window size settings as OS level which constrain the throughput when network roundtrip time is high (on solaris there are kernel tunables tcp_xmit_hiwat/tcp_recv_hiwat for example)
I would approach the step 2 the way that I would measure what throughput I’m getting first. SCP is not a good option (if its OpenSSH based) as it fiddles around with TCP socket buffer sizes at application level. There are tools for measuring the max throughput, but I would start from sqlplus and oracle.
1) Start sqlplus and set arraysize to 5000
2) run a script which does select * from dba_source (and doesnt display it on screen – set termout off in the script)
3) take snapshots from v$sesstat of that session fetching to measure how many bytes per second your throughput is (you can do it using my Snapper tool)
If the bytes/s is significantly lower than what you’d expect from your network link then there might be some TCP tuning needed.
Comment by tanelp — July 16, 2008 @ 3:28 pm
Appreciate your advice.
I already did the arraysize analysis and as expected it did show improvements in the response time. Tremendously. But, as you know the developers are hesitant to make the code change at this time (guess, will have to force them to), but in the mean time I will try your 2nd recommendation.
So, do you do think SDU/TDU/MTU settings would help? I tried the tcp.nodelay but to noavail.
I love reading your articles/blog and find your responses very intelligent and efficient.
Thanks
Comment by Srav — July 17, 2008 @ 4:49 am
Hi Srav, thanks :)
TCP.NODELAY wouldnt help as it’s already enabled (Nagle’s buffering algorithm disabled) by default in recent Oracle versions (including 9i).
Tuning SDU TDU settings would only help if current settings are lower from their optimal value (which is their max values in my opinion).
But if I recall correctly the default values in recent Oracle versions already are set to their max (2kB and 32kB).
Check this doc, it has some relevant test results:
http://www.sun.com/blueprints/1002/817-0370-10.pdf
Comment by tanelp — July 17, 2008 @ 6:01 am
More about TCP.NODELAY – this buffering algorithm was designed to reduce number of network packets sent when a lot of small messages needed to be sent (like someone typing in telnet, you could pack multiple “keystrokes” into a single TCP packet).
In case of large network transfers this should not affect you as your packets will be packed full (this is unless you have ridiculously small SDU/TDU sizes..)
Comment by tanelp — July 17, 2008 @ 6:05 am
Thanks Tanel.
The current settings of SDU/TDU were set at default(2k for 9i). I increased them to 32k, but still did not see any difference.
The dev team will be making the change to the arraysize in their OCI calls.
Another comment on TCP.nodelay. The setting of YES did not help my case either.
Thanks for your help
Comment by Srav — July 19, 2008 @ 4:55 am
Glen Fawcett wrote a SUN Blue Print paper in 2003 (Sun Part # 817-1781-10) and he says the SDU and TDU must be sized according to the MTU for the interface. On SUN Solaris the default MTU size for the NIC is 1500 and the SDU and TDU should be set to integral values of the MTU setting for the NIC. For example for an MTU size of 1500, SDU and TDU can be set to 15000. This is recommend to reduce the number and ave time waited for SQLNet wait events. Glen also says it is best to set the TDU = SDU. I have tried this method and I see it decreased the number and ave time waited for the SQLNet waits by half.
Comment by Mike — January 16, 2009 @ 7:22 pm