SQL*Net message to client wait isn’t really what it’s thought to be

February 7th, 2008

In a recent Oracle Forum thread a question came up how to use SQL*Net message to client wait events for measuring network latency between server and client. The answer is that you can’t use it for network latency measurements at all, due how TCP stack works and how Oracle uses it.

I’ll paste my answer here too, for people who don’t follow Oracle Forums:

As I wrote in that reply, “SQL*Net message to client” does NOT measure network latency! It merely measures how long it took to put the response message into TCP send buffer on the server!

Once the response packet is put into TCP send buffer, Oracle server process continues on and starts waiting for “SQL*Net message FROM client” again. It’s up to TCP stack to deliver this packet from this point and Oracle server process has no way for measuring directly when did the packet arrive (it would have to start intercepting TCP ACK packets at kernel level for that).

This behaviour also explains, why the “SQL*Net message TO client” waits are usually unbelievably short – like 1 microsecond etc. The light in vacuum travels only 300 meters in one microsecond, the signal in a cable travels much less – yet when sending a single packet from London to Hong-Kong, we see few microsecond “response times”…

Example:

solaris02$ fgrep “SQL*Net message to client” sol10g_ora_837.trc | head
WAIT #8: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418611294
WAIT #7: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418644515
WAIT #5: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428803213
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428880945
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428927443
WAIT #5: nam=’SQL*Net message to client’ ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428973661
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429019250
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429066742
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429114761
WAIT #5: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429162471

Note that you can see longer times spent waiting for “SQL*Net message to client” when sending large amounts of data. This happens when your TCP send buffer gets full, thus TCP stack can not accept further packets. Sending will be blocked until the remote site sends back further ACK packets which state up to which byte in the TCP transmission stream it has received the data.

So, if you’re sending loads of data over a slow link or misconfigured TCP connection, the “SQL*Net message to client” wait time can be used as a low-confidence indicator of your SQL*Net throughput (in conjuction with “bytes sent via SQL*Net to client”), but never a measure of network latency!

Note that this is not OS platform specific, this is just how TCP and SQL*Net work.

  • Share/Bookmark

Tanel Poder
Internals, Performance, Troubleshooting

  1. Trackbacks

  1. Comments

  2. Brian Tkatch
    February 7th, 2008 at 23:24 | #1

    Thanx for the explanation!

  3. Reto
    February 8th, 2008 at 04:39 | #2

    What impact does the TCP.NODELAY option have on the sqlnet message to|from client response times?

  4. February 8th, 2008 at 16:13 | #3

    Disabling Nagle’s algorithm by setting TCP.NODELAY=YES should not make any difference to SQL*Net message TO client, at least not directly. Nagle’s algorithm affects when the TCP packet is sent from TCP buffer to the wire, but Oracle’s “SQL*Net message TO client” measures only the time spent putting the packet into TCP buffer.

    Disabling Nagle’s algorithm does help as far as “SQL*Net message FROM client” is considered. Enabled Nagle’s algorithm in conjuction with delayed ACK algorithm (which both were designed to reduce network traffic due “send 1-character at a time” type applications like telnet) do increase the total application layer packet roundtrip time. They delay outward packet sending and ACK returning in hope of piggybacking multiple small packets into a full size TCP-segment).

    There’s loads of info about Nagle’s algorithm and ACK delay in internet. One Oracle specific article is here:

    http://www.miracleas.dk/tools/Miracle_4_nagle.pdf

  5. February 9th, 2008 at 08:19 | #4

    Hi Tanel,

    GREAT NOte!

    The “SQL*Net message to client” is not the exact same event as “SQL*Net more data to client”, right?

    Can you explain the distinction?

  6. February 10th, 2008 at 15:39 | #5

    Don,

    I just wrote another post regarding your question, I hope it answers it.

  7. Srav
    July 16th, 2008 at 11:31 | #6

    Thanks Tanel for interesting articles/coments.
    Question on “SQL*Net message from Client” :

    My database server is located in Data Center1. I have applications which run from Datacenter1(DC1) and DataCenter2(DC2) which access the DB in DC1. DC1 and DC2 are physically apart (over 100 miles).

    A query from App1 (run from DC1) takes 5 minutes. The same query from App2 (run from DC2) takes 9 minutes. I ran extended trace and did TKprof on the 2 queries and found that the Parse/Execute/Fetch times are almost same. I found the difference of 4 minutes from the wait events. Both the queries had the same # of “SQL*Net message from client”, but the query from DC2 had 240 extra seconds on the Time waited.

    Since DC1 and DC2 are physically apart, ping from DC2 to DC1 has a lag of 2.8 msec. This probably could explain the longer wait times(sql*net message from client) in case 2.

    Now, the question is – How can I reduce the run times for the query run from DC2? Is it possible to tune this?
    Does TDU, MTU, TCP.NODELAY or any of these setting help my situation ?

    Appreciate your response
    -Srav

  8. July 16th, 2008 at 15:28 | #7

    Hi Srav,

    Yep, good diagnosis. If the parse,exec,fetch times are the same, the extra time likely comes from something external to database like network. It looks like the reports are returning lots of data.. right?

    Before looking into the Oracle parameters above I would first check two things:

    1) Whether the application is using too low fetch buffer size (arraysize). This info you can get from sql trace file (the r= column in FETCH lines). If its too low (like 1, 10 or even 100) you should try to change the application settings to increase the arraysize.

    2) Is there a network transport throughput bottleneck? This can be due overloaded link somewhere between two endpoints but often it’s just default TCP send window size settings as OS level which constrain the throughput when network roundtrip time is high (on solaris there are kernel tunables tcp_xmit_hiwat/tcp_recv_hiwat for example)

    I would approach the step 2 the way that I would measure what throughput I’m getting first. SCP is not a good option (if its OpenSSH based) as it fiddles around with TCP socket buffer sizes at application level. There are tools for measuring the max throughput, but I would start from sqlplus and oracle.

    1) Start sqlplus and set arraysize to 5000
    2) run a script which does select * from dba_source (and doesnt display it on screen – set termout off in the script)
    3) take snapshots from v$sesstat of that session fetching to measure how many bytes per second your throughput is (you can do it using my Snapper tool)

    If the bytes/s is significantly lower than what you’d expect from your network link then there might be some TCP tuning needed.

  9. Srav
    July 17th, 2008 at 04:49 | #8

    Appreciate your advice.

    I already did the arraysize analysis and as expected it did show improvements in the response time. Tremendously. But, as you know the developers are hesitant to make the code change at this time (guess, will have to force them to), but in the mean time I will try your 2nd recommendation.

    So, do you do think SDU/TDU/MTU settings would help? I tried the tcp.nodelay but to noavail.

    I love reading your articles/blog and find your responses very intelligent and efficient.

    Thanks

  10. July 17th, 2008 at 06:01 | #9

    Hi Srav, thanks :)

    TCP.NODELAY wouldnt help as it’s already enabled (Nagle’s buffering algorithm disabled) by default in recent Oracle versions (including 9i).

    Tuning SDU TDU settings would only help if current settings are lower from their optimal value (which is their max values in my opinion).

    But if I recall correctly the default values in recent Oracle versions already are set to their max (2kB and 32kB).

    Check this doc, it has some relevant test results:

    http://www.sun.com/blueprints/1002/817-0370-10.pdf

  11. July 17th, 2008 at 06:05 | #10

    More about TCP.NODELAY – this buffering algorithm was designed to reduce number of network packets sent when a lot of small messages needed to be sent (like someone typing in telnet, you could pack multiple “keystrokes” into a single TCP packet).

    In case of large network transfers this should not affect you as your packets will be packed full (this is unless you have ridiculously small SDU/TDU sizes..)

  12. Srav
    July 19th, 2008 at 04:55 | #11

    Thanks Tanel.
    The current settings of SDU/TDU were set at default(2k for 9i). I increased them to 32k, but still did not see any difference.

    The dev team will be making the change to the arraysize in their OCI calls.

    Another comment on TCP.nodelay. The setting of YES did not help my case either.

    Thanks for your help

  13. Mike
    January 16th, 2009 at 19:22 | #12

    Glen Fawcett wrote a SUN Blue Print paper in 2003 (Sun Part # 817-1781-10) and he says the SDU and TDU must be sized according to the MTU for the interface. On SUN Solaris the default MTU size for the NIC is 1500 and the SDU and TDU should be set to integral values of the MTU setting for the NIC. For example for an MTU size of 1500, SDU and TDU can be set to 15000. This is recommend to reduce the number and ave time waited for SQLNet wait events. Glen also says it is best to set the TDU = SDU. I have tried this method and I see it decreased the number and ave time waited for the SQLNet waits by half.

  14. September 10th, 2009 at 17:36 | #13

    Hi! I was surfing and found your blog post… nice! I love your blog. :) Cheers! Sandra. R.

  15. November 20th, 2009 at 12:10 | #14

    Sign: dgcph Hello!!! mxzsb and 8912htudqtpfqb and 9065 : Hi! I was surfing and found your blog post! nice! I just came across your blog and wanted to say that Ive really enjoyed it.

  16. December 15th, 2009 at 07:57 | #15

    In all lines, #bytes=1. So why bytes=1?

    WAIT #7: nam=’SQL*Net message to client’ ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418644515
    WAIT #5: nam=’SQL*Net message to client’ ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428803213
    W

  17. Amos
    December 26th, 2009 at 01:33 | #16

    Tanel,
    sql*net message to client and from client are the two top wait events in one of my SQL trace file. In the TKPROF formatted trace file, I have the following:
    Event waited on Times Max. Wait Total Waited
    —————————————- Waited ———- ————
    SQL*Net message to client 37776 0.00 0.11
    SQL*Net message from client 37776 0.56 21.76

    What’s the relationship between “Total Waited” and “Max Waited”? Is either the 0.56 seconds or 21.76 seconds (?) a measurement of one single execution?

    Thanks,
    Amos

  18. December 26th, 2009 at 10:09 | #17

    These numbers are measured across all databases calls, in other wrods things like parses, executions and fetches too (!) of that statement.

    So even if you execute once but fetch 10 times, the total waited will be the sum, the max waited will be maximum time waited during/between database calls.

  19. Amos
    December 31st, 2009 at 15:22 | #18

    Hi Tanel,
    Thanks for your reply.
    Do you know your 2010 US seminars schedule?
    Happy New Year!
    Amos

  20. Marjan Koloski
    February 15th, 2010 at 02:43 | #19

    Hi Tanel,
    on our production database Oracle Enterprise Edition 11.1.0.7 64 bit on Windows server 2003 due to the
    bug 6319685 LGWR posts do not scale on some platforms, we enabled the fix for this bug
    with setting undocumented parameter “_FG_SYNC_SLEEP_USECS”=1. Once we enable this parameter problem with the log file sync is solved but now we have the wait event SQL*Net message to client which is 20% of the total wait time.
    How SQL*Net message to client wait event is connected with the parameter _FG_SYNC_SLEEP_USECS ? How can we tune this ?
    Thanks
    Marjan

  21. February 15th, 2010 at 06:39 | #20

    @Marjan Koloski
    Hi Marjan,

    Can you run my snapper script (“snapper out 10 1 &sid”) on one of these sessions and paste me the output, so I’d have better overview of what’s going on. Perhaps the problem isn’t the SQL*Net message to client itself, but an instrumentation bug with this workaround.

  22. Marjan Koloski
    February 17th, 2010 at 17:12 | #21

    It was false alarm. Wrong presentation on the Quest Performance Analysis. In fact setting this parameter help with removing the log file sync wait events completely and reduced the oracle cpu utilization by 40%. In the top 5 wait events redo waits were replaced with network waits :) so we tought that we are hitting some side effect of the change. After comparing the AWR reports before and after the change we realized that there is no change in the network waits.
    Thanks and Best Regards
    Marjan

  23. Luis
    March 23rd, 2010 at 10:25 | #22

    Hi Tanel,
    where can I get you snapper script ?

  24. March 27th, 2010 at 21:29 | #23

    Luis, you can download Snapper here:

    http://tech.e2sn.com/oracle-scripts-and-tools/