SQL*Net message to client wait isn’t really what it’s thought to be

In a recent Oracle Forum thread, the question came up of how to use the "SQL*Net message to client" wait event for measuring network latency between server and client. The answer is that you can't use it for network latency measurements at all, due to how the TCP stack works and how Oracle uses it.

I’ll paste my answer here too, for people who don’t follow Oracle Forums:

As I wrote in that reply, "SQL*Net message to client" does NOT measure network latency! It merely measures how long it took to put the response message into the TCP send buffer on the server!

Once the response packet has been put into the TCP send buffer, the Oracle server process moves on and starts waiting for "SQL*Net message FROM client" again. From that point on it is up to the TCP stack to deliver the packet, and the Oracle server process has no way to measure directly when the packet arrived (it would have to intercept TCP ACK packets at the kernel level for that).

This behaviour also explains why the "SQL*Net message TO client" waits are usually unbelievably short – 1 microsecond and the like. Light in a vacuum travels only about 300 meters in one microsecond (3 × 10^8 m/s × 10^-6 s = 300 m), and a signal in a cable travels even less – yet when sending a single packet from London to Hong Kong, we see few-microsecond "response times"…

Example:

solaris02$ fgrep "SQL*Net message to client" sol10g_ora_837.trc | head
WAIT #8: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418611294
WAIT #7: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418644515
WAIT #5: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428803213
WAIT #5: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428880945
WAIT #5: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428927443
WAIT #5: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428973661
WAIT #5: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429019250
WAIT #5: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429066742
WAIT #5: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429114761
WAIT #5: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16429162471
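
For reference, wait events like these appear in the raw trace file when level 8 SQL trace (i.e. with wait events) is enabled for the session. One common way to switch it on, assuming ALTER SESSION privileges, is the classic event 10046 syntax:

alter session set events '10046 trace name context forever, level 8';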

Note that you can see longer "SQL*Net message to client" waits when sending large amounts of data. This happens when your TCP send buffer gets full, so the TCP stack cannot accept further packets. Sending blocks until the remote side sends back further ACK packets stating up to which byte in the TCP transmission stream it has received the data.

So, if you're sending loads of data over a slow link or a misconfigured TCP connection, the "SQL*Net message to client" wait time can be used as a low-confidence indicator of your SQL*Net throughput (in conjunction with the "bytes sent via SQL*Net to client" statistic), but never as a measure of network latency!
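
For example, a rough way to eyeball both numbers for a single session. A minimal sketch, where &sid is just a placeholder for the session ID of interest:

select sid, total_waits, time_waited_micro
from   v$session_event
where  event = 'SQL*Net message to client'
and    sid   = &sid;

select sn.name, ss.value
from   v$sesstat ss, v$statname sn
where  ss.statistic# = sn.statistic#
and    ss.sid  = &sid
and    sn.name = 'bytes sent via SQL*Net to client';

Sampling these twice and dividing the byte delta by the elapsed seconds gives that rough throughput figure.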

Note that this is not OS platform specific; this is just how TCP and SQL*Net work.


30 Responses to SQL*Net message to client wait isn’t really what it’s thought to be

  1. Brian Tkatch says:

    Thanx for the explanation!

  2. Reto says:

    What impact does the TCP.NODELAY option have on the sqlnet message to|from client response times?

  3. tanelp says:

    Disabling Nagle's algorithm by setting TCP.NODELAY=YES should not make any difference to "SQL*Net message TO client", at least not directly. Nagle's algorithm affects when a TCP packet is sent from the TCP buffer to the wire, but Oracle's "SQL*Net message TO client" measures only the time spent putting the packet into the TCP buffer.

    Disabling Nagle's algorithm does help as far as "SQL*Net message FROM client" is concerned. Nagle's algorithm in conjunction with the delayed ACK algorithm (both were designed to reduce network traffic caused by "send one character at a time" applications like telnet) can increase the total application-layer packet roundtrip time: they delay outbound packet sending and ACK returning in the hope of piggybacking multiple small packets into a full-size TCP segment.
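
    For reference, a hedged sketch of where this is typically set: sqlnet.ora in recent releases (protocol.ora in old ones); verify the exact syntax against the Net Services reference for your version.

    # sqlnet.ora on the server -- disable Nagle's algorithm for TCP connections
    TCP.NODELAY = YES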

    There's loads of info about Nagle's algorithm and ACK delay on the internet. One Oracle-specific article is here:

    http://www.miracleas.dk/tools/Miracle_4_nagle.pdf

  4. Hi Tanel,

    GREAT Note!

    The “SQL*Net message to client” is not the exact same event as “SQL*Net more data to client”, right?

    Can you explain the distinction?

  5. tanelp says:

    Don,

    I just wrote another post regarding your question; I hope it answers it.

  6. Srav says:

    Thanks Tanel for the interesting articles/comments.
    A question on "SQL*Net message from client":

    My database server is located in Data Center1. I have applications which run from Datacenter1(DC1) and DataCenter2(DC2) which access the DB in DC1. DC1 and DC2 are physically apart (over 100 miles).

    A query from App1 (run from DC1) takes 5 minutes. The same query from App2 (run from DC2) takes 9 minutes. I ran extended trace and TKPROF on the two queries and found that the parse/execute/fetch times are almost the same. I found the 4-minute difference in the wait events: both queries had the same number of "SQL*Net message from client" waits, but the query from DC2 had 240 extra seconds of time waited.

    Since DC1 and DC2 are physically apart, a ping from DC2 to DC1 has a lag of 2.8 msec. This could probably explain the longer "SQL*Net message from client" wait times in the second case.

    Now, the question is: how can I reduce the run time for the query run from DC2? Is it possible to tune this?
    Would TDU, MTU, TCP.NODELAY or any of these settings help my situation?

    Appreciate your response
    -Srav

  7. tanelp says:

    Hi Srav,

    Yep, good diagnosis. If the parse/exec/fetch times are the same, the extra time likely comes from something external to the database, like the network. It looks like the reports are returning lots of data… right?

    Before looking into the Oracle parameters above I would first check two things:

    1) Whether the application is using too low a fetch buffer size (arraysize). This info you can get from the SQL trace file (the r= column in FETCH lines). If it's too low (like 1, 10 or even 100) you should try to change the application settings to increase the arraysize (see the SQL*Plus example right after this list).

    2) Is there a network transport throughput bottleneck? This can be due to an overloaded link somewhere between the two endpoints, but often it's just the default TCP send window size settings at the OS level which constrain the throughput when the network roundtrip time is high (on Solaris there are kernel tunables tcp_xmit_hiwat/tcp_recv_hiwat, for example).
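
    On the first point: in SQL*Plus the client-side knob looks like this (15 rows per fetch is the SQL*Plus default; 500 is just an illustrative value, and other client drivers have their own array/prefetch settings):

    SQL> set arraysize 500
    SQL> select owner, object_name from all_objects;

    With a larger arraysize the r= values on the FETCH lines in the trace file grow accordingly, and the number of "SQL*Net message from client" roundtrips drops.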

    I would approach step 2 by first measuring what throughput I'm actually getting. SCP is not a good option (if it's OpenSSH based) as it fiddles with TCP socket buffer sizes at the application level. There are dedicated tools for measuring the max throughput, but I would start with SQL*Plus and Oracle:

    1) Start SQL*Plus and set arraysize to 5000
    2) Run a script which does select * from dba_source (and doesn't display it on screen – set termout off in the script; see the sketch below)
    3) Take snapshots from v$sesstat of that fetching session to measure how many bytes per second your throughput is (you can do it using my Snapper tool)
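
    A minimal sketch of such a test script, assuming SELECT privileges on DBA_SOURCE (&tester_sid is a placeholder for the SID of the fetching session):

    -- throughput_test.sql: fetch a large result set without displaying it
    set arraysize 5000
    set termout off
    set feedback off
    spool /dev/null
    select * from dba_source;
    spool off
    set termout on

    -- meanwhile, from a second session, run this twice and divide the
    -- byte delta by the elapsed seconds between the samples:
    select ss.value as bytes_sent
    from   v$sesstat ss, v$statname sn
    where  ss.statistic# = sn.statistic#
    and    sn.name = 'bytes sent via SQL*Net to client'
    and    ss.sid  = &tester_sid;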

    If the bytes/s is significantly lower than what you’d expect from your network link then there might be some TCP tuning needed.

  8. Srav says:

    Appreciate your advice.

    I already did the arraysize analysis and, as expected, it showed improvements in the response time. Tremendous ones. But, as you know, the developers are hesitant to make the code change at this time (I guess I will have to force them to), so in the meantime I will try your 2nd recommendation.

    So, do you think the SDU/TDU/MTU settings would help? I tried TCP.NODELAY but to no avail.

    I love reading your articles/blog and find your responses very intelligent and efficient.

    Thanks

  9. tanelp says:

    Hi Srav, thanks :)

    TCP.NODELAY wouldn't help as it's already enabled (Nagle's buffering algorithm disabled) by default in recent Oracle versions (including 9i).

    Tuning the SDU/TDU settings would only help if the current settings are lower than their optimal value (which is their max value, in my opinion).

    But if I recall correctly, the default values in recent Oracle versions are already set to their max (2 kB and 32 kB).
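
    For completeness, a hedged sketch of how SDU would be set explicitly in a tnsnames.ora connect descriptor (host, port and service name are placeholders; the server side needs a matching SDU setting in its listener/sqlnet configuration):

    MYDB =
      (DESCRIPTION =
        (SDU = 32767)
        (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost01)(PORT = 1521))
        (CONNECT_DATA = (SERVICE_NAME = mydb))
      )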

    Check this doc, it has some relevant test results:

    http://www.sun.com/blueprints/1002/817-0370-10.pdf

  10. tanelp says:

    More about TCP.NODELAY – the buffering algorithm it disables (Nagle's) was designed to reduce the number of network packets sent when a lot of small messages need to be sent (like someone typing in telnet: you could pack multiple "keystrokes" into a single TCP packet).

    In the case of large network transfers this should not affect you, as your packets will be packed full (unless you have ridiculously small SDU/TDU sizes…).

  11. Srav says:

    Thanks Tanel.
    The current settings of SDU/TDU were at the default (2k for 9i). I increased them to 32k, but still did not see any difference.

    The dev team will be making the change to the arraysize in their OCI calls.

    Another comment on TCP.NODELAY: setting it to YES did not help my case either.

    Thanks for your help

  12. Mike says:

    Glen Fawcett wrote a Sun BluePrints paper in 2003 (Sun part # 817-1781-10) in which he says the SDU and TDU must be sized according to the MTU of the interface. On Sun Solaris the default MTU size for the NIC is 1500, and the SDU and TDU should be set to integral multiples of the NIC's MTU setting; for example, for an MTU size of 1500, SDU and TDU can be set to 15000. This is recommended to reduce the number of, and average time waited for, the SQL*Net wait events. Glen also says it is best to set TDU = SDU. I have tried this method and saw it cut the number of, and average time waited for, the SQL*Net waits in half.

  13. sandrar says:

    Hi! I was surfing and found your blog post… nice! I love your blog. :) Cheers! Sandra. R.

  14. Daniel, Wu says:

    In all these lines #bytes=1. Why is #bytes=1?

    WAIT #7: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16418644515
    WAIT #5: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=16428803213

  15. Amos says:

    Tanel,
    "SQL*Net message to client" and "from client" are the two top wait events in one of my SQL trace files. In the TKPROF-formatted output I have the following:
    Event waited on                          Times Waited   Max. Wait  Total Waited
    ---------------------------------------  ------------  ----------  ------------
    SQL*Net message to client                       37776        0.00          0.11
    SQL*Net message from client                     37776        0.56         21.76

    What's the relationship between "Total Waited" and "Max. Wait"? Is either the 0.56 seconds or the 21.76 seconds a measurement of one single execution?

    Thanks,
    Amos

  16. Tanel Poder says:

    These numbers are measured across all database calls – in other words, across the parses, executions and fetches (!) of that statement.

    So even if you execute once but fetch 10 times, the Total Waited will be the sum over all those calls, and the Max. Wait will be the maximum single time waited during/between database calls. In your case, for example, 21.76 seconds spread over 37776 waits is only about 0.58 ms per wait on average.

  17. Amos says:

    Hi Tanel,
    Thanks for your reply.
    Do you know your 2010 US seminars schedule?
    Happy New Year!
    Amos

  18. Marjan Koloski says:

    Hi Tanel,
    on our production database (Oracle Enterprise Edition 11.1.0.7 64-bit on Windows Server 2003), due to
    bug 6319685 "LGWR posts do not scale on some platforms", we enabled the fix for this bug
    by setting the undocumented parameter "_FG_SYNC_SLEEP_USECS"=1. Once we enabled this parameter the problem with log file sync was solved, but now we have the wait event SQL*Net message to client at 20% of the total wait time.
    How is the SQL*Net message to client wait event connected with the parameter _FG_SYNC_SLEEP_USECS? How can we tune this?
    Thanks
    Marjan

  19. Tanel Poder says:

    @Marjan Koloski
    Hi Marjan,

    Can you run my Snapper script ("snapper out 10 1 &sid") on one of these sessions and paste me the output, so I'd have a better overview of what's going on? Perhaps the problem isn't the SQL*Net message to client wait itself, but an instrumentation bug with this workaround.

  20. Marjan Koloski says:

    It was a false alarm – a wrong presentation in Quest Performance Analysis. In fact, setting this parameter helped remove the log file sync wait events completely and reduced the Oracle CPU utilization by 40%. In the top 5 wait events the redo waits were replaced with network waits :) so we thought we were hitting some side effect of the change. After comparing the AWR reports from before and after the change we realized that there was no change in the network waits.
    Thanks and Best Regards
    Marjan

  21. Luis says:

    Hi Tanel,
    where can I get your Snapper script?

  22. Madhu says:

    Cleared my doubts, love this blog

  23. eintein says:

    You should better study physics in school. Light speed in the vacuum is about 300 000 kilometers per second, so in 1 microsecond it travels about 300 km.

    • Tanel Poder says:

      Mr Einstein,

      I suspect that this is some kind of a joke. I can't imagine that my blog readers can't divide, or don't know the difference between a millisecond and a microsecond. If you're serious, then please do your math again, and if you still get a different result than I did in the blog article, let us know :-)

  24. Sanjay says:

    Hi Tanelp
    We have an Oracle 11g database with a number of connections from different OS users. The connections from one particular OSUSER are showing "SQL*Net message from client" and INACTIVE status.

    Also, all the sessions go from ACTIVE to INACTIVE status within 1 minute. I also observe that the wait events for a particular session cycle through:

    SQL*Net message from client INACTIVE
    db file parallel read ACTIVE
    db file sequential read ACTIVE
    SQL*Net message from client INACTIVE.

    So what exactly is going on in the database? Is it a connection problem on the application side or a problem with the database? Also, is it fetching data or not?

    I couldn't get any sql_text during the active state either. I also couldn't get any output from:

    SELECT sid, to_char(start_time,'DD/MM hh24:mi:ss') stime,
           message, sofar, totalwork, user, (sofar/totalwork) * 100 as percent
    FROM v$session_longops
    WHERE (sofar/totalwork) < 1
    AND sid = &sidl

    Please explain the behavior of the session.

    Regards
    Sanjay

    • Tanel Poder says:

      "SQL*Net message from client" with INACTIVE status means that the session is idle; it hasn't been sent any more work to do by the application. Check out my Snapper script (search for snapper.sql) to find out what your sessions are doing:

      @snapper all 5 1 &sid

      • Sanjay says:

        Thank you very much!!! But I'm not able to load snapper.sql in the vi editor – the editor is not taking that many lines.

        Thanks and regards

        Sanjay

  25. rainer stenzel says:

    Just had an issue where trace_level_server=support drastically degraded response time over SQL*Net, showing up as "SQL*Net message to client" waits – so be careful not to blame the network prematurely.

    • Tanel Poder says:

      Not sure if you actually read my blog entry – it explicitly explains that this wait event has nothing to do with your actual network latency. But yes, just as your SQL gets slower due to the extra instrumentation and tracing overhead when you enable SQL trace, your network wait events may take longer if you enable SQL*Net tracing. Or run strace/truss on a process and it will get slower. This has nothing to do with the actual wait event; tracing just slows things down.
