SQL*Net message to client vs SQL*Net more data to client

After my last post about the SQL*Net message to client wait event, I got a follow-up question about the difference between the SQL*Net message to client and SQL*Net more data to client wait events. I’ll post the answer here:

The first session data unit (SDU) buffer’s worth of return data for a call is written to the TCP socket buffer under the “SQL*Net message to client” wait event.

If Oracle needs to write more data for a call than fits into that first SDU buffer, the further writes for that call are done under the “SQL*Net more data to client” event.

So, whether and how much of the “SQL*Net more data to client” vs “SQL*Net message to client” waits you see depends on two things:

  • Amount of data returned to client per call
  • Oracle Net SDU size
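As a back-of-the-envelope sketch of how these two factors interact (this is NOT Oracle’s actual algorithm, and the 8192-byte SDU size below is just an illustrative assumption), one could estimate the event mix per call like this:

```python
import math

# Hypothetical model: the first SDU-sized write for a call is accounted under
# "SQL*Net message to client" and each further SDU-sized write for the same
# call under "SQL*Net more data to client".
def sdu_writes(bytes_per_call: int, sdu_size: int = 8192) -> dict:
    """Estimate how many socket writes a call's return data needs."""
    total = max(1, math.ceil(bytes_per_call / sdu_size))
    return {
        "message to client": 1,            # first SDU buffer
        "more data to client": total - 1,  # remaining SDU buffers, if any
    }

# A tiny fetch (e.g. arraysize 1) fits into one SDU: no "more data" writes.
print(sdu_writes(500))     # {'message to client': 1, 'more data to client': 0}
# A large fetch (e.g. arraysize 5000) spans several SDU buffers.
print(sdu_writes(60000))   # {'message to client': 1, 'more data to client': 7}
```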

A quick test with my Snapper performance tool illustrates this:

In one session, run the following:

SQL> set arraysize 1
SQL> select rownum from dba_source;

...many rows returned...

And I measure the waits in the other session:

SQL> @snapper out,gather=w 5 1 141

-- Session Snapper v1.06 by Tanel Poder ( http://www.tanelpoder.com )

HEAD,     SID, SNAPSHOT START   ,  SECONDS, TYPE, STATISTIC                               ,         DELTA,      D/SEC,     HDELTA,     HD/SEC
DATA,     141, 20080210 14:59:55,        5, WAIT, SQL*Net message to client               ,          6750,       1350,     6.75ms,     1.35ms
DATA,     141, 20080210 14:59:55,        5, WAIT, SQL*Net message from client             ,       4668258,     933652,      4.67s,   933.65ms
--  End of snap 1

No “SQL*Net more data to client” waits show up above because, with the arraysize = 1 setting, Oracle fetches only 2 rows at a time. Therefore the result set returned per fetch call always fits into the first SDU-sized packet. Note that the reason Oracle fetches 2 rows when arraysize is 1 is an OCI optimization which tries to be clever and proactively detect the end-of-resultset condition when fetching a single row – that way there’s a chance to automatically cancel a cursor and release its pins without an explicit cursor cancel or close call. This should have a positive effect on reducing shared pool fragmentation in some conditions.

Anyway, now I set arraysize to 5000 and run the same SQL again:

SQL> set arraysize 5000
SQL> select rownum from dba_source;

...many rows returned...

And the Snapper output is below:

SQL> @snapper out,gather=w 5 1 141

-- Session Snapper v1.06 by Tanel Poder ( http://www.tanelpoder.com )

HEAD,     SID, SNAPSHOT START   ,  SECONDS, TYPE, STATISTIC                               ,         DELTA,      D/SEC,     HDELTA,     HD/SEC
DATA,     141, 20080210 15:00:35,        5, WAIT, SQL*Net message to client               ,            32,          6,       32us,      6.4us
DATA,     141, 20080210 15:00:35,        5, WAIT, SQL*Net more data to client             ,          2925,        585,     2.93ms,      585us
DATA,     141, 20080210 15:00:35,        5, WAIT, SQL*Net message from client             ,       4861859,     972372,      4.86s,   972.37ms
--  End of snap 1

Now we see “SQL*Net more data to client” waits as well, as the 5000 rows returned for each fetch call just don’t fit into a single SDU buffer.

I’ll reiterate that both SQL*Net message to client and SQL*Net more data to client waits only record the time it took to write the return data from Oracle’s userland SDU buffer into the OS kernel-land TCP socket buffer. That’s why the wait times are only microseconds. As a result, all of the time a TCP packet spends “flying” towards the client is actually accounted under the SQL*Net message from client wait statistic. The problem, though, is that we don’t know how much of this time was spent on the wire and how much of it was application think time.
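A small illustration of why these writes are so cheap (this is generic socket behavior, not Oracle code – the names and sizes below are made up for demonstration):

```python
# A send() into a local socket just copies bytes into the kernel buffer and
# returns in microseconds; the payload's actual transit is not timed by the
# sender at all. This is why "message to client" style waits look tiny.
import socket
import time

server, client = socket.socketpair()   # stand-ins for server process and client
payload = b"x" * 1024                  # one small "SDU buffer" worth of data

t0 = time.perf_counter()
server.sendall(payload)                # only copies into the kernel buffer...
send_us = (time.perf_counter() - t0) * 1e6

data = client.recv(2048)               # ...the "flight" + read happens later
server.close()
client.close()

print(f"send() returned in ~{send_us:.0f} us for {len(data)} bytes")
```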

Therefore, unless you’re going to buy a tool that can interpret TCP ACK echo timestamps, you need to measure network latency using application-side instrumentation.



  1. Polarski bernard says:

Really very instructive. Basically it means that we can do little with all the ‘waits’ related to messages to clients, since we see only the local and easy part of the transfer.

Can we infer that the 4.67 and 4.86 seconds reported as ‘FROM client’ are in fact clock time during which the local TCP/IP stack is busy transferring the data it just received from Oracle TO the remote client, while at the same time Oracle claims the client is doing nothing? If so, it is a wonderful example of deceptive information.

  2. tanelp says:

    Yes, the FROM client wait includes the time the server TCP stack is sending reply back to client + the TCP packets flying time towards client + client think time + TCP packets flyback time + time it takes to wake up the Oracle server process once the next packet from client has arrived.

  3. Kyle says:

Cool stuff! I always wondered why the network wait events seemed to never indicate network problems.
Here’s a question – how about “SQL*Net more data from client”? I think in this case the shadow process actually is waiting for a response/data from the client, which could be due to network problems or due to something wacky on the client side.

  4. tanelp says:

    Hi Kyle,

    “SQL*Net more data from client” happens when there’s a large amount of data sent from client (or other database in case of dblinks), which doesn’t fit into single SDU size Oracle packet. The server process knows that the call hasn’t ended and there is more data/packets to come before the call ends.

  5. tanelp says:

..and the “large amount of data” may actually mean a few kB only – these waits appear whenever the whole call’s data doesn’t fit into one single SDU-size packet.

    • Sri says:

Hi Tanel, I too face the same issue while extracting huge data volumes over multiple links, and a DBA suggested setting the array size (to a max of 5000). How will it solve the issue if I use this setting? Can you please help?

      • Tanel Poder says:

Do you mean database links? Oracle automatically sets the arraysize to 32767 for database link traffic between two database instances. Or do you mean queries that return data back to some application (not between two dblink endpoints)? In any case, the optimization steps for applications should be done in this order:

        1) Increase arraysize for the applications fetching a lot of rows
        2) Increase the TCP send/receive buffer sizes (and make sure you have plenty of network bandwidth)
        3) Optimize the SDU size

If you don’t fix #1, then there’s no point in fixing #2 or #3, as when you’re fetching a single row per roundtrip, it’s the network latency that kills you, not throughput limitations. If you don’t fix #1 and #2, there’s not much point in fixing #3. People tend to start from the wrong end.
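A quick illustration of why step 1 dominates – with a tiny arraysize the roundtrip latency term swamps everything else (the rows, arraysize and latency numbers below are made up purely for illustration):

```python
# Time spent purely on network roundtrips for a fetch-only workload:
# each fetch call costs one roundtrip, regardless of SDU or buffer tuning.
def fetch_time_s(rows: int, arraysize: int, rtt_ms: float) -> float:
    roundtrips = -(-rows // arraysize)   # ceiling division
    return roundtrips * rtt_ms / 1000.0

# Fetching 1 million rows over a 2 ms link:
print(fetch_time_s(1_000_000, 1, 2))     # 2000.0 s: completely latency-bound
print(fetch_time_s(1_000_000, 5000, 2))  # 0.4 s: latency no longer matters
```

Only once the roundtrip count is down does tuning the buffer sizes or SDU start to show up in the response time.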

  6. khailey says:

Exactly – so this one particular wait event, “SQL*Net more data from client”, will include the network time along with some processing time from the user tool process, so it can indicate either network latency or a problem in the tool. The only problem, I guess, is that there isn’t much way to know whether the slowdown is because of the net or a problem in the tool.
Again – thanks tons for this analysis – super cool!!

  7. Bob says:


I came across your blog entry while researching a performance issue I’m having after moving to a new server. I’m trying to figure out why reports take longer to run on my machine versus the report developer’s machine. There’s a significant difference between the dev’s machine and other users’: a report takes 10 secs on the dev’s machine and up to 2 mins on others.
Using Toad, I captured the session waits for the dev’s (her) session and mine.

What I noticed was a significant difference in the time_waited column. The wait counts for “more data to client”, “message from client” and “message to client” were the same for both sessions; it was just the wait times that were way off.

    Her Wait times
    More data to client – 48
    Message from client – 1037

    My wait times
    More data to client – 3280
    Message from client – 7243

    These differences are fairly consistent. We are running the same report, the same way, and returning the same dataset.

Can you suggest what might cause this kind of discrepancy, or how I can go about further diagnosing this?

  8. tanelp says:

    Hi Bob,

There could be various reasons – one is just different network latency at different sites, another is different Oracle client libraries in use, which allow different optimizations in the network traffic.

Can you run session Snapper on both sessions and post the output? Then I can help further. I’m interested in a few more stats like “bytes sent via SQL*Net to client” and “SQL*Net roundtrips to/from client”, and the full wait stats for these sessions.

  9. Bob says:

Sorry it’s taken so long to get back to you. I never got the Snapper code run. However, we have resolved the issue. After much troubleshooting, gnashing of teeth, and yanking of hair, we found out it was a negotiation issue between two switches.

Our server has a gig NIC in it and was connected to a gig switch. Call it Switch A. Switch A was connected to a wall jack, which led to a patch panel. At the patch panel, Switch A was routed to an input port on Switch B, which was a 100-meg switch. We finally traced this path and noticed a fault light blinking on Switch B. We took our production system down for 30 mins. We routed the new server directly to the wall jack and to Switch B. A file copy between client and server immediately caused a fault light on Switch B. So, we went into the server and set the gig NIC to 100 meg full duplex. The file copy was blindingly fast with no fault light. We brought the database back up and allowed users to begin work. System performance is exactly what I was hoping for on the new server. My 1 min 16 sec report took less than 2 secs.

    The reason why our developer machine was so fast was that she was connected to Switch B. If we moved her to anywhere but Switch B, she got the same poor performance as everyone else.

    So, problem appears to be resolved.


  10. tanelp says:

Yup, that’s a classic! :) I’ve seen quite a few weird network performance issues which eventually turned out to be failed link-speed autonegotiation problems.

    For the same reason, some network & sysadmins even have a practice of always disabling autonegotiation…

  11. Kyle Hailey says:

For the autonegotiation problem, I would have thought that all the negotiation time would have been spent in the send and have nothing to do with the packing, but if “SQL*Net more data to client” is just the packing and “SQL*Net message to client” is the transfer time (in this case), this doesn’t seem to be the case, because both times slow down. Makes me wonder if “SQL*Net more data to client” doesn’t somehow include part of the transfer time…

  12. tanelp says:

    Hi Kyle,

Note that Bob mentioned events from two different classes: “message FROM client” and “more data TO client”.

“SQL*Net more data to client” is supposed to be like “SQL*Net message to client”, but only for large responses which span more than one SQL*Net packet. The first packet is still sent back “under” the message to client wait event, and subsequent packets for that call under more data to client.

    Unfortunately the “SQL*Net message to client” time is not accounted properly in Oracle, so it can’t be used for calculating any transfer time.

“SQL*Net more data to client” is a little different, as at least it looks like it’s instrumented properly (e.g. if the socket write call blocks, that time is accounted).

There are still some unclear pieces, and some research I want to do on this once I get time. When I do, I’ll post the results :)

  13. Hi Tanel,

Three weeks ago I finished configuring an Oracle Database 11g (64-bit) on Linux (4 GB RAM), and after some tests I noticed slow SQL response times. So, I thought… there is something strange happening here:
I ran a test on the local server, exporting a table T1 (using the traditional exp utility) that contains around 1 million rows. The export operation took less than 10 seconds to complete.

After that, I performed the same export operation from a remote machine. By the way, the remote machine and the Oracle server are located on the same network.

To my surprise, the export operation took around 1 hour to complete!! I’m not sure where the bottleneck is. Do you believe there is contention or something related to the network layer? Apparently, the network is fine…

Below, I attached some statistics gathered while running the export from the remote machine. I noticed high values in some of them…

    IO – v$sess_io
    block_gets = 3
    consistent_gets = 52,730
    physical_reads = 20,027

    WAITS – v$session_event
    SQL*Net message from client > total_waits = 25,635 time_waited = 306,000
    SQL*Net message to client > total_waits = 25,635 time_waited = 2
SQL*Net more data to client > total_waits = 97 time_waited = 0

    STATISTICS – v$sesstat
    bytes sent via SQL*Net to client = 146,167,218
    bytes received via SQL*Net from client = 299,577
    SQL*Net roundtrips to/from client = 26,048


    FROM “SCOTT”.”T1″


  14. Tanel Poder says:

    Hi Eduardo,

Note that your session had made 26,048 roundtrips since session startup. Now if every roundtrip takes, let’s say, 100 ms, then you’ll have ~2,600 seconds of response time there. In fact, in your case the time spent waiting for the client (next fetch) was ~3,000 seconds already. So it’s the network latency combined with the high number of fetches/network roundtrips.

How did you perform the export? Make your export buffer size larger – this should reduce the number of network roundtrips and the response time.
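The arithmetic above can be sketched out from the stats posted earlier (assuming, per the Oracle reference, that v$session_event reports TIME_WAITED in centiseconds):

```python
# Stats from the earlier comment: "SQL*Net message from client" waits.
time_waited_cs = 306_000      # TIME_WAITED, in centiseconds
total_waits = 25_635          # TOTAL_WAITS ~ network roundtrips

total_seconds = time_waited_cs / 100            # centiseconds -> seconds
avg_ms = time_waited_cs * 10 / total_waits      # centiseconds -> milliseconds

print(total_seconds)          # 3060.0 seconds spent waiting on the client
print(round(avg_ms, 1))       # ~119.4 ms average per roundtrip
```

That average latency times 26,048 roundtrips is what turns a 10-second local export into an hour-long remote one.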

  15. Hi Tanel,

Thanks for the response. In fact, I have never faced this kind of problem before. I’m trying to understand why the export operation ran so fast on the local server, and why the export from a remote machine took so long. 2 or 3 minutes would be acceptable, but 1 hour? For 5 years we have been working with Oracle 10g installed on another server, and this kind of problem has never happened before. The development team has been taking exports (exp) and imports (imp) using default buffer sizes since then, without problems:

exp user/password@service file=file.dmp …

The network analyst of the company ran some tests using the “scp” Linux command to copy big files from the new Oracle 11g server to another server and vice versa, without problems. The copy speed was as expected (10 MB/s) on a 100 Mbit/s network.

The SDU/TDU SQL*Net configuration is using the default values. The MTU of the LAN is fine too (1500 bytes).

So, I’m not sure how to find out what might be causing this delay or contention. Maybe by looking at Linux kernel network parameters? Maybe by running a test… like connecting a direct crossover cable between the server and a client machine?

Well, next Monday I will run more tests…


  16. Tanel Poder says:

I don’t know what numbers you normally see, but the numbers you sent earlier show that your exp had done 26,000 SQL*Net roundtrips, which on average took over 10 ms each. Note that the SQL*Net message from client wait time includes the following:

    1) time the result network packets are “flying” towards the client
2) client application think time – this is important, as if you use some pipe for on-the-fly compression, the think time may become even longer
    3) time the next request packet is “flying” towards the server

  17. Tanel Poder says:

Btw, some misconfigured VLANs can drive up latency even in local area networks.

  18. Tanel Poder says:

    While doing your test, also run “tnsping DBNAME 999” so you’d get some latency figures (for small packets)

If you think that everything is configured normally and there is no significant network latency nor application think time, then you can look deeper – dump the network packets at the client with tcpdump (if on Linux) and analyze part of the transmission with Wireshark (www.wireshark.org). This will give you the SDU/TDU sizes, resends, packet reassemblies and timestamps you need for finding where the time is going, IF it’s a network transport issue.

  19. Hi Tanel,

    Thanks for the tips, I’ll check it out …


  20. Hi Tanel,

    I only got acceptable export speed when I changed the default buffer size to a high value:

    exp user/password buffer=100000000

So, do you think there is some OS kernel parameter that I should review?


  21. shenglin.du says:

    Hi Tanel,

I hit a network issue on May 26th. PD complained about the slowness of their jobs at that time. I checked the running sessions using v$session_event and found most of the time was spent waiting for ‘SQL*Net more data to client’ and ‘virtual circuit status’ (we use MTS). Using stats$system_event (we use Statspack with 15-minute sampling), I found the average wait time (TIME_WAITED_MICRO/TOTAL_WAITS) of ‘SQL*Net more data to client’ increased from about 200 to about 600. However, the average wait time of ‘SQL*Net message to client’ was always 4–5. Finally, I found the root cause was a network issue caused by a failed switch.

    My questions are,

1. Can we use the average wait time of ‘SQL*Net more data to client’ to measure network latency? (It’s not the first time we’ve used this to detect a network issue.)

2. Why didn’t the average wait time of ‘SQL*Net message to client’ change during the network issue, since ‘SQL*Net more data to client’ is just used to send the data beyond the first SDU buffer, as you stated above?

PS. the system event stats for ‘SQL*Net more data to client’:

SNAP_TIME        TIME_WAITED_MICRO TOTAL_WAITS AVG_WAIT_MICRO
2010-05-25 01:00        1128567822     4496478      250.98929
2010-05-25 23:15        1231169379     4703941     261.731467
2010-05-25 23:30        1282636291     4639954     276.432976
2010-05-25 23:45        1360151005     4496391     302.498383
2010-05-26 00:00         912223796     4210939     216.631919
2010-05-26 00:15        1507247039     5024973     299.951271
2010-05-26 00:30        1524815181     5234298      291.31226
2010-05-26 00:45        2691204366     5108648     526.793853
2010-05-26 01:00        3125441997     4723104      661.73474

the system event stats for ‘SQL*Net message to client’:

SNAP_TIME        TIME_WAITED_MICRO TOTAL_WAITS AVG_WAIT_MICRO
2010-05-25 01:00          15541619     3162845     4.91380988
2010-05-25 23:15          14801067     3046039     4.85911933
2010-05-25 23:30          14479453     2967351      4.8795889
2010-05-25 23:45          14589455     3004230     4.85630428
2010-05-26 00:00          14219425     2924318     4.86247563
2010-05-26 00:15          16295497     3308064     4.92599206
2010-05-26 00:30          18020082     3661416     4.92161557
2010-05-26 00:45          17216179     3507996      4.9076963
2010-05-26 01:00          16622061     3371335     4.93040917

    Thanks for your help on this in advance.


  22. Srini says:

    @Tanel Poder
    In one of your notes on “sql*net message to client” wait events you quoted as below.
    I donot understand about Oracle Net SDU size, can you please give more info on Oracle Net SDU size.

    “So, whether and how much of the “SQL*Net more data to client” vs “SQL*Net message to client” waits you see depends on two things:

    Amount of data returned to client per call
    Oracle Net SDU size “

  23. Damion Reeves says:

    Hi Tanel,

Wondering if you could give me some advice… I just upgraded one of our databases from Oracle EE on an Exadata X2 quarter rack to Oracle on a new Exadata X3 rack.

All seemed good, except for the following query, which now seems to show far more “SQL*Net message from client”, “SQL*Net break/reset to client” and “SQL*Net message to client” wait events than I’d expect.

    The query takes 3 mins 37 sec to return 1823 rows.

I executed the query via SQL*Plus on the same Exadata X3 system where the database resides. So the only client involved is the Exadata system itself.

I also re-ran the query with ALTER SESSION SET EVENTS ‘10046 trace name context forever, level 8’… but the trace file produced doesn’t seem to show any “break” messages like I thought might be the cause, as per one of your previous posts.

    Also, “tnsping CTPROD 999” doesn’t seem to show any large TNS packet delays.

    c.value as object_name,
    cobjects a
    inner join cobject_transactions b on a.cobject_uid=b.cobject_uid and b.most_recent=’Y’
    inner join cobject_attributes c on c.cobject_transaction_uid = b.cobject_transaction_uid
    inner join cobject_types d on d.name_field_uid=c.cobject_attribute_type_uid
    inner join cobject_contacts e on e.cobject_uid=a.cobject_uid
    inner join cbusiness_contacts f on f.business_contact_uid = e.business_contact_uid
    inner join cbusinesses g on g.business_uid=f.business_uid
    inner join business_details h on g.business_uid=h.business_uid and h.most_recent=’Y’
    (role_effective_to is null or role_effective_to>=TO_DATE(’22/5/2013 0:00:00′,’DD/MM/YYYY HH24:MI:SS’))
    order by type_name,object_name;

EVENT                               TOTAL_WAITS TIME_WAITED AVERAGE_WAIT
    SQL*Net message from client 32761 16407504 500.82
    SQL*Net break/reset to client 32758 56 0
    SQL*Net message to client 32762 3 0
    cursor: pin S wait on X 1 1 1.09
    cell multiblock physical read 1 0 .05
    cell single block physical read 81 4 .05
    gc current block 2-way 14 0 .02
    gc cr multi block request 2 0 .02
    gc cr disk read 1 0 .01
    library cache lock 27 0 .02
    row cache lock 498 4 .01
    library cache pin 28 0 .01
    latch: row cache objects 1 0 0
    Disk file operations I/O 2 0 .01
    enq: TM – contention 3 0 .01
    gc cr grant 2-way 37 0 .01



    • Tanel Poder says:

      Can you run snapper on this query when it’s running?

      1) Run the long-running query in one session
      2) In another session run @snapper4 all 10 1 &sid_of_where_you_run_your_query

      This should tell us how much data you send back per roundtrip etc…

  24. Vipen Koul says:

    Very informative article.

How would a DBA, by looking at the database views, find out what ARRAYSIZE/prefetch size is set for a session on a given connection?


    • Tanel Poder says:


One way is to enable SQL trace for the session and look at the r= column in the FETCH lines of the trace.

Another way would be to divide ROWS_PROCESSED by FETCHES in V$SQL for a cursor (this, of course, counts all sessions using that cursor).

And in both cases be aware that “rows processed” may be smaller than the max fetch size just because there were not enough rows left to return for the query… But if you see multiple FETCH lines for the same cursor right next to each other (without intermediate EXEC calls), you can be sure that the fetch buffer got full (at least once), and the r= field shows the requested arraysize…
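The V$SQL arithmetic above can be sketched like this (the values are purely illustrative, not from any real cursor):

```python
# Estimate the effective arraysize from V$SQL, per the method above:
# ROWS_PROCESSED / FETCHES ~ average rows returned per fetch call.
def estimated_arraysize(rows_processed: int, fetches: int) -> float:
    return rows_processed / fetches if fetches else 0.0

# e.g. a cursor that processed 100,000 rows in 20 fetch calls:
print(estimated_arraysize(100_000, 20))   # 5000.0 rows per fetch
```

Remember the caveat above: the last (partial) fetch drags this average below the client’s actual arraysize setting.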

  25. Ryan says:

Your example implies that if the client increases the array size, it could cause wait issues, even if you don’t have much data to return and the amount of data can fit in 1 SDU.

This is v$session_event for a problem session. It looks like all the wait time is in one network pass, so I don’t think SDU is the issue. The amount of bytes in a query is roughly 91.

    So TIME_WAITED=13938 and MAX_WAIT=13621.

    We ran tnsping 999 from the application server and we got 20 ms back.
    This seems to happen repeatedly. We kill the session and then this happens again.
    The network team says there is no packet loss and no network issues.

    Any suggestions on what to do next?

EVENT                       TOTAL_WAITS TIME_WAITED AVERAGE_WAIT   MAX_WAIT
SQL*Net message from client        4524       13938         3.08      13621

    • Tanel Poder says:

Hi Ryan, which statement seems to imply that? I don’t think I’ve said that anywhere. If your result data fits into a single SDU-size TNS packet, you shouldn’t even see the “more data” wait event; you’d see the “SQL*Net message from client” wait event, which means your session is idle, waiting for the next message (command) to arrive from the client application. So it’s the network roundtrip time PLUS the application think time (whatever the app is doing).

In the output you posted here, your database session has been waiting for the next command to arrive from the client application for 13938 seconds in total, 3 seconds on average, and the longest wait has been 13621 seconds. Again, this is (likely) not a network problem, but application think time.

      • Ryan says:

Thanks Tanel.

If this can include application think time, how come we don’t see this event more often? In the past, when applications run a query and then do stuff, I typically get idle SQL*Net waits. In what cases would ‘more data from client’ mean waiting for the application to respond? Is this an ‘ack’ it’s supposed to give back after a TCP request? So if the application server is using a connection (and most are) when it runs the query and gets the data back, does the application then send an ‘ack’ back stating it received the data before moving on to processing?

        • Ryan says:

This follow-up goes beyond the Oracle DB. I’m trying to figure out where the application might be hung up in the stack.

Generally, when the application is slow, you get an idle SQL*Net wait. I rarely ever see ‘SQL*Net more data from client’. My guess is that Oracle is ‘waiting’ for a TCP ack to come back stating that the middle tier received the data, right? I am trying to figure out when this SQL*Net wait changes to the idle SQL*Net wait.

My best guess is that the middle-tier connection pool is hung up somehow. I have seen this happen in WebLogic if traffic gets too high (I don’t know what they are using). Typically when a developer uses a connection pool the following happens:

1. The middle tier opens a limited set of connections to the database and makes them available to the application. This generally happens on startup and is configurable by an administrator. These connections go into a pool of available connections.
2. The application requests one of these connections and ‘opens’ it. The DB connections are already open, so the middle tier is just handing the application a pointer to an existing DB connection.
3. The application executes SQL using this pointer (plus commit/rollback for DML, etc.).
4. The application closes the connection. The ‘close’ does not affect the DB, since the DB connection stays open – it just means the application layer releases the connection back into the middle-tier pool, where another request can grab it.

My best guess is that this connection is not closed (not sure if this matters) and/or the middle tier has yet to send an ‘ack’ back stating that it received the data. Since most of the time we get idle waits during application think time, I am trying to figure out what the ‘more data from client’ wait is waiting for and where it could be hung up in the stack.

One well-known bug happens when developers forget to close connections in the middle tier: the connections are still open in the DB, but the pool of available connections is empty. I am not sure if this leads to ‘SQL*Net more data from client’, but I doubt it – I think it’s an idle SQL*Net wait.

          • Tanel Poder says:

It’s hard to advise without seeing the real numbers (or with only a subset of the full information). In the output you pasted earlier, you had SQL*Net message from client listed; I saw no “SQL*Net more data” wait event there… When your application session is slow, can you run Snapper (@snapper4 all 30 1 &sid) on it and send me the output?
