Re: [Discussion] add another send limiting factor to triage (patch)

To: Matt Mathis <mathis@xxxxxxx>
Subject: Re: [Discussion] add another send limiting factor to triage (patch)
From: Guohan Lu <lguohan@xxxxxxxxx>
Date: Sun, 06 Jan 2008 20:52:01 +0800
Hello,

Matt Mathis wrote:
> We considered something along this line, but it didn't make it into the
> code. First notice that there are actually several different reasons why
> the sender may not be able to send including Nagle, and sender side
> Silly Window Syndrome avoidance.  In principle all of these could have
> their own instruments.   The case that you are interested in will prove
> to be a little tricky to instrument because tcp_output never checks
> buffer space: it just checks for available data.  Figuring out why there
> is not more data would require duplicating some of the buffer space
> logic from the socket code.  Not impossible, but a bit ugly.
I added a few lines (fewer than 10) to identify the sender-buffer-limited
case. The basic idea is that when TCP copies data from the application into
the socket buffer, it checks whether the buffer is full, and if so it sets
the SOCK_NOSPACE flag on the socket. When the socket has this flag set and
sk_send_head is NULL, sending must be limited by the sender buffer. The
patch is attached, with comments included in it.
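
To make the decision rule concrete, here is a rough user-space sketch of
the classification. It is only a model: struct sock_model, its two fields,
the SNDLIM_* names and classify_sender_limit() are hypothetical stand-ins
for sk_send_head, SOCK_NOSPACE and the WC_SNDLIM_* states, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the rule; not the kernel structures. */
enum sndlim_state {
    SNDLIM_SENDER,  /* application has no more data to send */
    SNDLIM_SNDBUF,  /* send buffer full and everything queued is already sent */
    SNDLIM_OTHER    /* unsent data is still queued, so something else limits */
};

struct sock_model {
    bool has_unsent_data;  /* stands in for sk_send_head != NULL */
    bool nospace_flag;     /* stands in for SOCK_NOSPACE being set */
};

/* Split the "sender limited" case into application vs. send buffer. */
static enum sndlim_state classify_sender_limit(const struct sock_model *sk)
{
    if (sk->has_unsent_data)
        return SNDLIM_OTHER;
    return sk->nospace_flag ? SNDLIM_SNDBUF : SNDLIM_SENDER;
}
```

In this model a bulk transfer with a full buffer (no unsent data, flag set)
comes out as SNDLIM_SNDBUF, while an idle interactive session (no unsent
data, flag clear) comes out as SNDLIM_SENDER.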

I also ran two simple tests: an FTP transfer that is sender-buffer limited,
and a telnet session that is application limited. In the first case, the
sndbuf-limited state accounts for most of the connection time. In the
second case, 0% of the time is limited by the sender buffer.
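
For reference, a percentage like the ones above can be derived from the
per-state SndLimTime counters roughly as follows. This is a sketch under
assumptions: struct sndlim_times and sndbuf_share_percent() are hypothetical
names, the caller is assumed to have already read the counters, and only
the four states shown are counted.

```c
#include <stdint.h>

/* Hypothetical snapshot of the per-state SndLimTime counters. */
struct sndlim_times {
    uint32_t sender;  /* SndLimTimeSender */
    uint32_t sndbuf;  /* SndLimTimeSndbuf (added by this patch) */
    uint32_t cwnd;    /* SndLimTimeCwnd */
    uint32_t rwin;    /* SndLimTimeRwin */
};

/* Percentage of the accounted time spent sndbuf limited; 0 if no time yet. */
static double sndbuf_share_percent(const struct sndlim_times *t)
{
    /* Widen before summing so four 32-bit counters cannot overflow. */
    uint64_t total = (uint64_t)t->sender + t->sndbuf + t->cwnd + t->rwin;

    if (total == 0)
        return 0.0;
    return 100.0 * (double)t->sndbuf / (double)total;
}
```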

To try it, first apply web100 2.5.18 to a vanilla Linux 2.6.23 tree, then
apply this patch.

> But notice that nearly all sender side bottle necks manifest themselves
> the same way: if you run out of CPU, or spend too much time waiting for
> SMP locks, etc, then the application itself can't run to keep the socket
> full.  Thus except for a few TCP specific algorithms (e.g. Nagle) the
> sender always stops for "not enough data", and never stops for anything
> more specific.
I agree. To restate: when sk_send_head is NULL, sending must be limited by
the sender. When sk_send_head is NULL and SOCK_NOSPACE is also set, it is
the sender buffer that limits sending.

-Guohan

> 
> Thanks,
> --MM--
> -------------------------------------------
> Matt Mathis      http://www.psc.edu/~mathis
> Work:412.268.3319    Home/Cell:412.654.7529
> -------------------------------------------
> Evil is defined by mortals who think they know
> "The Truth" and use force to apply it to others.
> 
> On Wed, 2 Jan 2008, Guohan Lu wrote:
> 
>> Hello,
>>     I am working on a project to identify TCP throughput limiting factors.
>> As we know, web100's triage statistics can provide three throughput
>> limiting factors for a TCP connection, i.e. sender, receiver window and
>> cwnd. Currently, I am thinking of dividing the sender limit factor into two
>> subgroups: one is the application limit (i.e. the application has no more
>> data to send), and the other is the sender buffer limit (i.e. the sender
>> buffer is full). The difference between the two is that in the second case
>> the application continuously generates data to send, but the socket buffer
>> is full and all of it has been sent but not yet acknowledged. Typical
>> applications are telnet and BitTorrent for the first case, and FTP for the
>> second.
>>    I am wondering if anyone on this mailing list has had this idea too, and
>> if I can get any advice from the list first. I am browsing web100's
>> code, e.g. tcp_write_xmit().
>>
>> best,
>>
>> Guohan Lu
>>
>> **********************************
>> Guohan Lu, Ph.D Candidate
>> Dept. of E.E., Tsinghua University,
>> Beijing, China
>> 86-10-62792515
>> **********************************
>>
>> _______________________________________________
>> Discussion mailing list
>> Discussion@xxxxxxxxxx
>> http://internal.web100.org/mailman/listinfo/discussion
>>
> 

--- linux-2.6.23/include/net/web100_stats.h     2008-01-06 19:20:34.000000000 +0800
+++ linux-2.6.23-web100/include/net/web100_stats.h      2008-01-06 19:12:54.000000000 +0800
@@ -31,6 +31,7 @@
 enum wc_sndlim_states {
        WC_SNDLIM_NONE = -1,
        WC_SNDLIM_SENDER,
+       WC_SNDLIM_SNDBUF,
        WC_SNDLIM_CWND,
        WC_SNDLIM_RWIN,
        WC_SNDLIM_STARTUP,
--- linux-2.6.23/fs/proc/web100.c       2008-01-06 19:20:34.000000000 +0800
+++ linux-2.6.23-web100/fs/proc/web100.c        2008-01-06 19:14:10.000000000 +0800
@@ -1217,6 +1217,9 @@ add_var(web100_file_lookup(ino), #name,
        ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTransSender", WEB100_TYPE_COUNTER32, SndLimTrans[WC_SNDLIM_SENDER]);
        ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimBytesSender", WEB100_TYPE_COUNTER64, SndLimBytes[WC_SNDLIM_SENDER]);
        ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTimeSender", WEB100_TYPE_COUNTER32, SndLimTime[WC_SNDLIM_SENDER]);
+       ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTransSndbuf", WEB100_TYPE_COUNTER32, SndLimTrans[WC_SNDLIM_SNDBUF]);
+       ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimBytesSndbuf", WEB100_TYPE_COUNTER64, SndLimBytes[WC_SNDLIM_SNDBUF]);
+       ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTimeSndbuf", WEB100_TYPE_COUNTER32, SndLimTime[WC_SNDLIM_SNDBUF]);
        ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTransCwnd", WEB100_TYPE_COUNTER32, SndLimTrans[WC_SNDLIM_CWND]);
        ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimBytesCwnd", WEB100_TYPE_COUNTER64, SndLimBytes[WC_SNDLIM_CWND]);
        ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTimeCwnd", WEB100_TYPE_COUNTER32, SndLimTime[WC_SNDLIM_CWND]);
--- linux-2.6.23/net/ipv4/tcp_output.c  2008-01-06 19:20:34.000000000 +0800
+++ linux-2.6.23-web100/net/ipv4/tcp_output.c   2008-01-06 20:24:17.000000000 +0800
@@ -1473,8 +1473,13 @@ static int tcp_write_xmit(struct sock *s
                tcp_minshall_update(tp, mss_now, skb);
                sent_pkts++;
        }
-       if (why == WC_SNDLIM_NONE)
-               why = WC_SNDLIM_SENDER;
+       if (why == WC_SNDLIM_NONE) {
+               if (sk->sk_socket &&
+                   test_bit(SOCK_NOSPACE, &sk->sk_socket->flags))
+                       why = WC_SNDLIM_SNDBUF;
+               else
+                       why = WC_SNDLIM_SENDER;
+       }
        WEB100_UPDATE_FUNC(tp, web100_update_sndlim(tp, why));
 
        if (likely(sent_pkts)) {
--- linux-2.6.23/net/ipv4/tcp.c 2008-01-06 19:20:34.000000000 +0800
+++ linux-2.6.23-web100/net/ipv4/tcp.c  2008-01-06 20:19:49.000000000 +0800
@@ -604,6 +604,11 @@ new_segment:
 
 wait_for_sndbuf:
                set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+               /* sk_send_head is NULL, it means all skb has been sent out. 
+                * Therefore, if there were still free spaces in sender buffer,
+                * the packet is also very likely to be sent. */
+               if (!sk->sk_send_head)
+                       WEB100_UPDATE_FUNC(tp, web100_update_sndlim(tp, WC_SNDLIM_SNDBUF));
 wait_for_memory:
                if (copied) {
                        tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
@@ -854,6 +859,11 @@ new_segment:
 
 wait_for_sndbuf:
                        set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+                       /* sk_send_head is NULL, it means all skb has been sent out.
+                        * Therefore, if there were still free spaces in sender buffer,
+                        * the packet is also very likely to be sent. */
+                       if (!sk->sk_send_head)
+                               WEB100_UPDATE_FUNC(tp, web100_update_sndlim(tp, WC_SNDLIM_SNDBUF));
 wait_for_memory:
                        if (copied) {
                                tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
_______________________________________________
Discussion mailing list
Discussion@xxxxxxxxxx
http://internal.web100.org/mailman/listinfo/discussion