Skip to content

Commit

Permalink
tcp: improve undo on timeout
Browse files Browse the repository at this point in the history
Upon timeout, undo (via both timestamps/Eifel and DSACKs) was
disabled if any retransmits were still in flight.  The concern was
perhaps that spurious retransmission sent in a previous recovery
episode may trigger DSACKs to falsely undo the current recovery.

However, this inadvertently misses undo opportunities (using either
TCP timestamps or DSACKs) when timeout occurs during a loss episode,
i.e.  recurring timeouts or timeout during fast recovery. In these
cases some retransmissions will be in flight but we should allow
undo. Furthermore, we should only reset undo_marker and undo_retrans
upon timeout if we are starting a new recovery episode. Finally,
when we do reset our undo state, we now do so in a manner similar
to tcp_enter_recovery(), so that we require a DSACK for each of
the outstsanding retransmissions. This will achieve the original
goal by requiring that we receive the same number of DSACKs as
retransmissions.

This patch increases the undo events by 50% on Google servers.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
Yuchung Cheng authored and David S. Miller committed Aug 23, 2014
1 parent a7d5f58 commit 989e04c
Show file tree
Hide file tree
Showing 2 changed files with 12 additions and 16 deletions.
2 changes: 1 addition & 1 deletion include/linux/tcp.h
Original file line number Diff line number Diff line change
Expand Up @@ -276,7 +276,7 @@ struct tcp_sock {
u32 retrans_stamp; /* Timestamp of the last retransmit,
* also used in SYN-SENT to remember stamp of
* the first SYN. */
u32 undo_marker; /* tracking retrans started here. */
u32 undo_marker; /* snd_una upon a new recovery episode. */
int undo_retrans; /* number of undoable retransmissions. */
u32 total_retrans; /* Total retransmits for entire connection */

Expand Down
26 changes: 11 additions & 15 deletions net/ipv4/tcp_input.c
Original file line number Diff line number Diff line change
Expand Up @@ -1888,21 +1888,21 @@ static inline void tcp_reset_reno_sack(struct tcp_sock *tp)
tp->sacked_out = 0;
}

static void tcp_clear_retrans_partial(struct tcp_sock *tp)
void tcp_clear_retrans(struct tcp_sock *tp)
{
tp->retrans_out = 0;
tp->lost_out = 0;

tp->undo_marker = 0;
tp->undo_retrans = -1;
tp->fackets_out = 0;
tp->sacked_out = 0;
}

void tcp_clear_retrans(struct tcp_sock *tp)
static inline void tcp_init_undo(struct tcp_sock *tp)
{
tcp_clear_retrans_partial(tp);

tp->fackets_out = 0;
tp->sacked_out = 0;
tp->undo_marker = tp->snd_una;
/* Retransmission still in flight may cause DSACKs later. */
tp->undo_retrans = tp->retrans_out ? : -1;
}

/* Enter Loss state. If we detect SACK reneging, forget all SACK information
Expand All @@ -1925,18 +1925,18 @@ void tcp_enter_loss(struct sock *sk)
tp->prior_ssthresh = tcp_current_ssthresh(sk);
tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
tcp_ca_event(sk, CA_EVENT_LOSS);
tcp_init_undo(tp);
}
tp->snd_cwnd = 1;
tp->snd_cwnd_cnt = 0;
tp->snd_cwnd_stamp = tcp_time_stamp;

tcp_clear_retrans_partial(tp);
tp->retrans_out = 0;
tp->lost_out = 0;

if (tcp_is_reno(tp))
tcp_reset_reno_sack(tp);

tp->undo_marker = tp->snd_una;

skb = tcp_write_queue_head(sk);
is_reneg = skb && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED);
if (is_reneg) {
Expand All @@ -1950,9 +1950,6 @@ void tcp_enter_loss(struct sock *sk)
if (skb == tcp_send_head(sk))
break;

if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_RETRANS)
tp->undo_marker = 0;

TCP_SKB_CB(skb)->sacked &= (~TCPCB_TAGBITS)|TCPCB_SACKED_ACKED;
if (!(TCP_SKB_CB(skb)->sacked&TCPCB_SACKED_ACKED) || is_reneg) {
TCP_SKB_CB(skb)->sacked &= ~TCPCB_SACKED_ACKED;
Expand Down Expand Up @@ -2671,8 +2668,7 @@ static void tcp_enter_recovery(struct sock *sk, bool ece_ack)
NET_INC_STATS_BH(sock_net(sk), mib_idx);

tp->prior_ssthresh = 0;
tp->undo_marker = tp->snd_una;
tp->undo_retrans = tp->retrans_out ? : -1;
tcp_init_undo(tp);

if (inet_csk(sk)->icsk_ca_state < TCP_CA_CWR) {
if (!ece_ack)
Expand Down

0 comments on commit 989e04c

Please sign in to comment.