Subsections


Packet Tailgating Technique

In this section, we describe the tailgating technique and our prototype implementation, (nettimer). We also analyze the advantages and disadvantages of the technique.

The technique is divided into two phases: the sigma phase, which measures the characteristics of the entire path, and the tailgating phase, which measures the characteristics of each link individually.

The purpose of the sigma phase is to measure the non-link-specific quantities described in the previous section: bn - 1 and dn - 1. We do this by sending single packets of different sizes and using linear regression on the minimum delay for each size. We send packets until the confidence of the linear regression exceeds 99%. This is the same technique (described in Section 2) that Jacobson and Downey use. As mentioned before, this step requires many packets. While previous work does this step for every router along a path, we only do it once from the source to the destination to calculate bn - 1 and dn - 1 for (6).

The purpose of the tailgating phase is to compute the remainder of the variables in (6). As mentioned before, the tailgating technique assumes that we can send one packet without queueing and a second packet that queues after the first packet at link lq, but at no later link. We do this by sending the largest possible non-fragmented packet with an IP Time-to-Live (TTL) field of lq immediately followed by the smallest possible packet. The smaller packet almost always (the exception is discussed in Section 4.2) has a lower transmission delay than the larger packet's transmission delay on the next link. This causes the smaller packet (the tailgater) to queue continuously after the larger packet (the tailgated). The TTL for the tailgated packet will cause it to be dropped at link lq, so the tailgater can then continue without queueing to the destination.

As with the sigma phase, we take the minimum of several delay samples of the tailgater packet. We start this process with a TTL of 1 and continue until we reach the destination. We measure closer links first to compute blq - 1 for later links.

In our nettimer implementation, we randomly probe different links until the error for all of them drops below 2%. This value was chosen to balance the time to finish with the accuracy of the results. We calculate the error for a particular link by using the bootstrap method [5]. We randomly re-sample with replacement from the set of delays for a link until we have a new set of samples of 25% of the size of the original. Using this new set, we compute a new minimum delay. We repeat this process 20 times and then compute the variance of all the new delays. The error is this variance divided by the actual minimum delay of the set. These values are selected so that the CPU overhead of the computation is unnoticable at typical network latencies.


Tailgating Complexities

In addition to the basic technique described above, we have to deal with several additional complexities: 1) calculating the maximum inter-packet transmission time, 2) clock skew, 3) using round trip measurements, 4) causing acknowledgements to be sent, 5) detecting that packets were dropped, and 6) dealing with invisible nodes.

Inter-packet Transmission Time

As discussed in 3.2, the tailgated and tailgater packets must be sent such that queueing takes place at the link to be measured. We could guarantee this by sending the packets at the same instant in time, but this is impossible. Here we derive the maximum time we have between each packet transmission so that (6) remains valid. There may be scheduling variations in the source host's operating system such that the deadline sometimes cannot be met, so we want to determine when the deadline is to filter out those measurements.

We want packet k to queue at link lq:

qklq > 0

Using (4),

\begin{displaymath}\begin{split}
\max\left(0, t^{k-1}_{l_q+1} - d_{l_q} - t^k_{l...
...ight) > 0 \\
t^k_{l_q} < t^{k-1}_{l_q+1} - d_{l_q}
\end{split}\end{displaymath}

Using (3) and the assumption that the tailgated and tailgater packets experience no queueing before lq,

\begin{displaymath}\begin{split}
t^k_0 + \sum_{i=0}^{l_{q}-1}\left(\frac{s^k}{b_...
...q}\left(\frac{s^{k-1}}{b_i} + d_i\right)
- d_{l_q}
\end{split}\end{displaymath}

Simplifying,

tk0 - tk - 10 < $\displaystyle {\frac{s^{k-1}}{b_{l_q}}}$ + $\displaystyle {\frac{s^{k-1} -
s^k}{b^{l_q -1}}}$

This means that we might have trouble sending the tailgater packet quickly enough if closer links have high bandwidth. For example, if the tailgated packet is 1500 bytes and the first link has a bandwidth of 100Mb/s, then we must send the tailgater packet within 120 microseconds, which is possible on most machines. However, if the first link is 1Gb/s, then we only have 12 microseconds to send the tailgater. In this case, we can use the regression technique on closer links until we are at a link sufficiently far away that we can meet the transmission deadline.

Clock Skew

Equation 6 uses timing at both the sender tk - 10 and the receiver tkn, which suggests that clock skew could cause error in the calculation. However, we show here that any clock skew is canceled in the calculation.

We model clock skew as time measurements taken at the receiver to be offset by $ \epsilon$:

$\displaystyle \hat{t}^{k}_{n}$ = tkn + $\displaystyle \epsilon$    $\displaystyle \hat{d}^{n-1}_{}$ = dn - 1 + $\displaystyle \epsilon$

Where $ \hat{t}$, $ \hat{d}$, and $ \hat{b}$ are the clock skewed versions of packet arrival time, cumulative link delay, and link bandwidth. The clock skewed version of ( [*]) becomes

$\displaystyle \hat{b}_{l_q}^{}$ = $\displaystyle {\frac{s^{k-1}}{(\hat{t}^k_n + \frac{s^k -
s^{k-1}}{b^{l_q-1}} - \frac{s^k}{b^{n-1}} - t^{k-1}_0 - \hat{d}^{n-1})}}$

Substituting and simplifying,

\begin{displaymath}\begin{split}
\hat{b}_{l_q} & = \frac{s^{k-1}}{(t^k_n + \epsi...
...k}{b^{n-1}} - t^{k-1}_0 - d^{n-1})} \\
& = b_{l_q}
\end{split}\end{displaymath}

So clock skew does not affect the calculation.

Using Round Trip Measurements

Although ( [*]) uses timings at both the sender and the receiver, we can transform it to use only timings at the sender. The idea is to cause the receiver to send an acknowledgement for each tailgater packet that arrives. If we can correlate the tailgater and acknowledgement packets (described in the next section), then we can determine the round-trip properties of the path and use these to calculate the link bandwidths of the forward path.

We define S, T, Q, B, and D to be the packet size, packet arrival time, packet queueing delay, link bandwidth, and link delay of the flow in the reverse of the direction we are interested in. We start with measurements of the arrival times of packets at link 0, the round trip delay (drt), and the round trip bandwidth (brt) and derive a form of ( [*]) containing only these variables.

We know that that the round trip delay is equal to the sum of the delay in the forward direction and the delay in the reverse direction and similarly for bandwidth:

\begin{displaymath}\begin{split}
d^{rt} = d^{n-1} + D^{n-1} \quad \quad \frac{1}...
...{1}{b^{n-1}} = \frac{1}{b^{rt}} - \frac{1}{B^{n-1}}
\end{split}\end{displaymath}

Furthermore, using (3) and assuming that the acknowledgement is sent immediately when the tailgater arrives and experiences no queueing:

\begin{displaymath}\begin{split}
T^k_0 = t^k_n + \sum_{i=0}^{n-1}\left(\frac{S^k...
... \sum_{i=0}^{n-1}\left(\frac{S^k}{B_i} + D_i\right)
\end{split}\end{displaymath}

Substituting into ( [*]) and simplifying,

\begin{displaymath}\begin{split}
b_{l_q} & = \frac{s^{k-1}}{T^k_0 +
\frac{s^k-S^...
...{l_q-1}} - \frac{s^k}{b^{rt}} - t^{k-1}_0 - d^{rt}}
\end{split}\end{displaymath}

We know all the terms of this equation except the inverse sum of the inverse reverse link bandwidths Bn - 1. At this point, we assume that the bandwidths are symmetric ( Bn - 1 = $ {\frac{b^{rt}}{2}}$). To measure asymmetric links, we would have to be able to measure one way delay.

Causing Acknowledgements to be Sent

To get round trip measurements, we must cause the receiver to reply consistently to arriving packets. The nettimer implementation sends TCP FIN packets in both the sigma and tailgate phase. Hosts are required to respond to TCP FIN packets with a TCP RST (reset) packet. Nettimer uses this to measure the round-trip-delay. This is better than using ICMP packets because some routers and hosts block or rate-limit ICMP packets. Similarly, TCP is better than UDP because UDP packets sent to closed ports induce ICMP port unreachable messages which are rate-limited on some systems (e.g. Linux and Solaris).

Dropped Packets

A dropped tailgater is not a problem because it simply means one fewer delay samples. However, a source can only admit a tailgater timing sample if it also received the ICMP TTL-exceeded message for the corresponding tailgated packet. This is because the tailgated packet could have been dropped (perhaps because of congestion) before link lq, and the tailgater packet could have traveled through the network unimpeded, thus giving an abnormally low minimum delay.

Dealing with Invisible Nodes

Finally, to combat the problem of invisible nodes mentioned in Section 2, nettimer gets timing information directly from the kernel using libpcap [12], rather than from the application level. This removes the application-kernel node from the measured path. In addition, we tried to extend the tailgating technique to measure the kernel-NIC node by causing select packets to be dropped in the NIC, but we could not find a way to do this across many different NIC drivers. As with the one-packet-based tools, bridges remain invisible nodes for nettimer.


Tailgating Analysis

In this section we list the advantages and disadvantages of the packet tailgating technique. The advantages of the technique are its speed, unobtrusiveness and robustness compared to the other link bandwidth techniques described in Section 2. The disadvantages of the technique are its need to send packets back-to-back on the first link, its inability to measure a very fast link after a very slow link, the fact that queuing anywhere along the path disrupts the measurement of all links on the path, and the accumulation of errors in the calculation.

Packet tailgating is potentially faster and less obtrusive than the previously discussed techniques because it performs the expensive linear regression step only once for the entire path instead of once for every link. The tailgating step has to be done for every link, but this only requires finding the minimum delay for a pair of packets compared to finding the minimum delay of 16 to 64 different packet sizes. We test this hypothesis using nettimer in the next section.

Packet tailgating is potentially more robust because it can detect multichannel links, does not rely on timely delivery of ICMP packets, and can be run without acknowledgements.

Packet tailgating can measure multi-channel links because, like the packet bunch extension to packet pair, it can send multiple packets to fill the channels. To do this, we send the tailgated packet followed by any number of tailgaters. If we send c + 1 packets where c is the number of channels, then we can cause the last tailgater to queue behind the tailgated packet. We can use this queueing to measure the bandwidth of one of the channels (we assume that all the channels have the same bandwidth). We determine the number of channels by sending variable numbers of tailgaters until we observe that sending c + 1 packets results in significantly higher delay than sending c.

Unlike the techniques described in Section 2.2, packet tailgating does not use ICMP time-exceeded packets from intermediate nodes for measurement. Our nettimer implementation uses these packets for identifying routers and for determining which tailgated packets were prematurely dropped, but it does not rely on their timely delivery. This enables it to run faster in environments where these packets are rate-limited and to run more accurately in environments where they are delivered inconsistently.

In fact, packet tailgating can work without acknowledgments at all from the destination, although our current implementation does not have this feature. By deploying software at the destination host, we can measure the one-way delay of packets. The tailgating source can continue to send later tailgater packets without knowing the delay of earlier tailgater packets. If the earlier delays can be occasionally transmitted back to the source, then the source can adaptively decide when to finish the two stages, but this is an optimization. Otherwise, the tailgater source can just send a fixed number of packets in each stage. Eventually the source and destination must communicate so that the source can specify which tailgated packets were prematurely dropped.

Measuring without acknowledgements avoids queuing in the return path, enabling packet tailgating to be twice as accurate as single-packet techniques. In addition, measuring without acknowledgements avoids ack-implosion on multicast trees, enabling packet tailgating to measure the bandwidth of several links simultaneously on a multicast tree. Single packet techniques used on a multicast tree would cause a flood of acknowledgements to flow back to the source. The queueing of these acks would likely destroy their usefulness for round trip delay measurements.

Packet tailgating also has several limitations compared to previous techniques. The first is that the source must be able to send packets quickly on the first link. This limitation is quantified in the previous section.

The second limitation is that tailgating cannot measure a very fast link after a very slow link. The problem is that the tailgated packet may have finished transmitting on the faster link before the tailgater packet has finished transmitting on the slower link. The typical size of the tailgated and tailgater packets are 1500 and 40 bytes (with 1500 bytes being the largest unfragmented packet many paths will carry and 40 bytes being the smallest TCP packet due to IP header size). Therefore, this is only a problem when the ratio of bandwidths of the faster link to the slower link exceeds 1500/40 = 37.5.

The solution to this problem is to use the one-packet technique to measure the problematic link and use tailgating everywhere else. We split the path by performing the sigma phase three times: once to the node just before the very fast link, once to the node just after the very fast link, and once to the destination node. We know we should do this when the measured ratio of bandwidths of two adjacent links approaches 37.5. This solution increases the number of packets sent beyond what regular tailgating sends but should still send fewer packets than using a one-packet method over the entire path.

Another limitation is that queueing anywhere along the path disrupts the measurement of all the links. In contrast, for one-packet techniques, only queueing at earlier links affects the measurement of a link. This could be a significant disadvantage for tailgating if there is a very congested link far downstream along a path. It would prevent accurate and fast tailgating measurement of all the earlier links. The solution is similar to that for the previous limitation. We perform the sigma phase to the node just before the congested link and to the destination node. We then perform the tailgating phase to the pre-congestion node to measure all the links before the congested one.

The final limitation is that errors may accumulate during the calculation such that links that are very far away are unmeasurable. The one-packet techniques only propagate errors forward one link [8] so they are fairly robust with respect to errors. In our measurements in the next section, the accumulation of error is not noticeable in paths up to length 11. However, quantifying the exact error is part of our future work.

Kevin Lai 2001-04-04