Mean Delay Considered Harmful

Stanislav Shalunov, April 2006
Summary: When network delay is reported as a single number, a common approach is to report the arithmetic average (mean) of observed delay singleton measurements. This practice yields a poor metric. Median is superior to mean in every way save simplicity of computation.

The delay (one-way or round-trip) of a single packet is a singleton measurement of delay of a network. When network delay is reported, singleton is usually not the value to report; instead, some number of packets is sent, and some function of their delays is reported. Such a set of delays is called a sample. The function yields some empirical characteristic of the sample. If the sample is thought to come from an underlying random distribution, the function can be a parameter estimator for the distribution.

A common function used in this context is arithmetic average. It has several problems:

  1. Treatment of lost packets. Some packets that were sent might not be received within the period of time the receiver is willing to wait for them. These packets are called lost. It is important to realize that the only thing observed about such lost packets is that they didn't make it within the timeout period. For all we know, they might arrive later. What is the delay of the lost packet? We can only say that it is greater than the timeout. (It might be infinite or finite.) The treatment the mean estimator must give to these packets is to simply omit (!) them from the sample. Why does it omit the packets? Because it's the right thing to do? Of course not---it biases the result in a timeout-dependent way, which can't be desirable. It omits the packets because, within the arithmetic average framework, there's no better answer.
  2. Wrong answer. Reporting ``mean delay'' implies an assumption of a distribution from which delay values are drawn and reporting the mean of such distribution. Needless to say, arithmetic average of values in the sample below a certain value is not a consistent, or even asymptotically consistent, estimator of the mean. This can be insignificant for certain classes of distributions if the cutoff value is chosen so that it's large enough. However, for other classes of distributions, e.g., heavy-tailed distributions, any cutoff value is bad and extremely distorting; with a heavy-tailed distribution, for any given cutoff, the bulk of the contribution to the estimate of the mean will come from values above the cutoff. It is unknown whether network delay has a heavy tail, but using an estimator that is known to give the wrong answer for some distribution, when a much better choice is available is folly. Note that this problem stems from the first one.
  3. Lack of robustness. An even more general property is robustness of the estimator. Robustness, in statistics, is a general property of analysis: robust analysis does not change significantly when the data does not change significantly. One property of robust estimators is that when only a small fraction of values in a sample is changed, the estimator does not change significantly. Arithmetic average is not a robust estimator. For any given sample and any given value of the estimate, changing just a single given value in the sample is enough to make the arithmetic average equal to the desired estimate. The problem becomes especially acute when the tail of the distribution is relatively heavy (as it appears to often be with network delay).

A better statistic than mean is the median. It lacks any of the problems above, without contributing any new problems, other than relative difficulty of computation (arithmetic average computation is trivial). It must be noted here that it is possible to compute the median approximately in O(n) time (with a single pass over the data) and O(1) space. The amount of space is traded against precision of the computation and is not large at all in practical terms. The code in thrulay (currently in src/thrulayd.c, search for ``Quantiles'') provides a free implementation of this algorithm with a BSD-like license (so it can be used in closed-source commercial products, too).

Comments