Distribution-free confidence intervals for measurement of effective bandwidths

Laszlo Gyorfi, Andras Racz, Ken Duffy, John T. Lewis, Raymond Russell, Fergal Toomey

Dublin Institute for Advanced Studies, Applied Probability Group, School of Theoretical Physics, 10 Burlington Road, Dublin 4, Ireland
russell@stp.dias.ie

The notion of effective bandwidth has become widely accepted as a measure of the resource requirements of bursty traffic in queuing networks. Intuitively, the effective bandwidth of a traffic source at a given network resource determines the quantity of resource capacity which must be reserved for it in order to achieve a specified rate of data-loss. This quantity depends on the statistical properties of the traffic source, on the properties of other traffic which may be sharing the resource in question, and on the nature of the resource itself (for example, buffered or unbuffered). It has been realised, through the work of a number of authors, that the complex relationships between these different factors can be unravelled using a family of large deviation limit results for the data-loss probability. These results lead to the effective bandwidth function, as defined by Frank Kelly. Given a stochastic model of a traffic source, the associated effective bandwidth function can be calculated more-or-less easily from the model's parameters. For many purposes, however, it may be more practical to determine effective bandwidths directly from traffic measurements, thereby eliminating the need to fit a stochastic model to recorded data. In this paper we address the extent of sampling error in measured values of effective bandwidth. Sampling error is not a critical issue in off-line traffic characterisation, but assumes greater importance when measurements are used for the purpose of dynamic resource allocation.
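
To make "determining effective bandwidths directly from traffic measurements" concrete, the following is a minimal sketch of a plug-in estimator. It assumes the commonly quoted form of Kelly's definition, alpha(s, t) = log(E[exp(s X)]) / (s t), where X is the amount of work arriving in a window of length t, and it replaces the expectation with a sample average. The function name and its arguments are hypothetical and are not taken from the paper.

```python
import math


def empirical_effective_bandwidth(block_loads, s, t):
    """Plug-in estimate of alpha(s, t) = log(E[exp(s * X)]) / (s * t).

    block_loads: measured work X_i arriving in disjoint windows of length t
                 (hypothetical input format, assumed for illustration).
    s: space parameter, s > 0, in units of 1 / work.
    t: window length, t > 0, in units of time.
    """
    if not block_loads or s <= 0 or t <= 0:
        raise ValueError("need a non-empty sample and positive s and t")
    # The sample average of exp(s * X_i) stands in for E[exp(s * X)].
    mean_mgf = sum(math.exp(s * x) for x in block_loads) / len(block_loads)
    return math.log(mean_mgf) / (s * t)
```

For a constant-rate source the estimate reduces to the rate itself; for a bursty source it lies between the sample mean rate and the largest observed rate, moving towards the latter as s grows.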
An important consideration is the fact that the large deviation results in which the effective bandwidth function has its origins are intended to control the frequencies of extremely rare events, whose probabilities may be of the order of 10e-6 or less. At this level of likelihood even small sampling errors can have a significant impact; the 1.0 - 10e-6 quantile for an estimator of effective bandwidth may be considerably larger than its mean. Interval-estimates of bandwidth requirement are desirable: if the target data-loss rate is 10e-6, then a 1.0 - 10e-6 upper confidence limit for bandwidth requirement can be used safely as a basis for resource allocation. Approximate confidence intervals can be obtained from a Gaussian approximation in the usual manner, but this approach does not seem appropriate here due to the very low likelihood levels which are of interest. Instead we turn to concentration inequalities designed to provide rigorous upper bounds on the probabilities of rare events. Hoeffding's inequality is particularly attractive for our problem: to use it we require only an upper bound on the random variables of interest, and this can be obtained directly, without further measurement, if traffic sources declare a peak rate or other token bucket constraint.
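
As an illustration of how Hoeffding's inequality can yield a distribution-free upper confidence limit, here is a minimal sketch. It is not necessarily the construction used in the paper. It assumes that the per-window loads X_i are independent, that a declared peak rate bounds each X_i by peak_rate * t, and that the quantity being bounded is alpha(s, t) = log(E[exp(s X)]) / (s t). Under those assumptions each exp(s X_i) lies in [1, exp(s * peak_rate * t)], so the one-sided Hoeffding bound applies to their sample mean. The function name and parameters are hypothetical.

```python
import math


def hoeffding_upper_limit(block_loads, s, t, peak_rate, delta):
    """Upper confidence limit for alpha(s, t) at level 1 - delta.

    Assumes independent loads X_i in [0, peak_rate * t] (illustrative
    assumptions, not taken from the paper).
    """
    n = len(block_loads)
    if n == 0 or s <= 0 or t <= 0 or peak_rate <= 0 or not 0 < delta < 1:
        raise ValueError("invalid sample or parameters")
    max_load = peak_rate * t
    if any(x < 0 or x > max_load * (1 + 1e-12) for x in block_loads):
        raise ValueError("a load violates the declared peak rate")
    # Y_i = exp(s * X_i) takes values in [1, upper].
    upper = math.exp(s * max_load)
    mean_mgf = sum(math.exp(s * x) for x in block_loads) / n
    # One-sided Hoeffding: P(E[Y] > mean + eps) <= exp(-2 n eps^2 / (upper - 1)^2).
    # Setting the right-hand side equal to delta and solving gives eps.
    eps = (upper - 1.0) * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    # E[Y] can never exceed upper, so capping keeps the limit valid.
    bound = min(mean_mgf + eps, upper)
    # log is increasing, so a limit on E[Y] transfers directly to alpha(s, t).
    return math.log(bound) / (s * t)
```

The width term eps shrinks like 1/sqrt(n) but grows only like sqrt(log(1/delta)), which is why very small values of delta remain usable. With few samples the limit simply falls back to the declared peak rate.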