A tolerance interval (TI) is a statistical interval within which, with some confidence level, a specified proportion of a sampled population falls. "More specifically, a $100\times p\%/100\times (1-\alpha )$ tolerance interval provides limits within which at least a certain proportion (p) of the population falls with a given level of confidence (1−α)."^[1] "A (p, 1−α) tolerance interval (TI) based on a sample is constructed so that it would include at least a proportion p of the sampled population with confidence 1−α; such a TI is usually referred to as p-content − (1−α) coverage TI."^[2] "A (p, 1−α) upper tolerance limit (TL) is simply a 1−α upper confidence limit for the 100 p percentile of the population."^[2]

Definition

Given

observations $\mathbf {x} =(x_{1},\ldots ,x_{n})$ that are realizations of independent random variables $\mathbf {X} =(X_{1},\ldots ,X_{n})$ with a common distribution $F_{\theta }$ and unknown parameter $\theta$;
a random variable $X_{0}$ from the same distribution $F_{\theta }$, independent of the first $n$ variables.

Then a tolerance interval with endpoints $(L(\mathbf {x} ),U(\mathbf {x} )]$ has the defining property $\inf _{\theta }\{\Pr _{\theta }\left(F_{\theta }(U(\mathbf {X} ))-F_{\theta }(L(\mathbf {X} ))\geq p\right)\}=1-\alpha$, that is, a confidence level of $100(1-\alpha )\%$. The property makes no reference to the additional observation $X_{0}$.

This is in contrast to a prediction interval with endpoints $[l(\mathbf {x} ),u(\mathbf {x} )]$, which has the defining property $\inf _{\theta }\{\Pr _{\theta }(X_{0}\in [l(\mathbf {X} ),u(\mathbf {X} )])\}=1-\alpha$.

Calculation

One-sided normal tolerance intervals have an exact solution in terms of the sample mean and sample variance based on the noncentral t-distribution.^[3] Two-sided normal tolerance intervals can be obtained based on the noncentral chi-squared distribution.^[3]
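
For the one-sided case, the solution can be written compactly. The notation here is an assumption introduced for illustration rather than taken from the cited source: ${\bar {X}}$ and $S$ are the sample mean and sample standard deviation of $n$ normal observations, $z_{p}$ is the $p$ quantile of the standard normal distribution, and $t_{n-1;\,1-\alpha }(\delta )$ is the $1-\alpha$ quantile of the noncentral t-distribution with $n-1$ degrees of freedom and noncentrality parameter $\delta$. A $(p,1-\alpha )$ upper tolerance limit then takes the form

$$
{\bar {X}}+kS,\qquad k={\frac {1}{\sqrt {n}}}\,t_{n-1;\,1-\alpha }\left(z_{p}{\sqrt {n}}\right).
$$

This follows because ${\bar {X}}+kS\geq \mu +z_{p}\sigma$ holds exactly when $\left(z_{p}{\sqrt {n}}-{\sqrt {n}}({\bar {X}}-\mu )/\sigma \right)/(S/\sigma )\leq k{\sqrt {n}}$, and the left-hand side has a noncentral t-distribution with $n-1$ degrees of freedom and noncentrality $z_{p}{\sqrt {n}}$. The lower limit ${\bar {X}}-kS$ uses the same factor by symmetry.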

Relation to other intervals

Further information: Interval estimation

"In the parameters-known case, a 95% tolerance interval and a 95% prediction interval are the same."^[4] If we knew a population's exact parameters, we would be able to compute a range within which a certain proportion of the population falls. For example, if we know a population is normally distributed with mean $\mu$ and standard deviation $\sigma$ , then the interval $\mu \pm 1.96\sigma$ includes 95% of the population (1.96 is the z-score for 95% coverage of a normally distributed population).

However, if we have only a sample from the population, we know only the sample mean ${\hat {\mu }}$ and sample standard deviation ${\hat {\sigma }}$, which are only estimates of the true parameters. In that case, ${\hat {\mu }}\pm 1.96{\hat {\sigma }}$ will not necessarily include 95% of the population, because of sampling variability in these estimates. A tolerance interval accounts for this variability by introducing a confidence level $\gamma$ (written $1-\alpha$ in the definition above), which is the confidence with which the interval actually includes the specified proportion of the population. For a normally distributed population, a z-score can be transformed into a "k factor" or tolerance factor^[5] for a given $\gamma$ via lookup tables or several approximation formulas.^[6] "As the degrees of freedom approach infinity, the prediction and tolerance intervals become equal."^[7]
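
The shortfall of the plug-in interval ${\hat {\mu }}\pm 1.96{\hat {\sigma }}$ can be checked by simulation. The following is a minimal sketch, assuming a standard normal population; the function names, sample sizes, trial count and seeds are illustrative choices, not part of any cited source. For each simulated sample it computes the true proportion of the population inside the plug-in interval and then reports how often that proportion reaches 95%.

```python
import random
from statistics import NormalDist, fmean, stdev


def naive_interval_content(n, rng, z=1.96):
    """True population proportion inside mean +/- z * sd for one simulated sample."""
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    centre, spread = fmean(sample), stdev(sample)
    population = NormalDist(0.0, 1.0)  # assumed true population
    return population.cdf(centre + z * spread) - population.cdf(centre - z * spread)


def share_with_full_content(n, p=0.95, trials=2000, seed=0):
    """Fraction of simulated samples whose plug-in interval holds at least p."""
    rng = random.Random(seed)
    hits = sum(naive_interval_content(n, rng) >= p for _ in range(trials))
    return hits / trials


share_n10 = share_with_full_content(10)
share_n100 = share_with_full_content(100)
print(f"n=10:  share of intervals with content >= 95%: {share_n10:.2f}")
print(f"n=100: share of intervals with content >= 95%: {share_n100:.2f}")
```

Under these assumptions one would expect the share to stay below one half at any sample size: the sample standard deviation falls short of $\sigma$ in more than half of all samples, and an off-centre sample mean only lowers the content further. The tolerance factor widens the interval until this share reaches the chosen $\gamma$.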

The tolerance interval is less widely known than the confidence interval and prediction interval, a situation some educators have lamented, as it can lead to misuse of the other intervals where a tolerance interval is more appropriate.^[8]^[9]

The tolerance interval differs from a confidence interval in that the confidence interval bounds a single-valued population parameter (the mean or the variance, for example) with some confidence, while the tolerance interval bounds the range of data values that includes a specific proportion of the population. Whereas a confidence interval's size is entirely due to sampling error, and will approach a zero-width interval at the true population parameter as sample size increases, a tolerance interval's size is due partly to sampling error and partly to actual variance in the population, and will approach the population's probability interval as sample size increases.^[8]^[9]

The tolerance interval is related to a prediction interval in that both put bounds on variation in future samples. However, the prediction interval only bounds a single future sample, whereas a tolerance interval bounds the entire population (equivalently, an arbitrary sequence of future samples). In other words, a prediction interval covers a specified proportion of a population on average, whereas a tolerance interval covers it with a certain confidence level, making the tolerance interval more appropriate if a single interval is intended to bound multiple future samples.^[9]^[10]
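
The phrase "on average" can be illustrated numerically. The sketch below assumes a standard normal population and samples of size 10, and hard-codes the tabulated 0.975 quantile of Student's t-distribution with 9 degrees of freedom (2.262); the constant and function names are hypothetical. It computes, for many simulated samples, the true population proportion captured by the usual two-sided 95% prediction interval ${\bar {X}}\pm t_{n-1,0.975}S{\sqrt {1+1/n}}$.

```python
import random
from statistics import NormalDist, fmean, stdev

# Tabulated 0.975 quantile of Student's t with 9 degrees of freedom (n = 10).
T_975_DF9 = 2.262


def prediction_interval_contents(trials=4000, seed=1):
    """True population content of the 95% prediction interval, once per sample."""
    n = 10
    rng = random.Random(seed)
    population = NormalDist(0.0, 1.0)  # assumed true population
    factor = T_975_DF9 * (1 + 1 / n) ** 0.5
    contents = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        centre, half = fmean(sample), factor * stdev(sample)
        contents.append(population.cdf(centre + half) - population.cdf(centre - half))
    return contents


contents = prediction_interval_contents()
average_content = fmean(contents)
share_reaching_95 = sum(c >= 0.95 for c in contents) / len(contents)
print(f"average content: {average_content:.3f}")
print(f"share of intervals with content >= 95%: {share_reaching_95:.2f}")
```

In such a run the average content should sit close to 0.95, while a sizeable fraction of the individual intervals contain less than 95% of the population. A tolerance interval instead controls that fraction directly through its confidence level.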

Examples

The following example is given by:^[8]

So consider once again a proverbial EPA mileage test scenario, in which several nominally identical autos of a particular model are tested to produce mileage figures $y_{1},y_{2},\ldots ,y_{n}$. If such data are processed to produce a 95% confidence interval for the mean mileage of the model, it is, for example, possible to use it to project the mean or total gasoline consumption for the manufactured fleet of such autos over their first 5,000 miles of use. Such an interval, would however, not be of much help to a person renting one of these cars and wondering whether the (full) 10-gallon tank of gas will suffice to carry him the 350 miles to his destination. For that job, a prediction interval would be much more useful. (Consider the differing implications of being "95% sure" that $\mu \geq 35$ as opposed to being "95% sure" that $y_{n+1}\geq 35$.) But neither a confidence interval for $\mu$ nor a prediction interval for a single additional mileage is exactly what is needed by a design engineer charged with determining how large a gas tank the model really needs to guarantee that 99% of the autos produced will have a 400-mile cruising range. What the engineer really needs is a tolerance interval for a fraction $p=.99$ of mileages of such autos.

Another example is given by:^[10]

The air lead levels were collected from $n=15$ different areas within the facility. It was noted that the log-transformed lead levels fitted a normal distribution well (that is, the data are from a lognormal distribution). Let $\mu$ and $\sigma ^{2}$, respectively, denote the population mean and variance for the log-transformed data. If $X$ denotes the corresponding random variable, we thus have $X\sim {\mathcal {N}}(\mu ,\sigma ^{2})$. We note that $\exp(\mu )$ is the median air lead level. A confidence interval for $\mu$ can be constructed the usual way, based on the t-distribution; this in turn will provide a confidence interval for the median air lead level. If ${\bar {X}}$ and $S$ denote the sample mean and standard deviation of the log-transformed data for a sample of size n, a 95% confidence interval for $\mu$ is given by ${\bar {X}}\pm t_{n-1,0.975}S/{\sqrt {n}}$, where $t_{m,1-\alpha }$ denotes the $1-\alpha$ quantile of a t-distribution with $m$ degrees of freedom. It may also be of interest to derive a 95% upper confidence bound for the median air lead level. Such a bound for $\mu$ is given by ${\bar {X}}+t_{n-1,0.95}S/{\sqrt {n}}$. Consequently, a 95% upper confidence bound for the median air lead is given by $\exp {\left({\bar {X}}+t_{n-1,0.95}S/{\sqrt {n}}\right)}$. Now suppose we want to predict the air lead level at a particular area within the laboratory. A 95% upper prediction limit for the log-transformed lead level is given by ${\bar {X}}+t_{n-1,0.95}S{\sqrt {\left(1+1/n\right)}}$. A two-sided prediction interval can be similarly computed. The meaning and interpretation of these intervals are well known. For example, if the confidence interval ${\bar {X}}\pm t_{n-1,0.975}S/{\sqrt {n}}$ is computed repeatedly from independent samples, 95% of the intervals so computed will include the true value of $\mu$, in the long run.
In other words, the interval is meant to provide information concerning the parameter $\mu$ only. A prediction interval has a similar interpretation, and is meant to provide information concerning a single lead level only. Now suppose we want to use the sample to conclude whether or not at least 95% of the population lead levels are below a threshold. The confidence interval and prediction interval cannot answer this question, since the confidence interval is only for the median lead level, and the prediction interval is only for a single lead level. What is required is a tolerance interval; more specifically, an upper tolerance limit. The upper tolerance limit is to be computed subject to the condition that at least 95% of the population lead levels is below the limit, with a certain confidence level, say 99%.
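
The upper tolerance limit described in this example has the form $\exp({\bar {X}}+kS)$ on the original scale. The sketch below shows one way the factor $k$ might be computed for $n=15$, 95% content and 99% confidence. It does not use the exact noncentral t quantile; instead it swaps in a closed-form approximation commonly attributed to Natrella, which needs only standard normal quantiles. The function names are hypothetical, and no data from the cited study are used: the sample mean and standard deviation of the log-transformed values are left as inputs.

```python
from math import exp, sqrt
from statistics import NormalDist


def approx_one_sided_k(n, p, gamma):
    """Approximate one-sided normal tolerance factor (closed-form approximation).

    Exact values come from the noncentral t-distribution; this formula only
    approximates them and requires n - 1 > z_gamma**2 / 2 so that a > 0.
    """
    z_p = NormalDist().inv_cdf(p)      # quantile for the content p
    z_g = NormalDist().inv_cdf(gamma)  # quantile for the confidence gamma
    a = 1.0 - z_g**2 / (2.0 * (n - 1))
    b = z_p**2 - z_g**2 / n
    return (z_p + sqrt(z_p**2 - a * b)) / a


def upper_tolerance_limit(log_mean, log_sd, n, p=0.95, gamma=0.99):
    """Upper tolerance limit on the original scale for lognormal data."""
    return exp(log_mean + approx_one_sided_k(n, p, gamma) * log_sd)


k_15 = approx_one_sided_k(15, 0.95, 0.99)
print(f"approximate k for n=15, p=0.95, gamma=0.99: {k_15:.2f}")  # about 3.16
```

The factor is roughly twice the normal quantile 1.645 that would apply if $\mu$ and $\sigma$ were known, which reflects how much allowance a sample of 15 requires at 99% confidence. As $n$ grows, the approximate factor shrinks toward 1.645.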

References

K. Krishnamoorthy (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley and Sons. ISBN 978-0-470-38026-0. Chap. 1, "Preliminaries", is available at http://media.wiley.com/product_data/excerpt/68/04703802/0470380268.pdf
Derek S. Young (August 2010). "tolerance: An R Package for Estimating Tolerance Intervals". Journal of Statistical Software. 36 (5): 1–39. ISSN 1548-7660. Retrieved 19 February 2013.
ISO 16269-6, Statistical interpretation of data, Part 6: Determination of statistical tolerance intervals, Technical Committee ISO/TC 69, Applications of statistical methods. Available at http://standardsproposals.bsigroup.com/home/getpdf/458
