Noncentral generalization of the chi-squared distribution
Noncentral chi-squared
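[Figure: probability density function]
[Figure: cumulative distribution function]

| Property | Value |
|---|---|
| Parameters | $k > 0$ degrees of freedom; $\lambda > 0$ non-centrality parameter |
| Support | $x \in [0, +\infty)$ |
| PDF | $\frac{1}{2} e^{-(x+\lambda)/2} \left(\frac{x}{\lambda}\right)^{k/4-1/2} I_{k/2-1}\left(\sqrt{\lambda x}\right)$ |
| CDF | $1 - Q_{k/2}\left(\sqrt{\lambda}, \sqrt{x}\right)$ with Marcum Q-function $Q_M(a, b)$ |
| Mean | $k + \lambda$ |
| Variance | $2(k + 2\lambda)$ |
| Skewness | $\frac{2^{3/2}(k + 3\lambda)}{(k + 2\lambda)^{3/2}}$ |
| Excess kurtosis | $\frac{12(k + 4\lambda)}{(k + 2\lambda)^2}$ |
| MGF | $\frac{\exp\left(\frac{\lambda t}{1 - 2t}\right)}{(1 - 2t)^{k/2}}$ for $2t < 1$ |
| CF | $\frac{\exp\left(\frac{i\lambda t}{1 - 2it}\right)}{(1 - 2it)^{k/2}}$ |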
In probability theory and statistics, the noncentral chi-squared distribution (or noncentral chi-square distribution, noncentral $\chi^2$ distribution) is a noncentral generalization of the chi-squared distribution. It often arises in the power analysis of statistical tests in which the null distribution is (perhaps asymptotically) a chi-squared distribution; important examples of such tests are the likelihood-ratio tests.[1]
Definitions
Background
Let $(X_1, X_2, \ldots, X_k)$ be $k$ independent, normally distributed random variables with means $\mu_i$ and unit variances. Then the random variable
$$X = \sum_{i=1}^{k} X_i^2$$
is distributed according to the noncentral chi-squared distribution. It has two parameters: $k$, which specifies the number of degrees of freedom (i.e. the number of $X_i$), and $\lambda$, which is related to the means of the random variables $X_i$ by:
$$\lambda = \sum_{i=1}^{k} \mu_i^2.$$
$\lambda$ is sometimes called the noncentrality parameter. Note that some references define $\lambda$ in other ways, such as half of the above sum, or its square root.
This distribution arises in multivariate statistics as a derivative of the multivariate normal distribution. While the central chi-squared distribution is the squared norm of a random vector with $N(0_k, I_k)$ distribution (i.e., the squared distance from the origin to a point taken at random from that distribution), the non-central $\chi^2$ is the squared norm of a random vector with $N(\mu, I_k)$ distribution. Here $0_k$ is a zero vector of length $k$, $\mu = (\mu_1, \ldots, \mu_k)$, and $I_k$ is the identity matrix of size $k$.
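As an illustration of the definition, the following minimal sketch simulates the sum of squared unit-variance normals and compares the sample mean with the theoretical mean $k + \lambda$. The function names, the mean vector, the seed, and the sample size are illustrative assumptions, not part of the source.

```python
import random

def noncentrality(mu):
    """Noncentrality parameter: the sum of the squared means."""
    return sum(m * m for m in mu)

def sample_ncx2(mu, rng):
    """Draw one value of sum_i X_i^2 with independent X_i ~ N(mu_i, 1)."""
    return sum(rng.gauss(m, 1.0) ** 2 for m in mu)

rng = random.Random(0)        # fixed seed so the run is repeatable
mu = [1.0, -2.0, 0.5]         # hypothetical means; k = 3, lambda = 5.25
draws = [sample_ncx2(mu, rng) for _ in range(200_000)]
sample_mean = sum(draws) / len(draws)

# The sample mean should be close to k + lambda = 3 + 5.25.
print(sample_mean, len(mu) + noncentrality(mu))
```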
Density
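The probability density function (pdf) is given by
$$f_X(x; k, \lambda) = \sum_{i=0}^{\infty} \frac{e^{-\lambda/2} (\lambda/2)^i}{i!} f_{Y_{k+2i}}(x),$$
where $Y_q$ is distributed as chi-squared with $q$ degrees of freedom.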
From this representation, the noncentral chi-squared distribution is seen to be a Poisson-weighted mixture of central chi-squared distributions. Suppose that a random variable $J$ has a Poisson distribution with mean $\lambda/2$, and the conditional distribution of $Z$ given $J = i$ is chi-squared with $k + 2i$ degrees of freedom. Then the unconditional distribution of $Z$ is non-central chi-squared with $k$ degrees of freedom and non-centrality parameter $\lambda$.
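The Poisson-mixture representation can be evaluated directly by truncating the series. The sketch below uses only the standard library and assumes $x > 0$ and $\lambda > 0$. The function names and the truncation at 200 terms are illustrative choices, adequate for moderate $\lambda$ only.

```python
import math

def chi2_pdf(x, q):
    """Central chi-squared density with q > 0 degrees of freedom, for x > 0."""
    half = q / 2.0
    log_pdf = ((half - 1.0) * math.log(x) - x / 2.0
               - half * math.log(2.0) - math.lgamma(half))
    return math.exp(log_pdf)

def ncx2_pdf_mixture(x, k, lam, terms=200):
    """Noncentral chi-squared density as a truncated Poisson-weighted mixture."""
    total = 0.0
    for i in range(terms):
        # Poisson(lambda / 2) weight of the i-th component, computed in log space
        log_w = -lam / 2.0 + i * math.log(lam / 2.0) - math.lgamma(i + 1)
        total += math.exp(log_w) * chi2_pdf(x, k + 2 * i)
    return total

print(ncx2_pdf_mixture(2.0, 3, 1.5))
```

Working in log space keeps the Poisson weights from overflowing; for large $\lambda$ the number of terms would need to grow accordingly.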
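Alternatively, the pdf can be written as
$$f_X(x; k, \lambda) = \frac{1}{2} e^{-(x+\lambda)/2} \left(\frac{x}{\lambda}\right)^{k/4-1/2} I_{k/2-1}\left(\sqrt{\lambda x}\right),$$
where $I_\nu(y)$ is a modified Bessel function of the first kind given by
$$I_\nu(y) = \left(\frac{y}{2}\right)^{\nu} \sum_{j=0}^{\infty} \frac{(y^2/4)^j}{j!\,\Gamma(\nu + j + 1)}.$$
Using the relation between Bessel functions and hypergeometric functions, the pdf can also be written as:[2]
$$f_X(x; k, \lambda) = e^{-\lambda/2}\, {}_0F_1\left(; \frac{k}{2}; \frac{\lambda x}{4}\right) \frac{1}{2^{k/2}\,\Gamma(k/2)}\, e^{-x/2} x^{k/2-1}.$$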
The case k = 0 (zero degrees of freedom), in which the distribution has a discrete component at zero, is discussed by Torgersen (1972) and further by Siegel (1979).
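The Bessel-function form of the density can be sketched with a truncated power series for $I_\nu$. This is a standard-library illustration only, assuming $x > 0$, $\lambda > 0$ and $k > 0$; the function names and the fixed number of series terms are assumptions, and a production implementation would normally rely on a numerical library instead.

```python
import math

def bessel_i(nu, y, terms=100):
    """Modified Bessel function of the first kind, truncated power series (y > 0)."""
    total = 0.0
    for j in range(terms):
        # j-th series term (y^2 / 4)^j / (j! * Gamma(nu + j + 1)), in log space
        log_term = (j * math.log(y * y / 4.0)
                    - math.lgamma(j + 1) - math.lgamma(nu + j + 1))
        total += math.exp(log_term)
    return (y / 2.0) ** nu * total

def ncx2_pdf_bessel(x, k, lam):
    """Noncentral chi-squared density via the Bessel-function form."""
    nu = k / 2.0 - 1.0  # order of the Bessel function; note k/4 - 1/2 = nu / 2
    return (0.5 * math.exp(-(x + lam) / 2.0) * (x / lam) ** (nu / 2.0)
            * bessel_i(nu, math.sqrt(lam * x)))

print(ncx2_pdf_bessel(2.0, 3, 1.5))
```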
Derivation of the pdf
The derivation of the probability density function is most easily done by performing the following steps:
- Since $X_1, \ldots, X_k$ have unit variances, their joint distribution is spherically symmetric, up to a location shift.
- The spherical symmetry then implies that the distribution of $X = X_1^2 + \cdots + X_k^2$ depends on the means only through the squared length, $\lambda = \mu_1^2 + \cdots + \mu_k^2$. Without loss of generality, we can therefore take $\mu_1 = \sqrt{\lambda}$ and $\mu_2 = \cdots = \mu_k = 0$.
- Now derive the density of $X = X_1^2$ (i.e. the k = 1 case). Simple transformation of random variables shows that
$$f_X(x; 1, \lambda) = \frac{1}{2\sqrt{x}} \left( \phi\left(\sqrt{x} - \sqrt{\lambda}\right) + \phi\left(\sqrt{x} + \sqrt{\lambda}\right) \right) = \frac{1}{\sqrt{2\pi x}}\, e^{-(x+\lambda)/2} \cosh\left(\sqrt{\lambda x}\right),$$
- where $\phi(\cdot)$ is the standard normal density.
- Expand the cosh term in a Taylor series. This gives the Poisson-weighted mixture representation of the density, still for k = 1. The indices on the chi-squared random variables in the series above are 1 + 2i in this case.
- Finally, for the general case: we have assumed, without loss of generality, that $X_2, \ldots, X_k$ are standard normal, and so $X_2^2 + \cdots + X_k^2$ has a central chi-squared distribution with (k − 1) degrees of freedom, independent of $X_1^2$. Using the Poisson-weighted mixture representation for $X_1^2$, and the fact that the sum of independent chi-squared random variables is also chi-squared, completes the result. The indices in the series are (1 + 2i) + (k − 1) = k + 2i as required.
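The Taylor-expansion step for k = 1 can be written out as follows. This is a worked sketch rather than part of the cited derivation; it restates the k = 1 density as its starting point and relies on the Legendre duplication formula $(2i)! = 4^i\, i!\, \Gamma(i + 1/2) / \sqrt{\pi}$.

```latex
\begin{align*}
f_X(x; 1, \lambda)
  &= \frac{1}{\sqrt{2\pi x}}\, e^{-(x+\lambda)/2} \cosh\left(\sqrt{\lambda x}\right) \\
  &= \frac{1}{\sqrt{2\pi x}}\, e^{-(x+\lambda)/2} \sum_{i=0}^{\infty} \frac{(\lambda x)^i}{(2i)!} \\
  &= \frac{1}{\sqrt{2\pi x}}\, e^{-(x+\lambda)/2} \sum_{i=0}^{\infty}
     \frac{\sqrt{\pi}\, (\lambda x)^i}{4^i\, i!\, \Gamma\left(i + \tfrac{1}{2}\right)} \\
  &= \sum_{i=0}^{\infty} \frac{e^{-\lambda/2} (\lambda/2)^i}{i!} \cdot
     \frac{x^{(1+2i)/2 - 1}\, e^{-x/2}}{2^{(1+2i)/2}\, \Gamma\left(\tfrac{1+2i}{2}\right)}
\end{align*}
```

The second factor in the last line is the central chi-squared density with 1 + 2i degrees of freedom, and the first factor is the Poisson probability of i with mean λ/2.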
Properties
Moment generating function
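The moment-generating function is given by
$$M(t; k, \lambda) = \frac{\exp\left(\frac{\lambda t}{1 - 2t}\right)}{(1 - 2t)^{k/2}}, \qquad 2t < 1.$$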
Moments
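The first few raw moments are:
$$\mu'_1 = k + \lambda$$
$$\mu'_2 = (k + \lambda)^2 + 2(k + 2\lambda)$$
$$\mu'_3 = (k + \lambda)^3 + 6(k + \lambda)(k + 2\lambda) + 8(k + 3\lambda)$$
$$\mu'_4 = (k + \lambda)^4 + 12(k + \lambda)^2 (k + 2\lambda) + 4(11k^2 + 44k\lambda + 36\lambda^2) + 48(k + 4\lambda)$$
The first few central moments are:
$$\mu_2 = 2(k + 2\lambda)$$
$$\mu_3 = 8(k + 3\lambda)$$
$$\mu_4 = 12(k + 2\lambda)^2 + 48(k + 4\lambda)$$
The nth cumulant is
$$\kappa_n = 2^{n-1} (n-1)!\, (k + n\lambda).$$
Hence
$$\mu'_n = 2^{n-1} (n-1)!\, (k + n\lambda) + \sum_{j=1}^{n-1} \frac{(n-1)!\, 2^{j-1}}{(n-j)!} (k + j\lambda)\, \mu'_{n-j}.$$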
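The recursion from cumulants to raw moments is easy to check numerically. The sketch below uses the cumulant formula $\kappa_n = 2^{n-1}(n-1)!(k + n\lambda)$ together with the general identity $\mu'_n = \sum_{j=1}^{n} \binom{n-1}{j-1} \kappa_j\, \mu'_{n-j}$; the function names and the example parameter values are illustrative assumptions.

```python
import math

def cumulant(n, k, lam):
    """n-th cumulant of the noncentral chi-squared distribution."""
    return 2 ** (n - 1) * math.factorial(n - 1) * (k + n * lam)

def raw_moments(n_max, k, lam):
    """Raw moments mu'_0 .. mu'_{n_max} from the cumulant recursion."""
    mu = [1.0]  # mu'_0 = 1
    for n in range(1, n_max + 1):
        total = 0.0
        for j in range(1, n + 1):
            total += math.comb(n - 1, j - 1) * cumulant(j, k, lam) * mu[n - j]
        mu.append(total)
    return mu

k, lam = 3, 2
moments = raw_moments(4, k, lam)
print(moments[1], k + lam)                              # mean
print(moments[2] - moments[1] ** 2, 2 * (k + 2 * lam))  # variance
```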
Cumulative distribution function
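Again using the relation between the central and noncentral chi-squared distributions, the cumulative distribution function (cdf) can be written as
$$P(x; k, \lambda) = e^{-\lambda/2} \sum_{j=0}^{\infty} \frac{(\lambda/2)^j}{j!}\, Q(x; k + 2j),$$
where $Q(x; k)$ is the cumulative distribution function of the central chi-squared distribution with $k$ degrees of freedom, which is given by
$$Q(x; k) = \frac{\gamma(k/2, x/2)}{\Gamma(k/2)},$$
and where $\gamma(k, z)$ is the lower incomplete gamma function.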
The Marcum Q-function $Q_M(a, b)$ can also be used to represent the cdf:[3]
$$P(x; k, \lambda) = 1 - Q_{k/2}\left(\sqrt{\lambda}, \sqrt{x}\right).$$
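The Poisson-mixture form of the cdf can be sketched with the standard library alone, using a power series for the regularized lower incomplete gamma function. The function names, the tolerance, and the truncation at 200 mixture terms are illustrative assumptions; the sketch assumes $\lambda > 0$ and moderate values of $x$ and $\lambda$.

```python
import math

def reg_lower_gamma(a, x, tol=1e-15, max_terms=10_000):
    """Regularized lower incomplete gamma gamma(a, x) / Gamma(a), series form."""
    if x <= 0.0:
        return 0.0
    # First term: x^a e^{-x} / Gamma(a + 1); each later term multiplies by x / (a + n)
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total = term
    n = 0
    while term > tol * total and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    return total

def ncx2_cdf_mixture(x, k, lam, terms=200):
    """Noncentral chi-squared cdf as a truncated Poisson-weighted mixture."""
    total = 0.0
    for j in range(terms):
        log_w = -lam / 2.0 + j * math.log(lam / 2.0) - math.lgamma(j + 1)
        total += math.exp(log_w) * reg_lower_gamma((k + 2 * j) / 2.0, x / 2.0)
    return total

print(ncx2_cdf_mixture(4.0, 3, 1.5))
```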
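When the number of degrees of freedom k is a positive odd integer, there is a closed-form expression for the complementary cumulative distribution function, given by[4]
$$1 - P(x; 2n+1, \lambda) = Q_{n+1/2}\left(\sqrt{\lambda}, \sqrt{x}\right) = Q\left(\sqrt{x} - \sqrt{\lambda}\right) + Q\left(\sqrt{x} + \sqrt{\lambda}\right) + e^{-(x+\lambda)/2} \sum_{m=1}^{n} \left(\frac{x}{\lambda}\right)^{m/2 - 1/4} I_{m-1/2}\left(\sqrt{\lambda x}\right),$$
where n is a non-negative integer, Q is the Gaussian Q-function, and I is the modified Bessel function of the first kind with half-integer order. The modified Bessel function of the first kind with half-integer order can itself be represented as a finite sum in terms of hyperbolic functions.
In particular, for k = 1, we have
$$1 - P(x; 1, \lambda) = Q\left(\sqrt{x} - \sqrt{\lambda}\right) + Q\left(\sqrt{x} + \sqrt{\lambda}\right).$$
Also, for k = 3, we have
$$1 - P(x; 3, \lambda) = Q\left(\sqrt{x} - \sqrt{\lambda}\right) + Q\left(\sqrt{x} + \sqrt{\lambda}\right) + \sqrt{\frac{2}{\pi}}\, \frac{\sinh\left(\sqrt{\lambda x}\right)}{\sqrt{\lambda}}\, e^{-(x+\lambda)/2}.$$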
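The k = 1 and k = 3 closed forms translate directly into code, since the Gaussian Q-function can be expressed through the complementary error function, $Q(t) = \tfrac{1}{2}\operatorname{erfc}(t/\sqrt{2})$. The following is a minimal sketch assuming $\lambda > 0$ and $x \ge 0$; the function names are illustrative.

```python
import math

def gauss_q(t):
    """Gaussian Q-function: upper tail probability of the standard normal."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def ncx2_sf_k1(x, lam):
    """Complementary cdf for k = 1: Q(sqrt(x) - sqrt(lam)) + Q(sqrt(x) + sqrt(lam))."""
    r, s = math.sqrt(x), math.sqrt(lam)
    return gauss_q(r - s) + gauss_q(r + s)

def ncx2_sf_k3(x, lam):
    """Complementary cdf for k = 3: the k = 1 value plus one hyperbolic-sine term."""
    r, s = math.sqrt(x), math.sqrt(lam)
    extra = (math.sqrt(2.0 / math.pi) * math.sinh(r * s) / s
             * math.exp(-(x + lam) / 2.0))
    return ncx2_sf_k1(x, lam) + extra

print(1.0 - ncx2_sf_k1(3.0, 2.0), 1.0 - ncx2_sf_k3(3.0, 2.0))
```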
Approximation (including for quantiles)
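Abdel-Aty[5] derives (as "first approx.") a non-central Wilson–Hilferty transformation:
$$\left(\frac{X}{k + \lambda}\right)^{1/3}$$
is approximately normally distributed, $\sim \mathcal{N}\left(1 - \frac{2}{9f}, \frac{2}{9f}\right)$, i.e.,
$$P(x; k, \lambda) \approx \Phi\left\{ \frac{\left(\frac{x}{k + \lambda}\right)^{1/3} - \left(1 - \frac{2}{9f}\right)}{\sqrt{\frac{2}{9f}}} \right\}, \quad \text{where } f := \frac{(k + \lambda)^2}{k + 2\lambda} = k + \frac{\lambda^2}{k + 2\lambda},$$
which is quite accurate and adapts well to the noncentrality. Also, $f = f(k, \lambda)$ becomes $f = k$ for $\lambda = 0$, the (central) chi-squared case.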
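A minimal sketch of the non-central Wilson–Hilferty approximation, with $f = (k + \lambda)^2 / (k + 2\lambda)$ and the cube root of $x / (k + \lambda)$ treated as normal with mean $1 - 2/(9f)$ and variance $2/(9f)$. The function name is an illustrative assumption; `statistics.NormalDist` from the Python standard library supplies the standard normal cdf.

```python
import math
from statistics import NormalDist

def ncx2_cdf_wilson_hilferty(x, k, lam):
    """Approximate noncentral chi-squared cdf (non-central Wilson-Hilferty form)."""
    f = (k + lam) ** 2 / (k + 2.0 * lam)  # effective degrees of freedom
    c = 2.0 / (9.0 * f)                   # variance of the cube-root variable
    z = ((x / (k + lam)) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return NormalDist().cdf(z)

print(ncx2_cdf_wilson_hilferty(7.0, 4, 3.0))
```

With $\lambda = 0$ the function reduces to the ordinary Wilson–Hilferty approximation for the central chi-squared distribution, since $f = k$ in that case.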
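Sankaran[6] discusses a number of closed-form approximations for the cumulative distribution function. In an earlier paper,[7] he derived and stated the following approximation:
$$P(x; k, \lambda) \approx \Phi\left\{ \frac{\left(\frac{x}{k + \lambda}\right)^{h} - \left(1 + hp\left(h - 1 - 0.5(2 - h)mp\right)\right)}{h\sqrt{2p}\,(1 + 0.5mp)} \right\}$$
where
- $\Phi\{\cdot\}$ denotes the cumulative distribution function of the standard normal distribution;
- $h = 1 - \frac{2}{3} \frac{(k + \lambda)(k + 3\lambda)}{(k + 2\lambda)^2}$;
- $p = \frac{k + 2\lambda}{(k + \lambda)^2}$;
- $m = (h - 1)(1 - 3h)$.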
This and other approximations are discussed in a later text book.[8]
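Sankaran's approximation can be sketched in the same way. The formula used here is $\Phi\{[(x/(k+\lambda))^h - (1 + hp(h - 1 - 0.5(2 - h)mp))] / [h\sqrt{2p}(1 + 0.5mp)]\}$ with $h = 1 - \tfrac{2}{3}(k+\lambda)(k+3\lambda)/(k+2\lambda)^2$, $p = (k+2\lambda)/(k+\lambda)^2$ and $m = (h-1)(1-3h)$; the function name is an illustrative assumption.

```python
import math
from statistics import NormalDist

def ncx2_cdf_sankaran(x, k, lam):
    """Approximate noncentral chi-squared cdf using Sankaran's normal approximation."""
    h = 1.0 - (2.0 / 3.0) * (k + lam) * (k + 3.0 * lam) / (k + 2.0 * lam) ** 2
    p = (k + 2.0 * lam) / (k + lam) ** 2
    m = (h - 1.0) * (1.0 - 3.0 * h)
    num = (x / (k + lam)) ** h - (1.0 + h * p * (h - 1.0 - 0.5 * (2.0 - h) * m * p))
    den = h * math.sqrt(2.0 * p) * (1.0 + 0.5 * m * p)
    return NormalDist().cdf(num / den)

print(ncx2_cdf_sankaran(7.0, 4, 3.0))
```

For $\lambda = 0$ the parameters become $h = 1/3$, $p = 1/k$ and $m = 0$, so the expression collapses to the central Wilson–Hilferty approximation, which gives a quick consistency check.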
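More recently, since the CDF of the non-central chi-squared distribution with an odd number of degrees of freedom can be computed exactly, the CDF for an even number of degrees of freedom can be approximated by exploiting the monotonicity and log-concavity properties of the Marcum Q-function as
$$P(x; 2n, \lambda) \approx \frac{1}{2} \left[ P(x; 2n - 1, \lambda) + P(x; 2n + 1, \lambda) \right].$$
Another approximation that also serves as an upper bound is given by
$$P(x; 2n, \lambda) \approx 1 - \left[ \left(1 - P(x; 2n - 1, \lambda)\right) \left(1 - P(x; 2n + 1, \lambda)\right) \right]^{1/2}.$$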
For a given probability, these formulas are easily inverted to provide the corresponding approximation for $x$, to compute approximate quantiles.
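As an example of such an inversion, the non-central Wilson–Hilferty approximation can be solved for $x$ in closed form: $x_p \approx (k + \lambda)\left[(1 - \tfrac{2}{9f}) + z_p \sqrt{\tfrac{2}{9f}}\right]^3$ with $f = (k + \lambda)^2 / (k + 2\lambda)$ and $z_p$ the standard normal quantile. The sketch below is illustrative only; the function name is an assumption, and for very small probabilities combined with small $f$ the bracket can turn negative, in which case the approximation is not usable.

```python
import math
from statistics import NormalDist

def ncx2_quantile_wilson_hilferty(prob, k, lam):
    """Approximate quantile from the inverted non-central Wilson-Hilferty formula."""
    f = (k + lam) ** 2 / (k + 2.0 * lam)
    c = 2.0 / (9.0 * f)
    z = NormalDist().inv_cdf(prob)  # standard normal quantile z_p
    return (k + lam) * ((1.0 - c) + z * math.sqrt(c)) ** 3

print(ncx2_quantile_wilson_hilferty(0.95, 4, 3.0))
```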
Occurrence and applications
Use in tolerance intervals
Two-sided normal regression tolerance intervals can be obtained based on the noncentral chi-squared distribution.[10] This enables the calculation of a statistical interval within which, with some confidence level, a specified proportion of a sampled population falls.