Power series derived from a discrete probability distribution
In probability theory, the probability generating function of a discrete random variable is a power series representation (the generating function) of the probability mass function of the random variable. Probability generating functions are often employed for their succinct description of the sequence of probabilities $\Pr(X = i)$ in the probability mass function for a random variable $X$, and to make available the well-developed theory of power series with non-negative coefficients.
Definition
Univariate case
If $X$ is a discrete random variable taking values in the non-negative integers $\{0, 1, \ldots\}$, then the probability generating function of $X$ is defined as[1]
$$G(z) = \operatorname{E}\left(z^X\right) = \sum_{x=0}^{\infty} p(x) z^x,$$
where $p$ is the probability mass function of $X$. Note that the subscripted notations $G_X$ and $p_X$ are often used to emphasize that these pertain to a particular random variable $X$, and to its distribution. The power series converges absolutely at least for all complex numbers $z$ with $|z| \leq 1$; in many examples the radius of convergence is larger.
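As a concrete illustration, the following minimal SymPy sketch builds the generating function of a Poisson($\lambda$) variable directly from its probability mass function and checks it against the familiar closed form $e^{\lambda(z-1)}$; the choice of distribution is purely illustrative.

```python
import sympy as sp

# Build the PGF of a Poisson(lam) variable from its pmf
# and check it against the closed form exp(lam*(z - 1)).
z, lam = sp.symbols("z lambda", positive=True)
x = sp.symbols("x", integer=True, nonnegative=True)

pmf = sp.exp(-lam) * lam**x / sp.factorial(x)   # p(x) = e^(-lam) lam^x / x!
G = sp.summation(pmf * z**x, (x, 0, sp.oo))     # G(z) = sum_x p(x) z^x

# The difference should simplify to 0, confirming G(z) = exp(lam*(z - 1)).
print(sp.simplify(G - sp.exp(lam * (z - 1))))
```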
Multivariate case
If $X = (X_1, \ldots, X_d)$ is a discrete random variable taking values in the $d$-dimensional non-negative integer lattice $\{0, 1, \ldots\}^d$, then the probability generating function of $X$ is defined as
$$G(z) = G(z_1, \ldots, z_d) = \operatorname{E}\bigl(z_1^{X_1} \cdots z_d^{X_d}\bigr) = \sum_{x_1, \ldots, x_d = 0}^{\infty} p(x_1, \ldots, x_d) z_1^{x_1} \cdots z_d^{x_d},$$
where $p$ is the probability mass function of $X$. The power series converges absolutely at least for all complex vectors $z = (z_1, \ldots, z_d) \in \mathbb{C}^d$ with $\max\{|z_1|, \ldots, |z_d|\} \leq 1$.
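As a small sketch of the multivariate definition (the bivariate Bernoulli setup below is an illustrative assumption, not part of the definition), note how independence makes the joint generating function factor into marginal ones:

```python
import sympy as sp

# Joint PGF of X = (X1, X2) with independent Bernoulli(p) components,
# built term by term from the joint pmf p(x1, x2) = p(x1) * p(x2).
z1, z2, p = sp.symbols("z1 z2 p", positive=True)

G = sum(
    (p**x1 * (1 - p) ** (1 - x1)) * (p**x2 * (1 - p) ** (1 - x2)) * z1**x1 * z2**x2
    for x1 in (0, 1)
    for x2 in (0, 1)
)

# Independence factorizes the joint PGF into the two marginal PGFs:
# (1 - p + p*z1) * (1 - p + p*z2).
print(sp.factor(G))
```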
Properties
Power series
Probability generating functions obey all the rules of power series with non-negative coefficients. In particular, $G(1^-) = 1$, where $G(1^-) = \lim_{z \to 1^-} G(z)$ is the limit from below, since the probabilities must sum to one. So the radius of convergence of any probability generating function must be at least 1, by Abel's theorem for power series with non-negative coefficients.
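For example, a geometric random variable on $\{1, 2, \ldots\}$ with success probability $p$ has $G(z) = \frac{pz}{1 - (1-p)z}$, whose radius of convergence is $\frac{1}{1-p} > 1$.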
Probabilities and expectations
The following properties allow the derivation of various basic quantities related to $X$; a short sketch checking several of them follows this list.
- The probability mass function of $X$ is recovered by taking derivatives of $G$:
$$p(k) = \Pr(X = k) = \frac{G^{(k)}(0)}{k!}.$$
- It follows from Property 1 that if random variables $X$ and $Y$ have probability-generating functions that are equal, $G_X = G_Y$, then $p_X = p_Y$. That is, if $X$ and $Y$ have identical probability-generating functions, then they have identical distributions.
- The normalization of the probability mass function can be expressed in terms of the generating function by
$$\operatorname{E}[1] = G(1^-) = \sum_{i=0}^{\infty} p(i) = 1.$$
- The expectation of $X$ is given by
$$\operatorname{E}[X] = G'(1^-).$$
- More generally, the $k$th factorial moment, $\operatorname{E}\left[\frac{X!}{(X-k)!}\right]$, of $X$ is given by
$$\operatorname{E}\left[\frac{X!}{(X-k)!}\right] = G^{(k)}(1^-), \quad k \geq 0.$$
- So the variance of $X$ is given by
$$\operatorname{Var}(X) = G''(1^-) + G'(1^-) - \left[G'(1^-)\right]^2.$$
- Finally, the $k$th raw moment of $X$ is given by
$$\operatorname{E}[X^k] = \left(z \frac{\partial}{\partial z}\right)^k G(z) \Big|_{z=1^-},$$
where $X$ is a random variable, $G_X(z)$ is the probability generating function (of $X$) and $M_X(t)$ is the moment-generating function (of $X$); the two are related by $G_X(e^t) = \operatorname{E}[e^{tX}] = M_X(t)$.
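The following SymPy sketch checks several of the identities above on a Binomial($n$, $p$) example, whose probability generating function is $G(z) = (1 - p + pz)^n$; the distribution is chosen only for illustration.

```python
import sympy as sp

# Check PGF identities on a Binomial(n, p) example: G(z) = (1 - p + p*z)^n.
z, p = sp.symbols("z p", positive=True)
n = sp.symbols("n", integer=True, positive=True)

G = (1 - p + p * z) ** n

mean = sp.diff(G, z).subs(z, 1)                                   # E[X] = G'(1-)
var = (sp.diff(G, z, 2) + sp.diff(G, z) - sp.diff(G, z) ** 2).subs(z, 1)

print(sp.simplify(mean))  # n*p
print(sp.simplify(var))   # equals n*p*(1 - p) after simplification

# Recover the pmf via p(k) = G^(k)(0) / k! for a concrete n, say n = 4:
G4 = G.subs(n, 4)
pmf = [sp.diff(G4, z, j).subs(z, 0) / sp.factorial(j) for j in range(5)]
print([sp.expand(q) for q in pmf])  # values C(4, j) * p^j * (1 - p)^(4 - j)
```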
Functions of independent random variables
Probability generating functions are particularly useful for dealing with functions of independent random variables; a short sketch of the composition rule appears after the following list. For example:
- If $X_1, X_2, \ldots, X_N$ is a sequence of independent (and not necessarily identically distributed) random variables that take on non-negative integer values, and
$$S_N = \sum_{i=1}^{N} a_i X_i,$$
- where the $a_i$ are constant non-negative integers, then the probability generating function is given by
$$G_{S_N}(z) = \operatorname{E}\left(z^{S_N}\right) = \operatorname{E}\left(z^{\sum_{i=1}^{N} a_i X_i}\right) = G_{X_1}(z^{a_1}) G_{X_2}(z^{a_2}) \cdots G_{X_N}(z^{a_N}).$$
- For example, taking each $a_i = 1$, if
$$S_N = \sum_{i=1}^{N} X_i,$$
- then the probability generating function, $G_{S_N}(z)$, is given by
$$G_{S_N}(z) = G_{X_1}(z) G_{X_2}(z) \cdots G_{X_N}(z).$$
- It also follows that the probability generating function of the difference of two independent random variables $S = X_1 - X_2$ is
$$G_S(z) = G_{X_1}(z) G_{X_2}(1/z).$$
- Suppose that $N$, the number of independent random variables in the sum above, is not a fixed natural number but is itself an independent, discrete random variable taking values on the non-negative integers, with probability generating function $G_N$. If $X_1, X_2, \ldots, X_N$ are independent and identically distributed with common probability generating function $G_X$, then
$$G_{S_N}(z) = G_N(G_X(z)).$$
- This can be seen, using the law of total expectation, as follows:
$$G_{S_N}(z) = \operatorname{E}\left(z^{S_N}\right) = \operatorname{E}\left(z^{\sum_{i=1}^{N} X_i}\right) = \operatorname{E}\Bigl(\operatorname{E}\bigl(z^{\sum_{i=1}^{N} X_i} \mid N\bigr)\Bigr) = \operatorname{E}\bigl((G_X(z))^N\bigr) = G_N(G_X(z)).$$
- This last fact is useful in the study of Galton–Watson processes and compound Poisson processes.
- Suppose again that $N$ is also an independent, discrete random variable taking values on the non-negative integers, with probability generating function $G_N$ and probability mass function $f_i = \Pr(N = i)$. If $X_1, X_2, \ldots, X_N$ are independent, but not identically distributed random variables, where $G_{X_k}$ denotes the probability generating function of $X_k$, then
$$G_{S_N}(z) = \sum_{i \geq 1} f_i \prod_{k=1}^{i} G_{X_k}(z).$$
- For identically distributed $X_i$ this simplifies to the identity stated before. The general case is sometimes useful to obtain a decomposition of $S_N$ by means of generating functions.
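As a sketch of the composition rule $G_{S_N}(z) = G_N(G_X(z))$ from the list above, the SymPy snippet below takes $N \sim$ Poisson($\lambda$) with i.i.d. Bernoulli($p$) summands (an illustrative choice) and recovers the well-known Poisson thinning result, $S_N \sim$ Poisson($\lambda p$):

```python
import sympy as sp

# Composition rule G_{S_N}(z) = G_N(G_X(z)) on an illustrative example:
# N ~ Poisson(lam) and i.i.d. X_i ~ Bernoulli(p).
z, lam, p = sp.symbols("z lambda p", positive=True)

G_N = sp.exp(lam * (z - 1))   # PGF of Poisson(lam)
G_X = 1 - p + p * z           # PGF of Bernoulli(p)

G_S = G_N.subs(z, G_X)        # compose: G_{S_N}(z) = G_N(G_X(z))

# The difference should simplify to 0: S_N is Poisson(lam*p) (thinning).
print(sp.simplify(G_S - sp.exp(lam * p * (z - 1))))
```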