In probability theory, the law (or formula) of total probability is a fundamental rule relating marginal probabilities to conditional probabilities. It expresses the total probability of an outcome which can be realized via several distinct events, hence the name.

## Statement

The law of total probability is[1] a theorem that states, in its discrete case, that if ${\displaystyle \{B_{n}:n=1,2,3,\ldots \}}$ is a finite or countably infinite set of mutually exclusive and collectively exhaustive events, then for any event ${\displaystyle A}$:

${\displaystyle P(A)=\sum _{n}P(A\cap B_{n})}$

or, alternatively,[1]

${\displaystyle P(A)=\sum _{n}P(A\mid B_{n})P(B_{n}),}$

where, for any ${\displaystyle n}$ with ${\displaystyle P(B_{n})=0}$, the corresponding terms are simply omitted from the summation, since ${\displaystyle P(A\mid B_{n})}$ is then undefined.

The summation can be interpreted as a weighted average, and consequently the marginal probability, ${\displaystyle P(A)}$, is sometimes called "average probability";[2] "overall probability" is sometimes used in less formal writings.[3]
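As a quick numeric sketch of the discrete formula, the following computes the weighted average ${\displaystyle \sum _{n}P(A\mid B_{n})P(B_{n})}$; the partition and conditional probabilities here are made-up illustrative values, not taken from the article.

```python
# Discrete law of total probability: P(A) = sum_n P(A | B_n) * P(B_n).
# The numbers below are illustrative values chosen for this sketch.

def total_probability(p_b, p_a_given_b):
    """Combine partition probabilities P(B_n) with conditionals P(A | B_n)."""
    assert abs(sum(p_b) - 1.0) < 1e-9, "the B_n must be collectively exhaustive"
    return sum(pa * pb for pa, pb in zip(p_a_given_b, p_b))

p_b = [0.2, 0.5, 0.3]          # P(B_1), P(B_2), P(B_3): mutually exclusive, exhaustive
p_a_given_b = [0.9, 0.4, 0.1]  # P(A | B_n) for each n

print(total_probability(p_b, p_a_given_b))  # 0.2*0.9 + 0.5*0.4 + 0.3*0.1 = 0.41
```

Reading the sum as a weighted average: each ${\displaystyle P(A\mid B_{n})}$ is weighted by how likely its cell of the partition is.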

The law of total probability can also be stated for conditional probabilities:

${\displaystyle P(A\mid C)={\frac {P(A,C)}{P(C)}}={\frac {\sum \limits _{n}P(A,B_{n},C)}{P(C)}}={\frac {\sum \limits _{n}P(A\mid B_{n},C)P(B_{n}\mid C)P(C)}{P(C)}}=\sum \limits _{n}P(A\mid B_{n},C)P(B_{n}\mid C)}$

Taking the ${\displaystyle B_{n}}$ as above, and assuming ${\displaystyle C}$ is an event independent of each of the ${\displaystyle B_{n}}$:

${\displaystyle P(A\mid C)=\sum _{n}P(A\mid C,B_{n})P(B_{n})}$
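The conditional version can be checked by brute force on a small joint distribution. The sketch below constructs binary variables where ${\displaystyle C}$ is independent of ${\displaystyle B}$, then compares direct conditioning against the formula; all of the probability values are illustrative assumptions.

```python
# Check the conditional law of total probability on a small joint distribution
# of binary variables B, C, and an event A.  C is constructed to be independent
# of B, so P(B_n | C) = P(B_n).  All numbers are illustrative.

p_b = {0: 0.6, 1: 0.4}  # P(B = b)
p_c = {0: 0.7, 1: 0.3}  # P(C = c), independent of B
p_a_given_bc = {(0, 0): 0.1, (0, 1): 0.2,
                (1, 0): 0.8, (1, 1): 0.9}  # P(A | B = b, C = c)

# Joint probability P(A, C = c) = sum_b P(A | b, c) P(b) P(c)
def p_a_and_c(c):
    return sum(p_a_given_bc[(b, c)] * p_b[b] * p_c[c] for b in p_b)

# Direct conditioning: P(A | C = 1) = P(A, C = 1) / P(C = 1)
direct = p_a_and_c(1) / p_c[1]

# Law of total probability: sum_b P(A | b, C = 1) P(b)  (valid since B, C independent)
via_law = sum(p_a_given_bc[(b, 1)] * p_b[b] for b in p_b)

print(direct, via_law)  # both equal 0.2*0.6 + 0.9*0.4 = 0.48
```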

## Continuous case

The law of total probability extends to the case of conditioning on events generated by continuous random variables. Let ${\displaystyle (\Omega ,{\mathcal {F}},P)}$ be a probability space. Suppose ${\displaystyle X}$ is a random variable with distribution function ${\displaystyle F_{X}}$, and ${\displaystyle A}$ an event on ${\displaystyle (\Omega ,{\mathcal {F}},P)}$. Then the law of total probability states

${\displaystyle P(A)=\int _{-\infty }^{\infty }P(A|X=x)dF_{X}(x).}$

If ${\displaystyle X}$ admits a density function ${\displaystyle f_{X}}$, then the result is

${\displaystyle P(A)=\int _{-\infty }^{\infty }P(A|X=x)f_{X}(x)dx.}$

Moreover, in the specific case where ${\displaystyle A=\{Y\in B\}}$ for a Borel set ${\displaystyle B}$, this yields

${\displaystyle P(Y\in B)=\int _{-\infty }^{\infty }P(Y\in B|X=x)f_{X}(x)dx.}$
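The continuous case can be sketched numerically. Assume, for illustration, that ${\displaystyle X\sim \mathrm {Uniform} (0,1)}$ (so ${\displaystyle f_{X}(x)=1}$ on ${\displaystyle [0,1]}$) and that ${\displaystyle P(A\mid X=x)=x}$, e.g. a coin whose bias is itself drawn uniformly; then ${\displaystyle P(A)=\int _{0}^{1}x\cdot 1\,dx={\tfrac {1}{2}}}$. The quadrature below is a simple midpoint rule, not a library routine:

```python
# Continuous law of total probability, approximated by midpoint-rule quadrature:
# P(A) = integral of P(A | X = x) * f_X(x) dx over [0, 1],
# with the illustrative choices f_X(x) = 1 and P(A | X = x) = x.

def p_a(n=100_000):
    dx = 1.0 / n
    # sum of P(A | X = x_i) * f_X(x_i) * dx at midpoints x_i = (i + 0.5) * dx
    return sum((i + 0.5) * dx * 1.0 * dx for i in range(n))

print(round(p_a(), 6))  # ≈ 0.5
```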

## Example

Suppose that two factories supply light bulbs to the market. Factory X's bulbs work for over 5000 hours in 99% of cases, whereas factory Y's bulbs work for over 5000 hours in 95% of cases. It is known that factory X supplies 60% of the total bulbs available and Y supplies 40% of the total bulbs available. What is the chance that a purchased bulb will work for longer than 5000 hours?

Applying the law of total probability, we have:

${\displaystyle {\begin{aligned}P(A)&=P(A\mid B_{X})\cdot P(B_{X})+P(A\mid B_{Y})\cdot P(B_{Y})\\[4pt]&={99 \over 100}\cdot {6 \over 10}+{95 \over 100}\cdot {4 \over 10}={594+380 \over 1000}={974 \over 1000}\end{aligned}}}$

where

• ${\displaystyle P(B_{X})={6 \over 10}}$ is the probability that the purchased bulb was manufactured by factory X;
• ${\displaystyle P(B_{Y})={4 \over 10}}$ is the probability that the purchased bulb was manufactured by factory Y;
• ${\displaystyle P(A\mid B_{X})={99 \over 100}}$ is the probability that a bulb manufactured by X will work for over 5000 hours;
• ${\displaystyle P(A\mid B_{Y})={95 \over 100}}$ is the probability that a bulb manufactured by Y will work for over 5000 hours.

Thus each purchased light bulb has a 97.4% chance of working for more than 5000 hours.
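The light-bulb computation above reduces to a two-term instance of the formula, which can be checked directly:

```python
# The light-bulb example: two factories partition the market, and we want
# P(lifetime > 5000 h) = P(A | B_X) P(B_X) + P(A | B_Y) P(B_Y).

p_bx, p_by = 0.6, 0.4                    # market shares of factories X and Y
p_a_given_bx, p_a_given_by = 0.99, 0.95  # P(A | factory), A = "works > 5000 h"

p_a = p_a_given_bx * p_bx + p_a_given_by * p_by
print(round(p_a, 3))  # 0.974
```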

## Other names

The term law of total probability is sometimes taken to mean the law of alternatives, which is a special case of the law of total probability applying to discrete random variables.[citation needed] One author uses the terminology of the "Rule of Average Conditional Probabilities",[4] while another refers to it as the "continuous law of alternatives" in the continuous case.[5] This result is given by Grimmett and Welsh[6] as the partition theorem, a name that they also give to the related law of total expectation.

## Notes

1. ^ a b Zwillinger, D., Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae, CRC Press. ISBN 1-58488-059-7. p. 31.
2. ^ Paul E. Pfeiffer (1978). Concepts of probability theory. Courier Dover Publications. pp. 47–48. ISBN 978-0-486-63677-1.
3. ^ Deborah Rumsey (2006). Probability for dummies. For Dummies. p. 58. ISBN 978-0-471-75141-0.
4. ^ Jim Pitman (1993). Probability. Springer. p. 41. ISBN 0-387-97974-3.
5. ^ Kenneth Baclawski (2008). Introduction to probability with R. CRC Press. p. 179. ISBN 978-1-4200-6521-3.
6. ^ Probability: An Introduction, by Geoffrey Grimmett and Dominic Welsh, Oxford Science Publications, 1986, Theorem 1B.
