| Property | Value |
| --- | --- |
| Parameters | ${\displaystyle \alpha >0}$ shape (real); ${\displaystyle \beta >0}$ shape (real); ${\displaystyle r>0}$ number of successes until the experiment is stopped (integer, but can be extended to real) |
| Support | ${\displaystyle k\in \{0,1,2,\ldots \}}$ |
| PMF | ${\displaystyle {\frac {\mathrm {B} (r+k,\alpha +\beta )}{\mathrm {B} (r,\alpha )}}{\frac {\Gamma (k+\beta )}{k!\;\Gamma (\beta )}}}$ |
| Mean | ${\displaystyle {\begin{cases}{\frac {r\beta }{\alpha -1}}&{\text{if}}\ \alpha >1\\\infty &{\text{otherwise}}\ \end{cases}}}$ |
| Variance | ${\displaystyle {\begin{cases}{\frac {r(\alpha +r-1)\beta (\alpha +\beta -1)}{(\alpha -2){(\alpha -1)}^{2}}}&{\text{if}}\ \alpha >2\\\infty &{\text{otherwise}}\ \end{cases}}}$ |
| Skewness | ${\displaystyle {\begin{cases}{\frac {(\alpha +2r-1)(\alpha +2\beta -1)}{(\alpha -3){\sqrt {\frac {r(\alpha +r-1)\beta (\alpha +\beta -1)}{\alpha -2}}}}}&{\text{if}}\ \alpha >3\\\infty &{\text{otherwise}}\ \end{cases}}}$ |
| MGF | does not exist |
| CF | ${\displaystyle {\frac {\Gamma (\alpha +r)\Gamma (\alpha +\beta )}{\Gamma (\alpha +\beta +r)\Gamma (\alpha )}}{}_{2}F_{1}(r,\beta ;\alpha +\beta +r;e^{it})\!}$ |

Here ${\displaystyle \Gamma }$ is the gamma function and ${\displaystyle {}_{2}F_{1}}$ is the hypergeometric function.

In probability theory, a beta negative binomial distribution is the probability distribution of a discrete random variable ${\displaystyle X}$ equal to the number of failures needed to get ${\displaystyle r}$ successes in a sequence of independent Bernoulli trials. The probability ${\displaystyle p}$ of success on each trial stays constant within any given experiment but varies across different experiments following a beta distribution. Thus the distribution is a compound probability distribution.

This distribution has also been called the inverse Markov-Pólya distribution and the generalized Waring distribution,[1] and is often abbreviated as the BNB distribution. A shifted form of the distribution has been called the beta-Pascal distribution.[1]

If the parameters of the beta distribution are ${\displaystyle \alpha }$ and ${\displaystyle \beta }$, and if

${\displaystyle X\mid p\sim \mathrm {NB} (r,p),}$

where

${\displaystyle p\sim {\textrm {B}}(\alpha ,\beta ),}$

then the marginal distribution of ${\displaystyle X}$ is a beta negative binomial distribution:

${\displaystyle X\sim \mathrm {BNB} (r,\alpha ,\beta ).}$

In the above, ${\displaystyle \mathrm {NB} (r,p)}$ is the negative binomial distribution and ${\displaystyle {\textrm {B}}(\alpha ,\beta )}$ is the beta distribution.
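The two-stage construction above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `sample_bnb`, the seed, and the parameter values are hypothetical choices, ${\displaystyle r}$ is taken to be an integer, and ${\displaystyle \alpha >1}$ is chosen so that the mean ${\displaystyle r\beta /(\alpha -1)}$ is finite and can be compared with the sample mean.

```python
import random

def sample_bnb(r, alpha, beta, rng):
    """Draw one BNB(r, alpha, beta) variate via the compound definition (integer r)."""
    # Stage 1: the success probability of this experiment is beta distributed.
    p = rng.betavariate(alpha, beta)
    # Stage 2: run Bernoulli(p) trials and count failures until r successes occur.
    failures = 0
    successes = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures

# Hypothetical parameter values, chosen with alpha > 1 so the mean exists.
rng = random.Random(0)
r, alpha, beta = 2, 6.0, 3.0
draws = [sample_bnb(r, alpha, beta, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
theoretical_mean = r * beta / (alpha - 1)
print(sample_mean, theoretical_mean)
```

The sample mean should land close to the theoretical value, although the heavy tail of the distribution makes convergence slower than for the plain negative binomial.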

## Definition and derivation

Denoting by ${\displaystyle f_{X|p}(k|r,q)}$ and ${\displaystyle f_{p}(q|\alpha ,\beta )}$ the PMF of the negative binomial distribution and the density of the beta distribution respectively, we obtain the PMF ${\displaystyle f(k|\alpha ,\beta ,r)}$ of the BNB distribution by marginalization:

${\displaystyle f(k|\alpha ,\beta ,r)=\int _{0}^{1}f_{X|p}(k|r,q)\cdot f_{p}(q|\alpha ,\beta )\mathrm {d} q=\int _{0}^{1}{\binom {k+r-1}{k}}(1-q)^{k}q^{r}\cdot {\frac {q^{\alpha -1}(1-q)^{\beta -1}}{\mathrm {B} (\alpha ,\beta )}}\mathrm {d} q={\frac {1}{\mathrm {B} (\alpha ,\beta )}}{\binom {k+r-1}{k}}\int _{0}^{1}q^{\alpha +r-1}(1-q)^{\beta +k-1}\mathrm {d} q}$

Noting that the integral evaluates to:

${\displaystyle \int _{0}^{1}q^{\alpha +r-1}(1-q)^{\beta +k-1}\mathrm {d} q={\frac {\Gamma (\alpha +r)\Gamma (\beta +k)}{\Gamma (\alpha +\beta +k+r)}}}$

we can arrive at the following formulas by relatively simple manipulations.

If ${\displaystyle r}$ is an integer, then the PMF can be written in terms of the beta function:

${\displaystyle f(k|\alpha ,\beta ,r)={\binom {r+k-1}{k}}{\frac {\mathrm {B} (\alpha +r,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}}$.

More generally, the PMF can be written

${\displaystyle f(k|\alpha ,\beta ,r)={\frac {\Gamma (r+k)}{k!\;\Gamma (r)}}{\frac {\mathrm {B} (\alpha +r,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}}$

or

${\displaystyle f(k|\alpha ,\beta ,r)={\frac {\mathrm {B} (r+k,\alpha +\beta )}{\mathrm {B} (r,\alpha )}}{\frac {\Gamma (k+\beta )}{k!\;\Gamma (\beta )}}}$.

### PMF expressed with Gamma

Using the properties of the Beta function, the PMF with integer ${\displaystyle r}$ can be rewritten as:

${\displaystyle f(k|\alpha ,\beta ,r)={\binom {r+k-1}{k}}{\frac {\Gamma (\alpha +r)\Gamma (\beta +k)\Gamma (\alpha +\beta )}{\Gamma (\alpha +r+\beta +k)\Gamma (\alpha )\Gamma (\beta )}}}$.

More generally, the PMF can be written as

${\displaystyle f(k|\alpha ,\beta ,r)={\frac {\Gamma (r+k)}{k!\;\Gamma (r)}}{\frac {\Gamma (\alpha +r)\Gamma (\beta +k)\Gamma (\alpha +\beta )}{\Gamma (\alpha +r+\beta +k)\Gamma (\alpha )\Gamma (\beta )}}}$.
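The general gamma-function form lends itself to evaluation in log space, which avoids overflow of the individual gamma terms for large ${\displaystyle k}$. The following is a minimal sketch using only the standard library; the function name `bnb_pmf` and the parameter values are illustrative assumptions, not part of any particular library.

```python
from math import lgamma, exp

def bnb_pmf(k, r, alpha, beta):
    """PMF of BNB(r, alpha, beta) at k, computed from the gamma-function form.

    Works for real r > 0; log-gamma is used to avoid overflow.
    """
    log_f = (
        lgamma(r + k) - lgamma(k + 1) - lgamma(r)          # Gamma(r+k) / (k! Gamma(r))
        + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
        - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta)
    )
    return exp(log_f)

# Sanity check: the probabilities should sum to (almost) one.
# With alpha = 6 the tail decays like k**(-7), so truncating at 5000 loses very little.
total = sum(bnb_pmf(k, 2.5, 6.0, 3.0) for k in range(5000))
print(total)
```

For small ${\displaystyle \alpha }$ the tail decays slowly, so a truncated sum of this kind would need many more terms to approach one.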

### PMF expressed with the rising Pochhammer symbol

The PMF is often also presented in terms of the rising Pochhammer symbol for integer ${\displaystyle r}$:

${\displaystyle f(k|\alpha ,\beta ,r)={\frac {r^{(k)}\alpha ^{(r)}\beta ^{(k)}}{k!(\alpha +\beta )^{(r+k)}}}}$
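As a quick worked check of this form, recall that the rising factorial is ${\displaystyle x^{(n)}=x(x+1)\cdots (x+n-1)}$ with ${\displaystyle x^{(0)}=1}$. Taking ${\displaystyle r=1}$ and ${\displaystyle k=0}$ gives

${\displaystyle f(0|\alpha ,\beta ,1)={\frac {1^{(0)}\,\alpha ^{(1)}\,\beta ^{(0)}}{0!\,(\alpha +\beta )^{(1)}}}={\frac {\alpha }{\alpha +\beta }},}$

which is the mean of the beta distribution, that is, the marginal probability that the very first trial is a success. Similarly, for ${\displaystyle r=2}$ and ${\displaystyle k=1}$,

${\displaystyle f(1|\alpha ,\beta ,2)={\frac {2\,\alpha (\alpha +1)\,\beta }{(\alpha +\beta )(\alpha +\beta +1)(\alpha +\beta +2)}}.}$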

## Properties

### Non-identifiable

The beta negative binomial is non-identifiable, which can be seen by swapping ${\displaystyle r}$ and ${\displaystyle \beta }$ in the above density or characteristic function and noting that it is unchanged. Thus estimation requires that a constraint be placed on ${\displaystyle r}$, ${\displaystyle \beta }$ or both.
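The symmetry is easiest to see by regrouping the factors of the gamma-function form of the PMF:

${\displaystyle f(k|\alpha ,\beta ,r)={\frac {1}{k!}}\cdot {\frac {\Gamma (r+k)}{\Gamma (r)}}\cdot {\frac {\Gamma (\beta +k)}{\Gamma (\beta )}}\cdot {\frac {\Gamma (\alpha +r)\,\Gamma (\alpha +\beta )}{\Gamma (\alpha )\,\Gamma (\alpha +r+\beta +k)}}}$

Exchanging ${\displaystyle r}$ and ${\displaystyle \beta }$ merely swaps the second and third factors and leaves the last factor unchanged, so ${\displaystyle f(k|\alpha ,\beta ,r)=f(k|\alpha ,r,\beta )}$ for every ${\displaystyle k}$.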

### Relation to other distributions

The beta negative binomial distribution contains the beta geometric distribution as a special case when either ${\displaystyle r=1}$ or ${\displaystyle \beta =1}$. It can therefore approximate the geometric distribution arbitrarily well. It also approximates the negative binomial distribution arbitrarily well for large ${\displaystyle \alpha }$. It can therefore approximate the Poisson distribution arbitrarily well for large ${\displaystyle \alpha }$, ${\displaystyle \beta }$ and ${\displaystyle r}$.

### Heavy tailed

By Stirling's approximation to the beta function, it can be easily shown that for large ${\displaystyle k}$

${\displaystyle f(k|\alpha ,\beta ,r)\sim {\frac {\Gamma (\alpha +r)}{\Gamma (r)\mathrm {B} (\alpha ,\beta )}}{\frac {k^{r-1}}{(\beta +k)^{r+\alpha }}}}$

which implies that the beta negative binomial distribution is heavy tailed and that moments of order greater than or equal to ${\displaystyle \alpha }$ do not exist.
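The asymptotic form can be checked numerically by comparing it with the exact PMF at increasing ${\displaystyle k}$. The sketch below is illustrative only: the helper names `log_bnb_pmf` and `log_tail_approx` and the parameter values are assumptions made for this example.

```python
from math import lgamma, log, exp

def log_bnb_pmf(k, r, alpha, beta):
    """Log of the exact BNB PMF (gamma-function form)."""
    return (
        lgamma(r + k) - lgamma(k + 1) - lgamma(r)
        + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
        - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta)
    )

def log_tail_approx(k, r, alpha, beta):
    """Log of the large-k approximation Gamma(a+r) / (Gamma(r) B(a,b)) * k^(r-1) / (b+k)^(r+a)."""
    log_beta_fn = lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta)
    return (
        lgamma(alpha + r) - lgamma(r) - log_beta_fn
        + (r - 1) * log(k) - (r + alpha) * log(beta + k)
    )

# Hypothetical parameter values; the ratio exact / approximate should tend to 1.
r, alpha, beta = 2.5, 6.0, 3.0
ratios = {
    k: exp(log_bnb_pmf(k, r, alpha, beta) - log_tail_approx(k, r, alpha, beta))
    for k in (10, 1000, 100000)
}
for k, ratio in ratios.items():
    print(k, ratio)
```

Because the approximation behaves like ${\displaystyle k^{-(\alpha +1)}}$ for large ${\displaystyle k}$, the sum of ${\displaystyle k^{m}f(k)}$ converges only when ${\displaystyle m<\alpha }$, which is consistent with the mean requiring ${\displaystyle \alpha >1}$ and the variance requiring ${\displaystyle \alpha >2}$.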

## Beta geometric distribution

The beta geometric distribution is an important special case of the beta negative binomial distribution occurring for ${\displaystyle r=1}$. In this case the PMF simplifies to

${\displaystyle f(k|\alpha ,\beta )={\frac {\mathrm {B} (\alpha +1,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}}$.

This distribution is used in some Buy Till You Die (BTYD) models.

Further, when ${\displaystyle \beta =1}$ the beta geometric reduces to the Yule–Simon distribution. However, it is more common to define the Yule–Simon distribution in terms of a shifted version of the beta geometric. In particular, if ${\displaystyle X\sim BG(\alpha ,1)}$ then ${\displaystyle X+1\sim YS(\alpha )}$.
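The reduction can be verified in one line. Since ${\displaystyle \mathrm {B} (\alpha ,1)=1/\alpha }$, setting ${\displaystyle \beta =1}$ in the beta geometric PMF gives

${\displaystyle f(k|\alpha ,1)={\frac {\mathrm {B} (\alpha +1,k+1)}{\mathrm {B} (\alpha ,1)}}=\alpha \,\mathrm {B} (\alpha +1,k+1),\qquad k=0,1,2,\ldots }$

Assuming the usual parametrization of the Yule–Simon PMF, ${\displaystyle \alpha \,\mathrm {B} (j,\alpha +1)}$ for ${\displaystyle j=1,2,\ldots }$, evaluating it at ${\displaystyle j=k+1}$ gives the same expression, because the beta function is symmetric in its arguments.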

## Beta negative binomial as a Pólya urn model

When the three parameters ${\displaystyle r,\alpha }$ and ${\displaystyle \beta }$ are positive integers, the beta negative binomial can also be motivated by an urn model, more specifically a basic Pólya urn model. Consider an urn initially containing ${\displaystyle \alpha }$ red balls (the stopping color) and ${\displaystyle \beta }$ blue balls. At each step, a ball is drawn at random from the urn and returned to it, along with one additional ball of the same color. The process is repeated until ${\displaystyle r}$ red balls have been drawn. The random variable ${\displaystyle X}$, the number of blue balls drawn, is distributed according to ${\displaystyle \mathrm {BNB} (r,\alpha ,\beta )}$. Note that at the end of the experiment the urn always contains the fixed number ${\displaystyle r+\alpha }$ of red balls and the random number ${\displaystyle X+\beta }$ of blue balls.

By the non-identifiability property, ${\displaystyle X}$ can be equivalently generated with the urn initially containing ${\displaystyle \alpha }$ red balls (the stopping color) and ${\displaystyle r}$ blue balls and stopping when ${\displaystyle \beta }$ red balls are observed.
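The urn scheme, including the swapped variant, can be sketched as a short simulation. The function name `polya_urn_bnb`, the seed, and the parameter values are hypothetical choices for illustration; both variants are compared against the mean ${\displaystyle r\beta /(\alpha -1)}$, which is symmetric in ${\displaystyle r}$ and ${\displaystyle \beta }$.

```python
import random

def polya_urn_bnb(red_start, blue_start, red_target, rng):
    """Run a Polya urn until `red_target` red balls are drawn; return blue draws."""
    red, blue = red_start, blue_start
    red_drawn = 0
    blue_drawn = 0
    while red_drawn < red_target:
        # Draw a ball uniformly at random from the current urn contents.
        if rng.random() < red / (red + blue):
            red += 1          # return the red ball plus one extra red ball
            red_drawn += 1
        else:
            blue += 1         # return the blue ball plus one extra blue ball
            blue_drawn += 1
    return blue_drawn

# Hypothetical integer parameters with alpha > 1 so that the mean is finite.
rng = random.Random(1)
r, alpha, beta = 2, 6, 3
n = 20000

# Direct scheme: alpha red, beta blue, stop after r red draws.
mean_direct = sum(polya_urn_bnb(alpha, beta, r, rng) for _ in range(n)) / n
# Swapped scheme: alpha red, r blue, stop after beta red draws.
mean_swapped = sum(polya_urn_bnb(alpha, r, beta, rng) for _ in range(n)) / n

expected_mean = r * beta / (alpha - 1)
print(mean_direct, mean_swapped, expected_mean)
```

Matching means are only a necessary condition for the two schemes to agree in distribution; a fuller check would compare the empirical frequencies of each value of ${\displaystyle X}$.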