Log-normal probability distribution

Figure: Probability density function. Identical parameter μ but differing parameters σ.
Figure: Cumulative distribution function.

| Property | Value |
|---|---|
| Notation | Lognormal(μ, σ²) |
| Parameters | μ ∈ (−∞, +∞), σ > 0 |
| Support | x ∈ (0, +∞) |
| PDF | (1 / (xσ√(2π))) exp(−(ln x − μ)² / (2σ²)) |
| CDF | (1/2) [1 + erf((ln x − μ) / (σ√2))] |
| Quantile | exp(μ + √2 σ erf⁻¹(2p − 1)) |
| Mean | exp(μ + σ²/2) |
| Median | exp(μ) |
| Mode | exp(μ − σ²) |
| Variance | [exp(σ²) − 1] exp(2μ + σ²) |
| Skewness | [exp(σ²) + 2] √(exp(σ²) − 1) |
| Ex. kurtosis | exp(4σ²) + 2 exp(3σ²) + 3 exp(2σ²) − 6 |
| Entropy | log₂(√(2π) σ e^(μ + 1/2)) |
| MGF | defined only for numbers with a non-positive real part, see text |
| CF | representation is asymptotically divergent but sufficient for numerical purposes |
| Fisher information | diag(1/σ², 2/σ²), with respect to (μ, σ) |
| Method of moments | μ = log(E[X] / √(Var[X]/E[X]² + 1)), σ = √(log(Var[X]/E[X]² + 1)) |
In probability theory, a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable X is log-normally distributed, then Y = ln(X) has a normal distribution.[1][2] Equivalently, if Y has a normal distribution, then the exponential function of Y, X = exp(Y), has a log-normal distribution. A random variable which is log-normally distributed takes only positive real values. It is a convenient and useful model for measurements in exact and engineering sciences, as well as medicine, economics and other topics (e.g., energies, concentrations, lengths, prices of financial instruments, and other metrics).
The distribution is occasionally referred to as the Galton distribution or Galton's distribution, after Francis Galton.[3] The log-normal distribution has also been associated with other names, such as McAlister, Gibrat and Cobb–Douglas.[3]
A log-normal process is the statistical realization of the multiplicative product of many independent random variables, each of which is positive. This is justified by considering the central limit theorem in the log domain (sometimes called Gibrat's law). The log-normal distribution is the maximum entropy probability distribution for a random variate X for which the mean and variance of ln(X) are specified.[4]
Definitions
Generation and parameters
Let Z be a standard normal variable, and let μ and σ be two real numbers, with σ > 0. Then, the distribution of the random variable
$$X = e^{\mu + \sigma Z}$$
is called the log-normal distribution with parameters μ and σ. These are the expected value (or mean) and standard deviation of the variable's natural logarithm, ln(X), not the expectation and standard deviation of X itself.
Relation between normal and log-normal distribution. If Y = μ + σZ is normally distributed, then X = e^Y is log-normally distributed.
This relationship is true regardless of the base of the logarithmic or exponential function: if log_a(X) is normally distributed, then so is log_b(X) for any two positive numbers a, b ≠ 1. Likewise, if e^Y is log-normally distributed, then so is a^Y, where 0 < a ≠ 1.
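In order to produce a distribution with desired mean μ_X and variance σ_X², one uses
$$\mu = \ln\left(\frac{\mu_X^2}{\sqrt{\mu_X^2 + \sigma_X^2}}\right)$$
and
$$\sigma^2 = \ln\left(1 + \frac{\sigma_X^2}{\mu_X^2}\right).$$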
Alternatively, the "multiplicative" or "geometric" parameters μ* = e^μ and σ* = e^σ can be used. They have a more direct interpretation: μ* is the median of the distribution, and σ* is useful for determining "scatter" intervals, see below.
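The construction in this section can be illustrated with a short sketch. It assumes only Python's standard library and uses the relations μ = ln(m² / √(m² + v)) and σ² = ln(1 + v/m²) for a desired mean m and variance v. The function names and the example values (mean 3, variance 4) are illustrative choices, not part of any particular library.

```python
import math
import random


def lognormal_params_from_moments(mean, variance):
    """Return (mu, sigma) of ln(X) such that X has the given mean and variance."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    sigma2 = math.log(1.0 + variance / mean**2)
    # ln(mean) - sigma2/2 equals ln(mean^2 / sqrt(mean^2 + variance))
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def sample_lognormal(mu, sigma, n, rng):
    """Draw n values X = exp(mu + sigma * Z), with Z standard normal."""
    return [math.exp(mu + sigma * rng.gauss(0.0, 1.0)) for _ in range(n)]


mu, sigma = lognormal_params_from_moments(mean=3.0, variance=4.0)
rng = random.Random(12345)          # fixed seed for reproducibility
xs = sample_lognormal(mu, sigma, 200_000, rng)
sample_mean = sum(xs) / len(xs)

mu_star = math.exp(mu)              # multiplicative parameter: the median
sigma_star = math.exp(sigma)        # multiplicative standard deviation
print(mu, sigma, mu_star, sigma_star, sample_mean)
```

The sample mean should land close to the requested mean; the standard library also offers `random.lognormvariate(mu, sigma)`, which takes the same log-scale parameters.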
Probability density function
A positive random variable X is log-normally distributed (i.e., X ~ Lognormal(μ, σ²)), if the natural logarithm of X is normally distributed with mean μ and variance σ²:
$$\ln(X) \sim \mathcal{N}(\mu, \sigma^2)$$
Let Φ and φ be respectively the cumulative probability distribution function and the probability density function of the N(0,1) distribution, then we have that[1][3]
![{\displaystyle {\begin{aligned}f_{X}(x)&={\frac {\rm {d))((\rm {d))x))\Pr(X\leq x)={\frac {\rm {d))((\rm {d))x))\Pr(\ln X\leq \ln x)={\frac {\rm {d))((\rm {d))x))\Phi \left({\frac {\ln x-\mu }{\sigma ))\right)\\[6pt]&=\varphi \left({\frac {\ln x-\mu }{\sigma ))\right){\frac {\rm {d))((\rm {d))x))\left({\frac {\ln x-\mu }{\sigma ))\right)=\varphi \left({\frac {\ln x-\mu }{\sigma ))\right){\frac {1}{\sigma x))\\[6pt]&={\frac {1}{x\sigma {\sqrt {2\pi \,))))\exp \left(-{\frac {(\ln x-\mu )^{2)){2\sigma ^{2))}\right).\end{aligned))}](https://wikimedia.org/api/rest_v1/media/math/render/svg/8543fc457ecf9cb584b82df297462c3e3191cf43)
Cumulative distribution function
The cumulative distribution function is
$$F_X(x) = \Phi\left(\frac{\ln x - \mu}{\sigma}\right)$$
where Φ is the cumulative distribution function of the standard normal distribution (i.e., N(0,1)).
This may also be expressed as follows:[1]
![{\displaystyle {\frac {1}{2))\left[1+\operatorname {erf} \left({\frac {\ln x-\mu }{\sigma {\sqrt {2))))\right)\right]={\frac {1}{2))\operatorname {erfc} \left(-{\frac {\ln x-\mu }{\sigma {\sqrt {2))))\right)}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d7373f66d2a24f5817a8bc2f2f44836941b79118)
where erfc is the complementary error function.
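As a minimal sketch of the density and distribution function given above, the following uses only `math.erfc` from Python's standard library. The function names and the midpoint-rule check are illustrative assumptions, not a reference implementation.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Density 1/(x sigma sqrt(2 pi)) * exp(-(ln x - mu)^2 / (2 sigma^2)) for x > 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x, mu, sigma):
    """CDF written with the complementary error function."""
    if x <= 0:
        return 0.0
    return 0.5 * math.erfc(-(math.log(x) - mu) / (sigma * math.sqrt(2.0)))


def integrate_pdf(a, b, mu, sigma, steps=20_000):
    """Midpoint-rule integral of the density over [a, b], used as a sanity check."""
    h = (b - a) / steps
    return h * sum(lognormal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(steps))


mu, sigma = 0.5, 0.8
print(lognormal_cdf(3.0, mu, sigma), integrate_pdf(0.0, 3.0, mu, sigma))
```

The two printed numbers should agree to several decimal places, since the CDF is the integral of the density from 0 to x.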
Multivariate log-normal
If X ~ N(μ, Σ) is a multivariate normal distribution, then Y_i = exp(X_i) has a multivariate log-normal distribution.[5][6] The exponential is applied elementwise to the random vector X. The mean of Y is
![\operatorname {E} [{\boldsymbol {Y))]_{i}=e^{\mu _{i}+{\frac {1}{2))\Sigma _{ii)),](https://wikimedia.org/api/rest_v1/media/math/render/svg/488f8b7b6e5331b3d4b257c87b40752a01ee6293)
and its covariance matrix is
![\operatorname {Var} [{\boldsymbol {Y))]_{ij}=e^{\mu _{i}+\mu _{j}+{\frac {1}{2))(\Sigma _{ii}+\Sigma _{jj})}(e^{\Sigma _{ij))-1).](https://wikimedia.org/api/rest_v1/media/math/render/svg/11b3d9175a3f442f40eb4687f58014c3efdfa7d0)
Since the multivariate log-normal distribution is not widely used, the rest of this entry only deals with the univariate distribution.
Characteristic function and moment generating function
All moments of the log-normal distribution exist and
![{\displaystyle \operatorname {E} [X^{n}]=e^{n\mu +n^{2}\sigma ^{2}/2))](https://wikimedia.org/api/rest_v1/media/math/render/svg/1ec49bbb5852b6e735f0a6a49468771db326b7bf)
This can be derived by letting z = (ln(x) − (μ + nσ²))/σ within the integral. However, the log-normal distribution is not determined by its moments.[7] This implies that it cannot have a defined moment generating function in a neighborhood of zero.[8] Indeed, the expected value E[e^(tX)] is not defined for any positive value of the argument t, since the defining integral diverges.
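The moment formula can be checked by writing the substitution out. The following is a sketch of one standard route: it works with y = ln x, completes the square in the exponent, and then substitutes z; the symbols y and z are introduced here for illustration.

$$
\begin{aligned}
\operatorname{E}[X^n] &= \int_{-\infty}^{\infty} e^{ny}\,\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right)dy \\
&= e^{n\mu + n^2\sigma^2/2}\int_{-\infty}^{\infty}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(y-\mu-n\sigma^2)^2}{2\sigma^2}\right)dy \\
&= e^{n\mu + n^2\sigma^2/2}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-z^2/2}\,dz, \qquad z = \frac{y-(\mu+n\sigma^2)}{\sigma}, \\
&= e^{n\mu + n^2\sigma^2/2}.
\end{aligned}
$$

The second line uses the identity ny − (y − μ)²/(2σ²) = nμ + n²σ²/2 − (y − μ − nσ²)²/(2σ²).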
The characteristic function E[e^(itX)] is defined for real values of t, but is not defined for any complex value of t that has a negative imaginary part, and hence the characteristic function is not analytic at the origin. Consequently, the characteristic function of the log-normal distribution cannot be represented as an infinite convergent series.[9] In particular, its Taylor formal series diverges:
$$\sum_{n=0}^{\infty} \frac{(it)^n}{n!}\, e^{n\mu + n^2\sigma^2/2}$$
However, a number of alternative divergent series representations have been obtained.[9][10][11][12]
A closed-form formula for the characteristic function φ(t) with t in the domain of convergence is not known. A relatively simple approximating formula is available in closed form, and is given by[13]
$$\varphi(t) \approx \frac{\exp\left(-\dfrac{W^2(-it\sigma^2 e^{\mu}) + 2W(-it\sigma^2 e^{\mu})}{2\sigma^2}\right)}{\sqrt{1 + W(-it\sigma^2 e^{\mu})}}$$
where W is the Lambert W function. This approximation is derived via an asymptotic method, but it stays sharp all over the domain of convergence of φ.
Properties
Figure: (a) The probability that a function of a log-normal variable lies in a given domain is computed by transforming to the corresponding normal variable, then integrating its density over the transformed domain (blue regions), using the numerical method of ray-tracing.[14] (b, c) The pdf and cdf of a function of the log-normal variable can also be computed in this way.
Probability in different domains
The probability content of a log-normal distribution in any arbitrary domain can be computed to desired precision by first transforming the variable to normal, then numerically integrating using the ray-trace method.[14] (Matlab code)
Probabilities of functions of a log-normal variable
Since the probability of a log-normal can be computed in any domain, this means that the cdf (and consequently pdf and inverse cdf) of any function of a log-normal variable can also be computed.[14] (Matlab code)
Geometric or multiplicative moments
The geometric or multiplicative mean of the log-normal distribution is GM[X] = e^μ = μ*. It equals the median. The geometric or multiplicative standard deviation is GSD[X] = e^σ = σ*.[15][16]
By analogy with the arithmetic statistics, one can define a geometric variance, GVar[X] = e^(σ²). A geometric coefficient of variation,[15] GCV[X] = e^σ − 1, has also been proposed. This term was intended to be analogous to the coefficient of variation, for describing multiplicative variation in log-normal data, but this definition of GCV has no theoretical basis as an estimate of CV itself (see also Coefficient of variation).
Note that the geometric mean is smaller than the arithmetic mean. This is due to the AM–GM inequality and is a consequence of the logarithm being a concave function. In fact,
$$\operatorname{E}[X] = e^{\mu + \frac{1}{2}\sigma^2} = e^{\mu} \cdot \sqrt{e^{\sigma^2}} = \operatorname{GM}[X] \cdot \sqrt{\operatorname{GVar}[X]}.$$
[17]
In finance, the term e^(−σ²/2) is sometimes interpreted as a convexity correction. From the point of view of stochastic calculus, this is the same correction term as in Itō's lemma for geometric Brownian motion.
Arithmetic moments
For any real or complex number n, the n-th moment of a log-normally distributed variable X is given by[3]
![{\displaystyle \operatorname {E} [X^{n}]=e^{n\mu +{\frac {1}{2))n^{2}\sigma ^{2)).}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a4b6efe7347f26a8054654edcdfb03eb8b28bbf1)
Specifically, the arithmetic mean, expected square, arithmetic variance, and arithmetic standard deviation of a log-normally distributed variable X are respectively given by:[1]
![{\displaystyle {\begin{aligned}\operatorname {E} [X]&=e^{\mu +{\tfrac {1}{2))\sigma ^{2)),\\[4pt]\operatorname {E} [X^{2}]&=e^{2\mu +2\sigma ^{2)),\\[4pt]\operatorname {Var} [X]&=\operatorname {E} [X^{2}]-\operatorname {E} [X]^{2}=(\operatorname {E} [X])^{2}(e^{\sigma ^{2))-1)=e^{2\mu +\sigma ^{2))(e^{\sigma ^{2))-1),\\[4pt]\operatorname {SD} [X]&={\sqrt {\operatorname {Var} [X]))=\operatorname {E} [X]{\sqrt {e^{\sigma ^{2))-1))=e^{\mu +{\tfrac {1}{2))\sigma ^{2)){\sqrt {e^{\sigma ^{2))-1)),\end{aligned))}](https://wikimedia.org/api/rest_v1/media/math/render/svg/2b59e2bead4a03f70fcf34a610106ae8704959a6)
The arithmetic coefficient of variation CV[X] is the ratio SD[X] / E[X]. For a log-normal distribution it is equal to[2]
![{\displaystyle \operatorname {CV} [X]={\sqrt {e^{\sigma ^{2))-1)).}](https://wikimedia.org/api/rest_v1/media/math/render/svg/6cad386f192fe53b9e0525951f5423f46e03e36d)
This estimate is sometimes referred to as the "geometric CV" (GCV),[18][19] due to its use of the geometric variance. Contrary to the arithmetic standard deviation, the arithmetic coefficient of variation is independent of the arithmetic mean.
The parameters μ and σ can be obtained, if the arithmetic mean and the arithmetic variance are known:
![{\displaystyle {\begin{aligned}\mu &=\ln \left({\frac {\operatorname {E} [X]^{2)){\sqrt {\operatorname {E} [X^{2}]))}\right)=\ln \left({\frac {\operatorname {E} [X]^{2)){\sqrt {\operatorname {Var} [X]+\operatorname {E} [X]^{2))))\right),\\[4pt]\sigma ^{2}&=\ln \left({\frac {\operatorname {E} [X^{2}]}{\operatorname {E} [X]^{2))}\right)=\ln \left(1+{\frac {\operatorname {Var} [X]}{\operatorname {E} [X]^{2))}\right).\end{aligned))}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ede6a785b6ed56d35a478e9927963cea65ba96e4)
A probability distribution is not uniquely determined by the moments E[X^n] = e^(nμ + n²σ²/2) for n ≥ 1. That is, there exist other distributions with the same set of moments.[3] In fact, there is a whole family of distributions with the same moments as the log-normal distribution.[citation needed]
Mode, median, quantiles
The mode is the point of global maximum of the probability density function. In particular, by solving the equation (ln f)′ = 0, we get that:
![{\displaystyle \operatorname {Mode} [X]=e^{\mu -\sigma ^{2)).}](https://wikimedia.org/api/rest_v1/media/math/render/svg/696ae3ee691abe8666911db6b83228e86d685f85)
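Since the log-transformed variable Y = ln X has a normal distribution, and quantiles are preserved under monotonic transformations, the quantiles of X are
$$q_X(\alpha) = \exp\left[\mu + \sigma\, q_\Phi(\alpha)\right] = \mu^* (\sigma^*)^{q_\Phi(\alpha)},$$
where q_Φ(α) is the quantile of the standard normal distribution.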
Specifically, the median of a log-normal distribution is equal to its multiplicative mean,[20]
![{\displaystyle \operatorname {Med} [X]=e^{\mu }=\mu ^{*}.}](https://wikimedia.org/api/rest_v1/media/math/render/svg/20824d0df6479d5d425debc8b3646f5ebc87557c)
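The location measures of this section can be sketched in a few lines. The example assumes Python's `statistics.NormalDist` for the standard normal quantile; the function names and the parameter values μ = 1, σ = 0.5 are illustrative only.

```python
import math
from statistics import NormalDist


def lognormal_quantile(p, mu, sigma):
    """Quantile exp(mu + sigma * q), with q the standard normal quantile of p."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


def lognormal_mode(mu, sigma):
    """Mode exp(mu - sigma^2), the maximum of the density."""
    return math.exp(mu - sigma**2)


def lognormal_median(mu):
    """Median exp(mu), which equals the multiplicative mean."""
    return math.exp(mu)


mu, sigma = 1.0, 0.5
q90 = lognormal_quantile(0.9, mu, sigma)
print(lognormal_mode(mu, sigma), lognormal_median(mu), q90)
```

For any σ > 0 the mode lies below the median, which in turn lies below the arithmetic mean exp(μ + σ²/2).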
Partial expectation
The partial expectation of a random variable X with respect to a threshold k is defined as
$$g(k) = \int_k^{\infty} x f_X(x)\,dx.$$
Alternatively, by using the definition of conditional expectation, it can be written as g(k) = E[X | X > k] P(X > k). For a log-normal random variable, the partial expectation is given by:
$$g(k) = \int_k^{\infty} x f_X(x)\,dx = e^{\mu + \frac{1}{2}\sigma^2}\, \Phi\left(\frac{\mu + \sigma^2 - \ln k}{\sigma}\right)$$
where Φ is the normal cumulative distribution function. The derivation of the formula is provided in the Talk page. The partial expectation formula has applications in insurance and economics; it is used in solving the partial differential equation leading to the Black–Scholes formula.
Conditional expectation
The conditional expectation of a log-normal random variable X, with respect to a threshold k, is its partial expectation divided by the cumulative probability of being in that range:
![{\displaystyle {\begin{aligned}E[X\mid X<k]&=e^{\mu +{\frac {\sigma ^{2)){2))}\cdot {\frac {\Phi \left[{\frac {\ln(k)-\mu -\sigma ^{2)){\sigma ))\right]}{\Phi \left[{\frac {\ln(k)-\mu }{\sigma ))\right]))\\[8pt]E[X\mid X\geqslant k]&=e^{\mu +{\frac {\sigma ^{2)){2))}\cdot {\frac {\Phi \left[{\frac {\mu +\sigma ^{2}-\ln(k)}{\sigma ))\right]}{1-\Phi \left[{\frac {\ln(k)-\mu }{\sigma ))\right]))\\[8pt]E[X\mid X\in [k_{1},k_{2}]]&=e^{\mu +{\frac {\sigma ^{2)){2))}\cdot {\frac {\Phi \left[{\frac {\ln(k_{2})-\mu -\sigma ^{2)){\sigma ))\right]-\Phi \left[{\frac {\ln(k_{1})-\mu -\sigma ^{2)){\sigma ))\right]}{\Phi \left[{\frac {\ln(k_{2})-\mu }{\sigma ))\right]-\Phi \left[{\frac {\ln(k_{1})-\mu }{\sigma ))\right]))\end{aligned))}](https://wikimedia.org/api/rest_v1/media/math/render/svg/183bad844b619e2b3e49c9a51d7021120b124865)
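The conditional expectations above can be evaluated directly. The following sketch assumes Python's `statistics.NormalDist` for Φ; the function names and the values μ = 0, σ = 1, k = 2 are illustrative. The final lines check the law of total expectation, which the two one-sided formulas must satisfy.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def partial_expectation(k, mu, sigma):
    """Integral of x f_X(x) over [k, infinity) for a log-normal X."""
    return math.exp(mu + 0.5 * sigma**2) * PHI((mu + sigma**2 - math.log(k)) / sigma)


def conditional_expectation_above(k, mu, sigma):
    """E[X | X >= k]: partial expectation divided by P(X >= k)."""
    return partial_expectation(k, mu, sigma) / (1.0 - PHI((math.log(k) - mu) / sigma))


def conditional_expectation_below(k, mu, sigma):
    """E[X | X < k]."""
    mean = math.exp(mu + 0.5 * sigma**2)
    num = PHI((math.log(k) - mu - sigma**2) / sigma)
    return mean * num / PHI((math.log(k) - mu) / sigma)


mu, sigma, k = 0.0, 1.0, 2.0
p_below = PHI((math.log(k) - mu) / sigma)
above = conditional_expectation_above(k, mu, sigma)
below = conditional_expectation_below(k, mu, sigma)
# Law of total expectation: the weighted parts recombine to E[X].
total = below * p_below + above * (1.0 - p_below)
print(below, above, total, math.exp(mu + 0.5 * sigma**2))
```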
Alternative parameterizations
In addition to the characterization by μ, σ or μ*, σ*, there are multiple ways in which the log-normal distribution can be parameterized. ProbOnto, the knowledge base and ontology of probability distributions,[21][22] lists seven such forms:
Overview of parameterizations of the log-normal distributions.
- LogNormal1(μ,σ) with mean, μ, and standard deviation, σ, both on the log-scale [23]
![{\displaystyle P(x;{\boldsymbol {\mu )),{\boldsymbol {\sigma )))={\frac {1}{x\sigma {\sqrt {2\pi ))))\exp \left[-{\frac {(\ln x-\mu )^{2)){2\sigma ^{2))}\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d254929914b40b8fa2329e6a02fa53353ed7fa07)
- LogNormal2(μ,υ) with mean, μ, and variance, υ, both on the log-scale
![{\displaystyle P(x;{\boldsymbol {\mu )),{\boldsymbol {v)))={\frac {1}{x{\sqrt {v)){\sqrt {2\pi ))))\exp \left[-{\frac {(\ln x-\mu )^{2)){2v))\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/47756e2ba5dcec56108d985f54ced5802726cb2f)
- LogNormal3(m,σ) with median, m, on the natural scale and standard deviation, σ, on the log-scale[23]
![{\displaystyle P(x;{\boldsymbol {m)),{\boldsymbol {\sigma )))={\frac {1}{x\sigma {\sqrt {2\pi ))))\exp \left[-{\frac {\ln ^{2}(x/m)}{2\sigma ^{2))}\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/f81ab16e30f347597540cda86ccb2702b0be5f85)
- LogNormal4(m,cv) with median, m, and coefficient of variation, cv, both on the natural scale
![{\displaystyle P(x;{\boldsymbol {m)),{\boldsymbol {cv)))={\frac {1}{x{\sqrt {\ln(cv^{2}+1))){\sqrt {2\pi ))))\exp \left[-{\frac {\ln ^{2}(x/m)}{2\ln(cv^{2}+1)))\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/34339307b7039935ae1071b1fb0ca1a04b51e0a2)
- LogNormal5(μ,τ) with mean, μ, and precision, τ, both on the log-scale[24]
![{\displaystyle P(x;{\boldsymbol {\mu )),{\boldsymbol {\tau )))={\sqrt {\frac {\tau }{2\pi ))}{\frac {1}{x))\exp \left[-{\frac {\tau }{2))(\ln x-\mu )^{2}\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/9ded50fe9521e9c21996167b6261b45fa2270849)
- LogNormal6(m,σg) with median, m, and geometric standard deviation, σg, both on the natural scale[25]
![{\displaystyle P(x;{\boldsymbol {m)),{\boldsymbol {\sigma _{g))})={\frac {1}{x\ln(\sigma _{g}){\sqrt {2\pi ))))\exp \left[-{\frac {\ln ^{2}(x/m)}{2\ln ^{2}(\sigma _{g})))\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/e5d63692975723d89e040953aefd71ec18822b86)
- LogNormal7(μN,σN) with mean, μN, and standard deviation, σN, both on the natural scale[26]
![{\displaystyle P(x;{\boldsymbol {\mu _{N))},{\boldsymbol {\sigma _{N))})={\frac {1}{x{\sqrt {2\pi \ln \left(1+\sigma _{N}^{2}/\mu _{N}^{2}\right)))))\exp \left(-{\frac ((\Big [}\ln x-\ln {\frac {\mu _{N)){\sqrt {1+\sigma _{N}^{2}/\mu _{N}^{2)))){\Big ]}^{2)){2\ln(1+\sigma _{N}^{2}/\mu _{N}^{2})))\right)}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c50f543829dadcdbc41b007ca74be9029c563e56)
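To make the relation between these forms concrete, the following sketch maps LogNormal1(μ, σ) to the parameters of the other six forms, using relations read off from the densities above (for instance cv = √(e^(σ²) − 1) and τ = 1/σ²). The function names and the dictionary layout are illustrative assumptions and are not taken from ProbOnto.

```python
import math


def convert_from_ln1(mu, sigma):
    """Map LogNormal1(mu, sigma) to the parameters of forms 2 to 7."""
    median = math.exp(mu)
    cv = math.sqrt(math.exp(sigma**2) - 1.0)
    mean_nat = math.exp(mu + 0.5 * sigma**2)
    return {
        "LN2": (mu, sigma**2),             # log-scale mean, log-scale variance
        "LN3": (median, sigma),            # median, log-scale standard deviation
        "LN4": (median, cv),               # median, coefficient of variation
        "LN5": (mu, 1.0 / sigma**2),       # log-scale mean, precision
        "LN6": (median, math.exp(sigma)),  # median, geometric standard deviation
        "LN7": (mean_nat, mean_nat * cv),  # natural-scale mean and standard deviation
    }


def pdf_ln1(x, mu, sigma):
    """Density in the LogNormal1 form."""
    return math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma**2)) / (
        x * sigma * math.sqrt(2 * math.pi)
    )


def pdf_ln7(x, mean_nat, sd_nat):
    """Density in the LogNormal7 form."""
    ratio = 1.0 + sd_nat**2 / mean_nat**2
    v = math.log(ratio)
    loc = math.log(mean_nat / math.sqrt(ratio))
    return math.exp(-(math.log(x) - loc) ** 2 / (2 * v)) / (x * math.sqrt(2 * math.pi * v))


forms = convert_from_ln1(mu=0.2, sigma=0.6)
print(forms)
print(pdf_ln1(1.5, 0.2, 0.6), pdf_ln7(1.5, *forms["LN7"]))
```

Both density evaluations should agree, since the two forms describe the same distribution.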
Examples for re-parameterization
Consider the situation when one would like to run a model using two different optimal design tools, for example PFIM[27] and PopED.[28] The former supports the LN2 parameterization and the latter the LN7 parameterization. Re-parameterization is therefore required; otherwise the two tools would produce different results.
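For the transition LN2(μ, v) → LN7(μ_N, σ_N) the following formulas hold: μ_N = exp(μ + v/2) and σ_N = exp(μ + v/2) √(exp(v) − 1).
For the transition LN7(μ_N, σ_N) → LN2(μ, v) the following formulas hold: μ = ln(μ_N / √(1 + σ_N²/μ_N²)) and v = ln(1 + σ_N²/μ_N²).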
All remaining re-parameterisation formulas can be found in the specification document on the project website.[29]
Multiple, reciprocal, power
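- Multiplication by a constant: If X ~ Lognormal(μ, σ²) then aX ~ Lognormal(μ + ln a, σ²) for a > 0.
- Reciprocal: If X ~ Lognormal(μ, σ²) then 1/X ~ Lognormal(−μ, σ²).
- Power: If X ~ Lognormal(μ, σ²) then X^a ~ Lognormal(aμ, a²σ²) for a ≠ 0.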
Multiplication and division of independent, log-normal random variables
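If two independent, log-normal variables X_1 and X_2 are multiplied [divided], the product [ratio] is again log-normal, with parameters μ = μ_1 + μ_2 [μ = μ_1 − μ_2] and σ, where σ² = σ_1² + σ_2². This is easily generalized to the product of n such variables.
More generally, if X_j ~ Lognormal(μ_j, σ_j²) are n independent, log-normally distributed variables, then
$$Y = \prod_{j=1}^{n} X_j \sim \operatorname{Lognormal}\left(\sum_{j=1}^{n} \mu_j,\ \sum_{j=1}^{n} \sigma_j^2\right).$$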
Multiplicative central limit theorem
The geometric or multiplicative mean of n independent, identically distributed, positive random variables X_i shows, for n → ∞, approximately a log-normal distribution with parameters μ = E[ln(X_i)] and σ² = var[ln(X_i)]/n, assuming σ² is finite.
In fact, the random variables do not have to be identically distributed. It is enough for the distributions of ln(X_i) to all have finite variance and satisfy the other conditions of any of the many variants of the central limit theorem.
This is commonly known as Gibrat's law.
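A small simulation can illustrate the multiplicative central limit theorem. The sketch below is an assumption-laden example rather than a proof: it takes the X_i to be uniform on (0, 1], for which E[ln X_i] = −1 and var[ln X_i] = 1, so the logarithm of the geometric mean of n draws should have mean close to −1 and variance close to 1/n. The function names, seed, and sample sizes are arbitrary.

```python
import math
import random


def geometric_mean(values):
    """Geometric mean computed through the mean of logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def simulate_log_geometric_means(n, replicates, rng):
    """Return ln(geometric mean) of n uniform draws, repeated `replicates` times."""
    out = []
    for _ in range(replicates):
        # 1 - random() lies in (0, 1], which avoids log(0).
        draws = [1.0 - rng.random() for _ in range(n)]
        out.append(math.log(geometric_mean(draws)))
    return out


rng = random.Random(2024)
n = 50
logs = simulate_log_geometric_means(n, 4000, rng)
mean_log = sum(logs) / len(logs)
var_log = sum((v - mean_log) ** 2 for v in logs) / (len(logs) - 1)
print(mean_log, var_log, 1.0 / n)
```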
Other
A set of data that arises from the log-normal distribution has a symmetric Lorenz curve (see also Lorenz asymmetry coefficient).[30]
The harmonic H, geometric G and arithmetic A means of this distribution are related;[31] such relation is given by
$$H = \frac{G^2}{A}.$$
Log-normal distributions are infinitely divisible,[32] but they are not stable distributions, which are easy to draw samples from.[33]
Statistical inference
Estimation of parameters
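For determining the maximum likelihood estimators of the log-normal distribution parameters μ and σ, we can use the same procedure as for the normal distribution. Note that
$$L(\mu, \sigma) = \prod_{i=1}^{n} \frac{1}{x_i}\, \varphi_{\mu,\sigma}(\ln x_i),$$
where φ_{μ,σ} is the density function of the normal distribution N(μ, σ²). Therefore, the log-likelihood function is
$$\ell(\mu, \sigma \mid x_1, x_2, \ldots, x_n) = -\sum_{i} \ln x_i + \ell_N(\mu, \sigma \mid \ln x_1, \ln x_2, \ldots, \ln x_n).$$
Since the first term is constant with regard to μ and σ, both logarithmic likelihood functions, ℓ and ℓ_N, reach their maximum with the same μ and σ. Hence, the maximum likelihood estimators are identical to those for a normal distribution for the observations ln x_1, ln x_2, …, ln x_n,
$$\widehat{\mu} = \frac{\sum_{i} \ln x_i}{n}, \qquad \widehat{\sigma}^2 = \frac{\sum_{i} \left(\ln x_i - \widehat{\mu}\right)^2}{n}.$$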
For finite n, the estimator for μ is unbiased, but the one for σ is biased. As for the normal distribution, an unbiased estimator for σ² can be obtained by replacing the denominator n by n−1 in the equation for σ̂².
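The estimators described above amount to computing the sample mean and variance of the logged observations. A minimal sketch in standard-library Python follows; the function name and the three-point data set are illustrative only.

```python
import math


def fit_lognormal_mle(xs):
    """Return (mu_hat, sigma2_mle, sigma2_unbiased) from positive observations."""
    if len(xs) < 2 or any(x <= 0 for x in xs):
        raise ValueError("need at least two positive observations")
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu_hat = sum(logs) / n
    ss = sum((y - mu_hat) ** 2 for y in logs)
    # Denominator n gives the maximum likelihood estimate;
    # denominator n - 1 gives the unbiased estimate of sigma^2.
    return mu_hat, ss / n, ss / (n - 1)


data = [math.exp(v) for v in (0.0, 1.0, 2.0)]  # logs are 0, 1, 2
mu_hat, s2_mle, s2_unbiased = fit_lognormal_mle(data)
print(mu_hat, s2_mle, s2_unbiased)
```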
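When the individual values x_1, x_2, …, x_n are not available, but the sample's mean x̄ and standard deviation s are, then the corresponding parameters are determined by the following formulas, obtained from solving the equations for the expectation E[X] and variance Var[X] for μ and σ:
$$\mu = \ln\left(\frac{\bar{x}}{\sqrt{1 + s^2/\bar{x}^2}}\right), \qquad \sigma^2 = \ln\left(1 + \frac{s^2}{\bar{x}^2}\right).$$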
Statistics
The most efficient way to analyze log-normally distributed data consists of applying the well-known methods based on the normal distribution to logarithmically transformed data and then to back-transform results if appropriate.
Scatter intervals
A basic example is given by scatter intervals: For the normal distribution, the interval [μ − σ, μ + σ] contains approximately two thirds (68%) of the probability (or of a large sample), and [μ − 2σ, μ + 2σ] contains 95%. Therefore, for a log-normal distribution,
![{\displaystyle [\mu ^{*}/\sigma ^{*},\mu ^{*}\cdot \sigma ^{*}]=[\mu ^{*}{}^{\times }\!\!/\sigma ^{*}]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/90bfe33d3e1ec78fd21196427394f5f4fe5e1836)
contains 2/3, and
![{\displaystyle [\mu ^{*}/(\sigma ^{*})^{2},\mu ^{*}\cdot (\sigma ^{*})^{2}]=[\mu ^{*}{}^{\times }\!\!/(\sigma ^{*})^{2}]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/721c476ec6cdb74bed626ea73e2e5f44bff32d84)
contains 95% of the probability. With estimated parameters, approximately the same percentages of the data should be contained in these intervals.
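The coverage of these multiplicative intervals can be confirmed with the exact CDF. The sketch below assumes μ* = e^μ and σ* = e^σ and uses Python's `statistics.NormalDist`; the function names and the parameter values are illustrative.

```python
import math
from statistics import NormalDist


def scatter_interval(mu, sigma, k=1):
    """Interval [mu* / sigma*^k, mu* * sigma*^k] with mu* = e^mu, sigma* = e^sigma."""
    mu_star = math.exp(mu)
    sigma_star = math.exp(sigma)
    return mu_star / sigma_star**k, mu_star * sigma_star**k


def coverage(lo, hi, mu, sigma):
    """Exact probability that a log-normal variable falls in [lo, hi]."""
    nd = NormalDist(mu, sigma)
    return nd.cdf(math.log(hi)) - nd.cdf(math.log(lo))


mu, sigma = 0.3, 0.7
one_sigma = coverage(*scatter_interval(mu, sigma, 1), mu, sigma)
two_sigma = coverage(*scatter_interval(mu, sigma, 2), mu, sigma)
print(one_sigma, two_sigma)
```

The two values match the familiar normal coverages of roughly 68% and 95%, independently of the chosen μ and σ.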
Confidence interval for μ*
Using the same back-transformation principle, note that a confidence interval for μ is [μ̂ ± q · se], where se = σ̂/√n is the standard error and q is the 97.5% quantile of a t distribution with n − 1 degrees of freedom. Back-transformation leads to a confidence interval for μ* (the median),
![{\displaystyle [{\widehat {\mu ))^{*}{}^{\times }\!\!/(\operatorname {sem} ^{*})^{q}]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/7b9c1579089d540825002f6a247b9991d2d87936)
with sem* = (σ̂*)^(1/√n).
Extremal principle of entropy to fix the free parameter σ
In applications, σ is a parameter to be determined. For growing processes balanced by production and dissipation, the use of an extremal principle of Shannon entropy shows that[41]
$$\sigma = \frac{1}{\sqrt{6}}$$
This value can then be used to give some scaling relation between the inflexion point and maximum point of the log-normal distribution.[41] This relationship is determined by the base of natural logarithm, e = 2.718…, and exhibits some geometrical similarity to the minimal surface energy principle.
These scaling relations are useful for predicting a number of growth processes (epidemic spreading, droplet splashing, population growth, swirling rate of the bathtub vortex, distribution of language characters, velocity profile of turbulences, etc.).
For example, the log-normal function with such σ fits well with the size of secondarily produced droplets during droplet impact[42] and the spreading of an epidemic disease.[43]
The value σ = 1/√6 is used to provide a probabilistic solution for the Drake equation.[44]