In mathematics and statistics, the arithmetic mean (/ˌærɪθˈmɛtɪk ˈmiːn/ arr-ith-MET-ik), arithmetic average, or just the mean or average (when the context is clear) is the sum of a collection of numbers divided by the count of numbers in the collection.^[1] The collection is often a set of results from an experiment, an observational study, or a survey. The term "arithmetic mean" is preferred in some mathematics and statistics contexts because it helps distinguish it from other types of means, such as the geometric and harmonic means.

In addition to mathematics and statistics, the arithmetic mean is frequently used in economics, anthropology, history, and almost every academic field to some extent. For example, per capita income is the arithmetic average income of a nation's population.

While the arithmetic mean is often used to report central tendencies, it is not a robust statistic: it is greatly influenced by outliers (values much larger or smaller than most others). For skewed distributions, such as the distribution of income for which a few people's incomes are substantially higher than most people's, the arithmetic mean may not coincide with one's notion of "middle". In that case, robust statistics, such as the median, may provide a better description of central tendency.
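The effect of a single outlier can be illustrated with a minimal Python sketch. The income figures are invented for illustration only; `mean` and `median` come from the standard-library `statistics` module.

```python
from statistics import mean, median

# Hypothetical annual incomes (arbitrary currency units); not real data.
incomes = [30_000, 32_000, 35_000, 38_000, 40_000]

# The same sample with one very high income added as an outlier.
incomes_with_outlier = incomes + [1_000_000]

mean_before = mean(incomes)
median_before = median(incomes)
mean_after = mean(incomes_with_outlier)
median_after = median(incomes_with_outlier)

print("without outlier:", mean_before, median_before)
print("with outlier:   ", mean_after, median_after)
```

In this made-up sample the single large value pulls the mean far above every other income, while the median moves only slightly.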

Definition

The arithmetic mean of a set of observed data is equal to the sum of the numerical values of each observation, divided by the total number of observations. Symbolically, for a data set consisting of the values $x_{1},\dots ,x_{n}$, the arithmetic mean is defined by the formula:

$${\bar {x}}={\frac {1}{n}}\left(\sum _{i=1}^{n}x_{i}\right)={\frac {x_{1}+x_{2}+\dots +x_{n}}{n}}$$

^[2]

(For an explanation of the summation operator, see summation.)

For example, if the monthly salaries of $10$ employees are $\{2500,2700,2400,2300,2550,2650,2750,2450,2600,2400\}$, then the arithmetic mean is:

$${\frac {2500+2700+2400+2300+2550+2650+2750+2450+2600+2400}{10}}=2530$$
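The salary calculation can be reproduced with a short Python sketch. The helper name `arithmetic_mean` is chosen here for illustration; the standard-library function `statistics.mean` is used as a cross-check.

```python
from statistics import mean

# Monthly salaries of the 10 employees from the example above.
salaries = [2500, 2700, 2400, 2300, 2550, 2650, 2750, 2450, 2600, 2400]


def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty collection is undefined")
    return sum(values) / len(values)


print(arithmetic_mean(salaries))  # 2530.0
print(mean(salaries))             # standard-library cross-check
```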

If the data set is a statistical population (i.e., consists of every possible observation and not just a subset of them), then the mean of that population is called the population mean and denoted by the Greek letter $\mu$. If the data set is a statistical sample (a subset of the population), it is called the sample mean (which for a data set $X$ is denoted as ${\overline {X}}$).

The arithmetic mean can be similarly defined for vectors in multiple dimensions, not only scalar values; this is often referred to as a centroid. More generally, because the arithmetic mean is a convex combination (meaning its coefficients sum to $1$), it can be defined on a convex space, not only a vector space.
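For vectors, the mean can be taken one coordinate at a time. The following is a minimal sketch in Python; the function name `centroid` and the example triangle are assumptions made for illustration.

```python
def centroid(points):
    """Component-wise arithmetic mean of equal-length coordinate tuples."""
    if not points:
        raise ValueError("the centroid of an empty collection is undefined")
    n = len(points)
    # zip(*points) groups all first coordinates together, then all second ones, and so on.
    return tuple(sum(coords) / n for coords in zip(*points))


# Vertices of an illustrative triangle in the plane.
triangle = [(0.0, 0.0), (6.0, 0.0), (0.0, 3.0)]
print(centroid(triangle))  # (2.0, 1.0)
```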

Motivating properties

The arithmetic mean has several properties that make it interesting, especially as a measure of central tendency. These include:

If numbers $x_{1},\dotsc ,x_{n}$ have mean ${\bar {x}}$, then $(x_{1}-{\bar {x}})+\dotsb +(x_{n}-{\bar {x}})=0$. Since $x_{i}-{\bar {x}}$ is the signed distance from a given number to the mean, one way to interpret this property is by saying that the numbers to the left of the mean are balanced by the numbers to the right. The mean is the only number for which the residuals (deviations from the estimate) sum to zero. This can also be interpreted as saying that the mean is translationally invariant in the sense that for any real number $a$, ${\overline {x+a}}={\bar {x}}+a$.
If it is required to use a single number as a "typical" value for a set of known numbers $x_{1},\dotsc ,x_{n}$, then the arithmetic mean of the numbers does this best since it minimizes the sum of squared deviations from the typical value: the sum of $(x_{i}-{\bar {x}})^{2}$. The sample mean is also the best single predictor because it has the lowest root mean squared error.^[3] If the arithmetic mean of a population of numbers is desired, then an unbiased estimate of it is the arithmetic mean of a sample drawn from the population.

The arithmetic mean is independent of the scale of the units of measurement, in the sense that ${\text{avg}}(ca_{1},\cdots ,ca_{n})=c\cdot {\text{avg}}(a_{1},\cdots ,a_{n}).$ So, for example, calculating a mean of liters and then converting to gallons is the same as converting to gallons first and then calculating the mean. This is also called first-order homogeneity.
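These properties can be checked numerically. The sketch below uses a small illustrative data set; it spot-checks the minimization property at a few nearby candidate values rather than proving it.

```python
from statistics import mean

data = [2.0, 3.0, 7.0, 8.0]  # illustrative values
x_bar = mean(data)           # 5.0

# 1. Deviations from the mean sum to zero.
residual_sum = sum(x - x_bar for x in data)


# 2. The mean gives a smaller sum of squared deviations than nearby candidates.
def sum_of_squares(c):
    return sum((x - c) ** 2 for x in data)


candidates = [x_bar + d for d in (-1.0, -0.1, 0.1, 1.0)]
mean_is_best = all(sum_of_squares(x_bar) < sum_of_squares(c) for c in candidates)

# 3. Shifting or rescaling the data shifts or rescales the mean the same way.
shifted_mean = mean([x + 10.0 for x in data])
scaled_mean = mean([2.0 * x for x in data])

print(residual_sum, mean_is_best, shifted_mean, scaled_mean)
```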

Additional properties

The arithmetic mean of a sample is always between the largest and smallest values in that sample (inclusive).
When a collection of numbers is split into any number of equal-sized groups, the arithmetic mean of the whole collection equals the arithmetic mean of the group means.
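Both properties can be illustrated with a minimal Python sketch. The groups below are made up; the last lines show that the second property generally fails when the groups differ in size.

```python
from statistics import mean

# Three illustrative groups, each of size 3.
groups = [[1, 2, 3], [4, 5, 6], [10, 20, 30]]
pooled = [x for group in groups for x in group]

mean_of_all = mean(pooled)
mean_of_group_means = mean([mean(group) for group in groups])

# The mean lies between the smallest and largest values.
within_bounds = min(pooled) <= mean_of_all <= max(pooled)

# With unequal group sizes the two quantities generally differ.
unequal_groups = [[0], [10, 10, 10]]
unequal_pooled_mean = mean([x for group in unequal_groups for x in group])
unequal_mean_of_means = mean([mean(group) for group in unequal_groups])

print(mean_of_all, mean_of_group_means, within_bounds)
print(unequal_pooled_mean, unequal_mean_of_means)
```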

Contrast with median

Main article: Median

The arithmetic mean may be contrasted with the median. The median is defined such that no more than half the values are larger, and no more than half are smaller than it. If elements in the data increase arithmetically when placed in some order, then the median and arithmetic average are equal. For example, consider the data sample $\{1,2,3,4\}$. The mean is $2.5$, as is the median. However, when we consider a sample that cannot be arranged to increase arithmetically, such as $\{1,2,4,8,16\}$, the median and arithmetic average can differ significantly. In this case, the arithmetic average is $6.2$, while the median is $4$. The average value can vary considerably from most values in the sample and can be larger or smaller than most.

There are applications of this phenomenon in many fields. For example, since the 1980s, the median income in the United States has increased more slowly than the arithmetic average of income.^[4]

Generalizations

Weighted average

Main article: Weighted average

A weighted average, or weighted mean, is an average in which some data points count more heavily than others in that they are given more weight in the calculation.^[5] For example, the arithmetic mean of $3$ and $5$ is ${\frac {3+5}{2}}=4$, or equivalently $3\cdot {\frac {1}{2}}+5\cdot {\frac {1}{2}}=4$. In contrast, a weighted mean in which the first number receives, for example, twice as much weight as the second (perhaps because it is assumed to appear twice as often in the general population from which these numbers were sampled) would be calculated as $3\cdot {\frac {2}{3}}+5\cdot {\frac {1}{3}}={\frac {11}{3}}$. Here the weights, which necessarily sum to one, are ${\frac {2}{3}}$ and ${\frac {1}{3}}$, the former being twice the latter. The arithmetic mean (sometimes called the "unweighted average" or "equally weighted average") can be interpreted as a special case of a weighted average in which all weights are equal to the same number (${\frac {1}{2}}$ in the above example and ${\frac {1}{n}}$ in a situation with $n$ numbers being averaged).
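The worked example can be reproduced exactly with rational arithmetic. In this sketch the function name `weighted_mean` is an assumption made for illustration; it divides by the total weight, so weights that do not already sum to one are normalized.

```python
from fractions import Fraction


def weighted_mean(values, weights):
    """Weighted arithmetic mean; weights are normalized by their total."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Weights 2/3 and 1/3, as in the example above.
print(weighted_mean([3, 5], [Fraction(2, 3), Fraction(1, 3)]))  # 11/3

# Equal weights recover the ordinary arithmetic mean.
print(weighted_mean([3, 5], [Fraction(1, 2), Fraction(1, 2)]))  # 4
```

Passing the unnormalized weights `[Fraction(2), Fraction(1)]` gives the same result as the first call, because only the ratio of the weights matters.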

Continuous probability distributions

If a numerical property, and any sample of data from it, can take on any value from a continuous range instead of, for example, just integers, then the probability of a number falling into some range of possible values can be described by integrating a continuous probability distribution across this range, even when the naive probability for a sample number taking one certain value from infinitely many is zero. In this context, the analog of a weighted average, in which there are infinitely many possibilities for the precise value of the variable in each range, is called the mean of the probability distribution. The most widely encountered probability distribution is called the normal distribution; it has the property that all measures of its central tendency, including not just the mean but also the median mentioned above and the mode (the three Ms^[6]), are equal. This equality does not hold for other probability distributions, such as the log-normal distribution.
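The contrast between the normal and log-normal distributions can be illustrated by simulation. The following sketch draws pseudo-random samples with a fixed seed; the parameters $\mu = 0$ and $\sigma = 1$ and the sample size are arbitrary choices made here. For these parameters the log-normal distribution has median $e^{0} = 1$ and mean $e^{1/2} \approx 1.65$, so the sample statistics should land near those values.

```python
import math
import random
from statistics import mean, median

rng = random.Random(0)  # fixed seed so the run is repeatable
mu, sigma = 0.0, 1.0    # illustrative parameters of the underlying normal distribution

normal_sample = [rng.gauss(mu, sigma) for _ in range(100_000)]
# Exponentiating normal values gives a log-normal sample.
lognormal_sample = [math.exp(z) for z in normal_sample]

normal_mean = mean(normal_sample)
normal_median = median(normal_sample)
lognormal_mean = mean(lognormal_sample)
lognormal_median = median(lognormal_sample)

print("normal:     mean and median", normal_mean, normal_median)
print("log-normal: mean and median", lognormal_mean, lognormal_median)
```

The normal sample's mean and median nearly coincide, whereas the log-normal sample's mean sits well above its median because of the long right tail.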

Angles

Main article: Mean of circular quantities

Particular care is needed when using cyclic data, such as phases or angles. Taking the arithmetic mean of 1° and 359° yields a result of 180°. This is incorrect for two reasons:

Firstly, angle measurements are only defined up to an additive constant of 360° ($2\pi$ or $\tau$, if measuring in radians). Thus, the same two angles could equally be written as 1° and −1°, or as 361° and 719°, and each choice of representation produces a different average.
Secondly, in this situation, 0° (or 360°) is geometrically a better average value: there is lower dispersion about it (the points are both 1° from it and 179° from 180°, the putative average).

In general application, such an oversight will lead to the average value artificially moving towards the middle of the numerical range. A solution to this problem is to use the optimization formulation (that is, define the mean as the central point: the point about which one has the lowest dispersion) and redefine the difference as a modular distance (i.e., the distance on the circle: so the modular distance between 1° and 359° is 2°, not 358°).
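The optimization formulation can be sketched in Python as a brute-force search: try candidate angles on a grid and keep the one with the lowest sum of squared modular distances. The function names and the 0.1° grid step are assumptions made for illustration; this is not an efficient or definitive method, and ties between equally good candidates are broken by grid order.

```python
def circular_distance(a, b):
    """Distance between two angles in degrees, measured around the circle."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def circular_mean_grid(angles, step=0.1):
    """Grid candidate with the lowest sum of squared circular distances."""
    n_steps = round(360.0 / step)
    candidates = [i * step for i in range(n_steps)]
    return min(
        candidates,
        key=lambda c: sum(circular_distance(c, a) ** 2 for a in angles),
    )


angles = [1.0, 359.0]
naive_mean = sum(angles) / len(angles)

print(circular_distance(1.0, 359.0))  # 2.0
print(naive_mean)                     # 180.0
print(circular_mean_grid(angles))     # 0.0
```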

Symbols and encoding

The arithmetic mean is often denoted by a bar (vinculum or macron), as in ${\bar {x}}$.^[3]

Some software (text processors, web browsers) may not display the "x̄" symbol correctly. For example, the HTML symbol "x̄" combines two codes: the base letter "x" plus a code for the line above (the combining macron `&#772;`, or the spacing macron ¯).^[7]
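The two-code structure of the symbol can be inspected with Python's standard-library `unicodedata` module. This sketch assumes the symbol is written as the letter "x" followed by the combining macron U+0304 (decimal 772).

```python
import unicodedata

x_bar = "x\u0304"  # base letter "x" followed by the combining macron U+0304

print(len(x_bar))  # 2: one displayed symbol, two code points
for ch in x_bar:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Software that does not handle combining characters may render the two code points separately, which is one way the display problems described above can arise.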

In some document formats (such as PDF), the symbol may be replaced by a "¢" (cent) symbol when copied to a text processor such as Microsoft Word.

