The theory of statistics provides a basis for the whole range of techniques, in both study design and data analysis, that are used within applications of statistics.[1][2] The theory covers approaches to statistical-decision problems and to statistical inference, and the actions and deductions that satisfy the basic principles stated for these different approaches. Within a given approach, statistical theory gives ways of comparing statistical procedures; it can find a best possible procedure within a given context for given statistical problems, or can provide guidance on the choice between alternative procedures.[2][3]

Apart from philosophical considerations about how to make statistical inferences and decisions, much of statistical theory consists of mathematical statistics, and is closely linked to probability theory, to utility theory, and to optimization.


Statistical theory provides an underlying rationale and a consistent basis for the choice of methodology used in applied statistics.


Statistical models describe the sources of data and can have different types of formulation corresponding to these sources and to the problem being studied. Such problems can be of various kinds:

Statistical models, once specified, can be tested to see whether they provide useful inferences for new data sets.[4]
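One simple way to illustrate testing a specified model on new data is a hold-out check. The sketch below is only an illustration under stated assumptions: the data are simulated from a hypothetical straight-line relationship, and the helper names `fit_line` and `mean_squared_error` are invented for this example. The model is fitted on one part of the data and its prediction error is then measured on data it has not seen.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mean_squared_error(a, b, xs, ys):
    """Average squared prediction error of the fitted line on (xs, ys)."""
    return statistics.fmean((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical simulated data: y = 2 + 3x plus Gaussian noise (an assumption).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

# Fit on the first 150 points, then test on the 50 points held back.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]
a, b = fit_line(train_x, train_y)

train_mse = mean_squared_error(a, b, train_x, train_y)
test_mse = mean_squared_error(a, b, test_x, test_y)
print(f"intercept={a:.2f} slope={b:.2f}")
print(f"training MSE={train_mse:.2f} held-out MSE={test_mse:.2f}")
```

If the held-out error is much larger than the training error, that suggests the model does not carry over well to new data sets.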

Data collection

Statistical theory provides a guide to comparing methods of data collection, where the problem is to generate informative data using optimization and randomization while measuring and controlling for observational error.[5][6][7] Optimization of data collection reduces the cost of data while satisfying statistical goals,[8][9] while randomization allows reliable inferences. Statistical theory provides a basis for good data collection and the structuring of investigations in the topics of:

Summarising data

The task of summarising statistical data in conventional forms (also known as descriptive statistics) is considered in theoretical statistics as a problem of defining what aspects of statistical samples need to be described and how well they can be described from a typically limited sample of data. Thus the problems theoretical statistics considers include:

Interpreting data

Besides the philosophy underlying statistical inference, statistical theory has the task of considering the types of questions that data analysts might want to ask about the problems they are studying and of providing data analytic techniques for answering them. Some of these tasks are:

When a statistical procedure has been specified in the study protocol, statistical theory provides well-defined probability statements for the method when applied to all populations that could have arisen from the randomization used to generate the data. This provides an objective way of estimating parameters, estimating confidence intervals, testing hypotheses, and selecting the best. Even for observational data, statistical theory provides a way of calculating a value that can be used to interpret a sample of data from a population. It can also provide a means of indicating how well that value is determined by the sample, and thus a means of saying whether corresponding values derived for different populations are as different as they might seem. However, the reliability of inferences from post-hoc observational data is often worse than for planned randomized generation of data.
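A randomization (permutation) test is one example of a probability statement derived from the randomization itself. The following is a minimal sketch, not a definitive implementation: the measurements are made up for illustration, the function name `permutation_p_value` is hypothetical, and the p-value is approximated by Monte Carlo resampling rather than by enumerating every possible assignment.

```python
import random
import statistics


def permutation_p_value(treated, control, n_resamples=5000, seed=0):
    """Two-sided Monte Carlo randomization test for a difference in means.

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(treated) - statistics.fmean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_resamples):
        # Re-run the random assignment: any split of the pooled units
        # is equally likely if the treatment has no effect.
        rng.shuffle(pooled)
        diff = (statistics.fmean(pooled[:n_treated])
                - statistics.fmean(pooled[n_treated:]))
        if abs(diff) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_resamples + 1)


# Hypothetical measurements from a randomized two-group experiment.
treated = [12.1, 11.8, 13.4, 12.9, 13.0, 12.5]
control = [10.2, 11.1, 10.8, 11.5, 10.9, 10.4]

observed, p_value = permutation_p_value(treated, control)
print(f"observed difference={observed:.2f} estimated p-value={p_value:.4f}")
```

The reference distribution here comes only from the random assignment of units to groups, so no distributional assumption about the measurements is needed.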

Applied statistical inference

Statistical theory provides the basis for a number of data-analytic approaches that are common across scientific and social research. Interpreting data is done with one of the following approaches:

Many of the standard methods for those approaches rely on certain statistical assumptions (made in the derivation of the methodology) actually holding in practice. Statistical theory studies the consequences of departures from these assumptions. In addition, it provides a range of robust statistical techniques that are less dependent on assumptions, and it provides methods for checking whether particular assumptions are reasonable for a given data set.
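The effect of a departure from assumptions can be illustrated by comparing classical and robust summaries on data containing a single gross error. This is a small sketch with made-up numbers. The median and the median absolute deviation are used here as familiar examples of robust estimators, and the helper name `median_absolute_deviation` is an assumption of this example.

```python
import statistics


def median_absolute_deviation(data):
    """Median of the absolute deviations from the median (unscaled)."""
    values = list(data)
    centre = statistics.median(values)
    return statistics.median(abs(x - centre) for x in values)


# Hypothetical measurements, then the same data with one gross error added.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
contaminated = clean + [100.0]

for label, sample in (("clean", clean), ("contaminated", contaminated)):
    print(
        f"{label:>12}: "
        f"mean={statistics.fmean(sample):.2f} "
        f"median={statistics.median(sample):.2f} "
        f"stdev={statistics.stdev(sample):.2f} "
        f"MAD={median_absolute_deviation(sample):.2f}"
    )
```

In this example the mean and standard deviation are pulled far from the bulk of the data by the single outlier, while the median and the median absolute deviation change very little.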


  1. Cox & Hinkley (1974, p. 1)
  2. Rao, C. R. (1981). "Foreword". In Arthanari, T. S.; Dodge, Yadolah (eds.). Mathematical Programming in Statistics. New York: John Wiley & Sons. pp. vii–viii. ISBN 0-471-08073-X. MR 0607328.
  3. Lehmann & Romano (2005)
  4. Freedman (2009)
  5. Peirce, Charles Sanders; Jastrow, Joseph (1885). "On Small Differences in Sensation". Memoirs of the National Academy of Sciences. 3: 73–83.
  6. Hacking, Ian (September 1988). "Telepathy: Origins of Randomization in Experimental Design". Isis. 79 (3): 427–451. doi:10.1086/354775. JSTOR 234674. MR 1013489. S2CID 52201011.
  7. Stigler, Stephen M. (November 1992). "A Historical View of Statistical Concepts in Psychology and Educational Research". American Journal of Education. 101 (1): 60–70. doi:10.1086/444032. S2CID 143685203.
  8. Atkinson et al. (2007)
  9. Kiefer, Jack Carl (1985). Brown, Lawrence D.; Olkin, Ingram; Sacks, Jerome; et al. (eds.). Jack Carl Kiefer: Collected papers III—Design of experiments. Springer-Verlag and the Institute of Mathematical Statistics. pp. 718+xxv. ISBN 0-387-96004-X.
  10. Hinkelmann & Kempthorne (2008)
  11. Bailey (2008)
  12. Kish (1965)
  13. Cochran (1977)
  14. Särndal et al. (1992)

