Ordination or gradient analysis, in multivariate analysis, is a method complementary to data clustering, and is used mainly in exploratory data analysis (rather than in hypothesis testing). In contrast to cluster analysis, which assigns objects to discrete groups, ordination arranges objects along the axes of a (usually lower-dimensional) latent space. In the ordination space, objects that are near each other share attributes (i.e., are similar to some degree), and dissimilar objects are farther from each other. Such relationships between the objects, on each of several axes or latent variables, are then characterized numerically and/or graphically in a biplot.

The first ordination method, principal components analysis, was suggested by Karl Pearson in 1901.

Methods

Ordination methods can broadly be categorized into eigenvector-, algorithm-, or model-based methods. Many classical ordination techniques belong to the first group, including principal components analysis and its constrained form, redundancy analysis, as well as correspondence analysis (CA) and its derivatives (detrended correspondence analysis and canonical correspondence analysis).
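As an illustration of the eigenvector-based group, the following is a minimal, dependency-free sketch of principal components analysis used as an ordination. It assumes a small, made-up site-by-species abundance table, and the function names (`center_columns`, `covariance_matrix`, `leading_eigenpair`, `pca_ordination`) are hypothetical. The eigenvectors are found by power iteration with deflation to keep the example self-contained; in practice a linear algebra library would normally be used instead.

```python
import math

def center_columns(rows):
    """Subtract each column (variable) mean from a table of observations."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance matrix of already-centered columns."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def leading_eigenpair(matrix, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix."""
    p = len(matrix)
    # Arbitrary start vector; it must not be orthogonal to the sought eigenvector.
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        prod = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in prod))
        if norm == 0.0:
            break
        vec = [x / norm for x in prod]
    prod = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    value = sum(v * x for v, x in zip(vec, prod))  # Rayleigh quotient
    return value, vec

def pca_ordination(rows, n_axes=2):
    """Return object scores, axis directions, and explained-variance ratios."""
    centered = center_columns(rows)
    cov = covariance_matrix(centered)
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    axes, explained = [], []
    for _ in range(n_axes):
        value, vec = leading_eigenpair(cov)
        axes.append(vec)
        explained.append(value / total_variance)
        # Deflation: remove the axis just found so the next one can be extracted.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(len(vec))]
               for i in range(len(vec))]
    # Scores are the projections of the centered objects onto each axis.
    scores = [[sum(x * a for x, a in zip(row, axis)) for axis in axes]
              for row in centered]
    return scores, axes, explained

# Hypothetical abundance table: 5 sites (rows) by 4 species (columns).
abundances = [
    [12, 3, 0, 1],
    [10, 4, 1, 0],
    [2, 8, 9, 5],
    [1, 7, 11, 6],
    [0, 2, 14, 9],
]
scores, axes, explained = pca_ordination(abundances)
for site, (axis1, axis2) in enumerate(scores, start=1):
    print(f"site {site}: axis 1 = {axis1:.2f}, axis 2 = {axis2:.2f}")
```

In the resulting ordination space, sites with similar species composition (such as the first two rows) receive nearby scores, while sites at opposite ends of the compositional gradient lie far apart.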

The second group includes some distance-based methods such as non-metric multidimensional scaling, and machine learning methods such as t-distributed stochastic neighbor embedding and other nonlinear dimensionality reduction techniques.
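The iterative character of the algorithm-based group can be sketched as follows. This is a deliberately simplified stand-in: it minimizes a metric (raw) stress by plain gradient descent, and it omits the rank-based monotone regression step that makes non-metric multidimensional scaling non-metric. The function names, the step size, the iteration count, and the dissimilarity matrix are all assumptions chosen for illustration.

```python
import math
import random

def raw_stress(coords, dissim):
    """Sum of squared differences between ordination distances and dissimilarities."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            total += (math.dist(coords[i], coords[j]) - dissim[i][j]) ** 2
    return total

def iterative_mds(dissim, n_axes=2, steps=2000, rate=0.02, seed=0):
    """Gradient descent on raw stress from a random starting configuration."""
    rng = random.Random(seed)
    n = len(dissim)
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(n_axes)] for _ in range(n)]
    history = [raw_stress(coords, dissim)]
    for _ in range(steps):
        grads = [[0.0] * n_axes for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(coords[i], coords[j])
                if d == 0.0:
                    continue  # coincident points give no usable direction
                factor = 2.0 * (d - dissim[i][j]) / d
                for k in range(n_axes):
                    grads[i][k] += factor * (coords[i][k] - coords[j][k])
        # Move every object a small step against its stress gradient.
        coords = [[c - rate * g for c, g in zip(row, grad_row)]
                  for row, grad_row in zip(coords, grads)]
        history.append(raw_stress(coords, dissim))
    return coords, history

# Hypothetical dissimilarities between 4 objects (symmetric, zero diagonal).
dissimilarities = [
    [0.0, 3.0, 4.0, 5.0],
    [3.0, 0.0, 5.0, 4.0],
    [4.0, 5.0, 0.0, 3.0],
    [5.0, 4.0, 3.0, 0.0],
]
coords, history = iterative_mds(dissimilarities)
print(f"stress: {history[0]:.3f} at start, {history[-1]:.3f} at end")
```

Unlike an eigenvector-based method, the result depends on the starting configuration and the procedure may stop in a local minimum, which is why such methods are typically run from several random starts.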

The third group includes model-based ordination methods, which can be considered multivariate extensions of generalized linear models.[1][2][3][4] Model-based ordination methods are more flexible in their application than classical ordination methods; for example, they can include random effects.[5] Unlike in the first two groups, there is no (implicit or explicit) distance measure in the ordination. Instead, a distribution must be specified for the responses, as is typical for statistical models. This and other assumptions, such as the assumed mean-variance relationship, can be checked with residual diagnostics, which is not possible in other ordination methods.
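A common way to write such a model is as a generalized linear latent variable model. The formulation below is a sketch of the general form rather than the specification of any one cited method, and all symbols are notation introduced here for illustration:

```latex
\begin{align}
  y_{ij} \mid \mathbf{z}_i &\sim F(\mu_{ij}, \phi_j), \\
  g(\mu_{ij}) &= \beta_{0j} + \mathbf{z}_i^{\top} \boldsymbol{\gamma}_j, \\
  \mathbf{z}_i &\sim \mathcal{N}(\mathbf{0}, \mathbf{I}_d).
\end{align}
```

Here $y_{ij}$ is the response of variable $j$ (for example, a species) for object $i$ (for example, a site), $F$ is the assumed response distribution with mean $\mu_{ij}$ and an optional dispersion parameter $\phi_j$, $g$ is a link function, $\beta_{0j}$ is a variable-specific intercept, $\mathbf{z}_i$ holds the $d$ latent variables that serve as the ordination scores of object $i$, and $\boldsymbol{\gamma}_j$ holds the corresponding loadings of variable $j$. The choice of $F$ takes the place of the distance measure used in the other two groups.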

Applications

Ordination can be used in the analysis of any set of multivariate objects. It is frequently used in several environmental and ecological sciences, particularly plant community ecology. It is also used in genetics and systems biology for microarray data analysis, and in psychometrics.

References

1. ^ Hui, Francis K. C.; Taskinen, Sara; Pledger, Shirley; Foster, Scott D.; Warton, David I. (2015). O'Hara, Robert B. (ed.). "Model-based approaches to unconstrained ordination". Methods in Ecology and Evolution. 6 (4): 399–411. doi:10.1111/2041-210X.12236. ISSN 2041-210X. S2CID 62624917.
2. ^ Warton, David I.; Blanchet, F. Guillaume; O'Hara, Robert B.; Ovaskainen, Otso; Taskinen, Sara; Walker, Steven C.; Hui, Francis K. C. (2015-12-01). "So Many Variables: Joint Modeling in Community Ecology". Trends in Ecology & Evolution. 30 (12): 766–779. doi:10.1016/j.tree.2015.09.007. ISSN 0169-5347. PMID 26519235.
3. ^ Yee, Thomas W. (2004). "A New Technique for Maximum-Likelihood Canonical Gaussian Ordination". Ecological Monographs. 74 (4): 685–701. doi:10.1890/03-0078. ISSN 0012-9615.
4. ^ Hawinkel, Stijn; Kerckhof, Frederiek-Maarten; Bijnens, Luc; Thas, Olivier (2019-02-13). "A unified framework for unconstrained and constrained ordination of microbiome read count data". PLOS ONE. 14 (2): e0205474. doi:10.1371/journal.pone.0205474. ISSN 1932-6203. PMC 6373939. PMID 30759084.
5. ^ van der Veen, Bert; Hui, Francis K. C.; Hovstad, Knut A.; O'Hara, Robert B. (2023). "Concurrent ordination: Simultaneous unconstrained and constrained latent variable modelling". Methods in Ecology and Evolution. 14 (2): 683–695. doi:10.1111/2041-210X.14035. hdl:11250/3050891. ISSN 2041-210X.