((Short description|Square matrix containing the distances between elements in a set))
((more citations needed|date=February 2017))
In [[mathematics]], [[computer science]] and especially [[graph theory]], a '''distance matrix''' is a [[square matrix]] (two-dimensional array) containing the [[distance]]s, taken pairwise, between the elements of a set.<ref>Weyenberg, G., & Yoshida, R. (2015). Reconstructing the phylogeny: Computational methods. In Algebraic and Discrete Mathematical methods for modern Biology (pp. 293–319). Academic Press.</ref> Depending upon the application involved, the ''distance'' being used to define this matrix may or may not be a [[metric (mathematics)|metric]]. If there are ((mvar|N)) elements, this matrix will have size ((math|''N''&times;''N'')). In graph-theoretic applications, the elements are more often referred to as points, nodes or vertices.

==Non-metric distance matrix==
In general, a distance matrix is a weighted [[adjacency matrix]] of some graph.  In a [[Network (mathematics)|network]], a [[directed graph]] with weights assigned to the arcs, the distance between two nodes of the network can be defined as the minimum of the sums of the weights on the shortest paths joining the two nodes.<ref>[[Frank Harary]], Robert Z. Norman and Dorwin Cartwright (1965) ''Structural Models: An Introduction to the Theory of Directed Graphs'', pages 134–8, [[John Wiley & Sons]] ((mr|id=0184874))</ref> This distance function, while well defined, is not a metric. There need be no restrictions on the weights other than the need to be able to combine and compare them, so negative weights are used in some applications. Since paths are directed, symmetry can not be guaranteed, and if cycles exist the distance matrix may not be [[Hollow matrix|hollow]].

An algebraic formulation of the above can be obtained by using the [[min-plus algebra]]. Matrix multiplication in this system is defined as follows: Given two ((math|''n'' × ''n'')) matrices ((math|1=''A'' = (''a((sub|ij))''))) and ((math|1=''B'' = (''b((sub|ij))''))), their distance product ((math|1=''C'' = (''c((sub|ij))'') = ''A'' ⭑ ''B'')) is defined as an ((math|''n'' × ''n'')) matrix such that 
:<math>c_{ij} = \min_{k=1}^n \{a_{ik} + b_{kj}\}.</math>
Note that the off-diagonal elements that are not connected directly will need to be set to infinity or a suitable large value for the min-plus operations to work correctly. A zero in these locations will be incorrectly interpreted as an edge with no distance, cost, etc.

If ((mvar|W)) is an ((math|''n'' × ''n'')) matrix containing the edge weights of a [[Graph (discrete mathematics)|graph]], then ((mvar|W((sup|k)))) (using this distance product) gives the distances between vertices using paths of length at most ((mvar|k)) edges, and ((mvar|W((sup|n)))) is the distance matrix of the graph.
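The distance product and its repeated application can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the four-vertex example graph are hypothetical, and it assumes a zero diagonal, infinity for missing arcs, and no negative cycles.

```python
import math

INF = math.inf

def distance_product(a, b):
    """Min-plus product: c[i][j] = min over k of a[i][k] + b[k][j]."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def distance_matrix(w):
    """Square w under the distance product until paths of n-1 edges are covered."""
    n = len(w)
    d, covered = w, 1
    while covered < n - 1:
        d = distance_product(d, d)
        covered *= 2
    return d

# Hypothetical directed graph: zero diagonal, INF where no arc exists.
W = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
D = distance_matrix(W)
```

Repeated squaring needs only about log<sub>2</sub> ''n'' distance products instead of ''n'' − 1, which is why it is used here in place of multiplying by ''W'' one step at a time.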

An arbitrary graph ((mvar|G)) on ((mvar|n)) vertices can be modeled as a weighted [[complete graph]] on ((mvar|n)) vertices by assigning a weight of one to each edge of the complete graph that corresponds to an edge of ((mvar|G)) and zero to all other edges. ((mvar|W)) for this complete graph is the [[adjacency matrix]] of ((mvar|G)). The distance matrix of ((mvar|G)) can be computed from ((mvar|W)) as above once the zero entries for non-adjacent vertex pairs are replaced by infinity. By contrast, ((mvar|W((sup|n)))) calculated by the usual [[matrix multiplication]] only encodes the number of walks of length exactly ((mvar|n)) between any two vertices.
<!--
==Comparison with related matrices==

===Comparison with adjacency matrix===
Distance matrices are related to [[Adjacency matrix|adjacency matrices]], with the differences that (a)&nbsp;the latter only provides the information which vertices are connected but does not tell about ''costs'' or ''distances'' between the vertices and (b)&nbsp;an entry of a distance matrix is smaller if two elements are closer, while "close" (connected) vertices yield larger entries in an adjacency matrix.

===Comparison with Euclidean distance matrix===
Unlike a [[Euclidean distance matrix]], the matrix does not need to be [[Symmetric matrix|symmetric]]—that is, the values ((math|''x''((sub|''i'',''j'')))) do not necessarily equal ((math|''x''((sub|''j'',''i'')))).  Similarly, the matrix values are not restricted to non-negative [[Real number|reals]] (as they would be in the Euclidean distance matrix) but rather can have negative values, zeros or [[imaginary number]]s depending on the cost metric and specific use.  Although it is often the case, distance matrices are not restricted to being [[hollow matrix|hollow]]—that is, they can have non-zero entries on the main diagonal.
-->

==Metric distance matrix==
The value of a distance matrix formalism in many applications is in how the distance matrix can manifestly encode the [[Metric (mathematics)|metric axioms]] and in how it lends itself to the use of linear algebra techniques.  That is, if ((math|1=''M'' = (''x((sub|ij))''))) with ((math|1 ≤ ''i'', ''j'' ≤ ''N'')) is a distance matrix for a metric distance, then

# the entries on the main diagonal are all zero (that is, the matrix is a [[hollow matrix]]), i.e. ((math|1=''x((sub|ii))'' = 0)) for all ((math|1 ≤ ''i'' ≤ ''N'')),
# all the off-diagonal entries are positive (((math|''x((sub|ij))'' > 0)) if ((math|''i'' ≠ ''j''))), (that is, a [[Nonnegative matrix|non-negative matrix]]),
# the matrix is a [[symmetric matrix]] (((math|1=''x((sub|ij))'' = ''x((sub|ji))''))), and
# for any ((mvar|i)) and ((mvar|j)), ((math|''x((sub|ij))'' ≤ ''x((sub|ik))'' + ''x((sub|kj))'')) for all ((mvar|k)) (the triangle inequality). This can be stated in terms of [[min-plus matrix multiplication|tropical matrix multiplication]].
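The four axioms above translate directly into a checker. The following Python sketch is illustrative only: the function name, the tolerance parameter and the two small example matrices are assumptions made for this example.

```python
def is_metric(m, tol=1e-12):
    """Check the four metric axioms on a square matrix given as nested lists."""
    n = len(m)
    idx = range(n)
    hollow = all(abs(m[i][i]) <= tol for i in idx)
    positive = all(m[i][j] > 0 for i in idx for j in idx if i != j)
    symmetric = all(abs(m[i][j] - m[j][i]) <= tol for i in idx for j in idx)
    triangle = all(m[i][j] <= m[i][k] + m[k][j] + tol
                   for i in idx for j in idx for k in idx)
    return hollow and positive and symmetric and triangle

# Hypothetical examples: three points on a line, and a matrix where
# the entry 5 exceeds the detour 1 + 1 through the middle element.
on_a_line = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
violates_triangle = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]
```

The triangle check costs on the order of ''N''<sup>3</sup> comparisons, the same order as one min-plus product.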

When a distance matrix satisfies the first three axioms (making it a semi-metric) it is sometimes referred to as a pre-distance matrix.  A pre-distance matrix that can be embedded in a Euclidean space is called a [[Euclidean distance matrix]]. For mixed-type data that contain numerical as well as categorical descriptors, [[Gower's distance]] is a common alternative.

Another common example of a metric distance matrix arises in [[coding theory]] when in a [[block code]] the elements are strings of fixed length over an alphabet and the distance between them is given by the [[Hamming distance]] metric. The smallest non-zero entry in the distance matrix measures the error correcting and error detecting capability of the code.
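As a small illustration, the Hamming distance matrix of a block code and its smallest non-zero entry can be computed as follows. The four-word binary code is a hypothetical example. The detection and correction bounds used are the standard ones for a code with minimum distance ''d'': ''d'' − 1 detectable errors and ⌊(''d'' − 1)/2⌋ correctable errors.

```python
def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("codewords must have the same length")
    return sum(a != b for a, b in zip(u, v))

def hamming_matrix(code):
    """Pairwise Hamming distances between all codewords."""
    return [[hamming(u, v) for v in code] for u in code]

# Hypothetical binary block code of length 5.
code = ["00000", "01011", "10101", "11110"]
H = hamming_matrix(code)

# Smallest non-zero entry = minimum distance of the code.
min_distance = min(x for row in H for x in row if x > 0)
detectable = min_distance - 1
correctable = (min_distance - 1) // 2
```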

== Additive distance matrix ==
An additive distance matrix is a special type of matrix used in [[bioinformatics]] to build a [[phylogenetic tree]]. If ((mvar|x)) is the lowest common ancestor of two species ((mvar|i)) and ((mvar|j)), we expect ((mvar|1=''M((sub|ij))'' = ''M((sub|ix))'' + ''M((sub|xj))'')); this is the origin of the term additive. A distance matrix ((mvar|M)) for a set of species ((mvar|S)) is said to be additive if and only if there exists a phylogeny ((mvar|T)) for ((mvar|S)) such that:

* Every edge ((math|(''u'',''v''))) in ((mvar|T)) is associated with a positive weight ((mvar|d((sub|uv))))
* For every ((math|''i'',''j'' ∈ ''S'')), ((mvar|M((sub|ij)))) equals the sum of the weights along the path from ((mvar|i)) to ((mvar|j)) in  ((mvar|T))

For this case, ((mvar|M)) is called an additive matrix and ((mvar|T)) is called an additive tree. Below we can see an example of an additive distance matrix and its corresponding tree:

[[File:Additive_distance_matrix.png|444x444px|Additive distance matrix (left) and its phylogeny tree (right)|center|frameless]]
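One common way to test additivity in practice is the four-point condition, a standard characterization that is not stated in the text above: for every four species, the two largest of the three pairwise sums must be equal. The Python sketch below assumes the input is already a metric matrix. The function name and the example, built from a hypothetical tree with leaf edges 1, 2, 3, 4 and an internal edge of weight 5, are illustrative.

```python
from itertools import combinations

def satisfies_four_point(m, tol=1e-9):
    """True if, for every quadruple, the two largest pairwise sums coincide."""
    for i, j, k, l in combinations(range(len(m)), 4):
        sums = sorted([m[i][j] + m[k][l],
                       m[i][k] + m[j][l],
                       m[i][l] + m[j][k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

# Distances read off a hypothetical tree: a and b hang off one internal node
# (edges 1 and 2), c and d off another (edges 3 and 4), internal edge 5.
additive = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
# Same matrix with the a-d distance changed, so no tree fits it.
not_additive = [
    [0, 3, 9, 12],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [12, 11, 7, 0],
]
```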

== Ultrametric distance matrix ==
The ultrametric distance matrix is defined as an additive matrix which models the constant [[molecular clock]]. It is used to build a phylogenetic tree. A matrix ((mvar|M)) is said to be ultrametric if there exists a tree ((mvar|T)) such that:

* ((mvar|M((sub|ij)))) equals the sum of the edge weights along the path from ((mvar|i)) to ((mvar|j)) in ((mvar|T))
* A root of the tree can be identified  with the distance to all the leaves being the same

Here is an example of an ultrametric distance matrix with its corresponding tree:

[[File:Ultrametric tree.png|frameless|395x395px|center]]
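A quick test for ultrametricity is the three-point condition, a standard equivalent formulation that the text above does not spell out: in every triple of species, the two largest of the three distances are equal. The following Python sketch assumes a symmetric matrix with zero diagonal. The function name and both example matrices are hypothetical.

```python
from itertools import combinations

def satisfies_three_point(m, tol=1e-9):
    """True if, in every triple, the two largest distances coincide."""
    for i, j, k in combinations(range(len(m)), 3):
        a, b, c = sorted([m[i][j], m[i][k], m[j][k]])
        if abs(c - b) > tol:
            return False
    return True

# Hypothetical rooted tree ((a, b), c): a and b join at height 1,
# and c joins them at height 3, so every leaf is at distance 3 from the root.
ultrametric = [[0, 2, 6], [2, 0, 6], [6, 6, 0]]
# An additive but non-ultrametric triple (distances 3, 9 and 10).
not_ultrametric = [[0, 3, 9], [3, 0, 10], [9, 10, 0]]
```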

== Bioinformatics ==
((missing information|section|alignment-free distance measures (Mash, K(r), FastANI, Skmer etc.); need less weight on how to do alignment (especially with "dumb" DP) and more weight on how to get distance from alignment|date=December 2023))
The distance matrix is widely used in the bioinformatics field,  and it is present in several methods, algorithms and programs. Distance matrices are used to represent [[protein]] structures in a coordinate-independent manner, as well as the pairwise distances between two sequences in [[sequence space]]. They are used in [[structural alignment|structural]] and [[sequence alignment|sequential]] alignment, and for the determination of protein structures from [[Nuclear magnetic resonance|NMR]] or [[X-ray crystallography]].

Sometimes it is more convenient to express data as a [[similarity matrix]].

It is also used to define the [[distance correlation]].

=== [[Sequence alignment]] ===
An alignment of two sequences is formed by inserting spaces in arbitrary locations along the sequences so that they end up with the same length and there are no two spaces at the same position of the two augmented sequences.<ref name=":0">((Cite book |last=Sung |first=Wing-Kin |title=Algorithms in bioinformatics: A practical introduction |publisher=Chapman & Hall |year=2010 |isbn=978-1-4200-7033-0 |pages=29))</ref> One of the primary methods for sequence alignment is [[dynamic programming]]. The method is used to fill the distance matrix and then obtain the alignment. In typical usage, for sequence alignment a matrix is used to assign scores to amino-acid matches or mismatches, and a gap penalty for matching an amino-acid in one sequence with a gap in the other.

==== Global alignment ====
The [[Needleman-Wunsch algorithm]] used to calculate global alignment uses dynamic programming to obtain the distance matrix.
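The dynamic-programming fill can be sketched as below. This is a simplified distance variant rather than the original similarity-maximizing formulation: it minimizes a cost with unit mismatch and gap penalties, which are assumptions chosen for illustration, as is the function name.

```python
def alignment_table(s, t, mismatch=1, gap=1):
    """Fill the global-alignment DP table; table[i][j] is the minimum cost
    of aligning the first i symbols of s with the first j symbols of t."""
    rows, cols = len(s) + 1, len(t) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # s prefix aligned against gaps only
        table[i][0] = i * gap
    for j in range(1, cols):          # t prefix aligned against gaps only
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s[i - 1] == t[j - 1] else mismatch
            table[i][j] = min(table[i - 1][j - 1] + cost,  # match or mismatch
                              table[i - 1][j] + gap,       # gap in t
                              table[i][j - 1] + gap)       # gap in s
    return table

# The bottom-right cell holds the global alignment distance.
distance = alignment_table("ACGT", "AGT")[-1][-1]
```

Tracing back from the bottom-right cell along the choices that produced each minimum recovers the alignment itself.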

==== Local alignment ====
The [[Smith-Waterman algorithm]] is also based on dynamic programming: it fills the distance matrix and then obtains the local alignment from it.

==== Multiple sequence alignment ====
[[Multiple sequence alignment]] is an extension of pairwise alignment to align several sequences at a time. Different MSA methods are based on the same idea of the distance matrix as global and local alignments.

* Center star method. This method defines a center sequence ((Math|''S''<sub>c</sub>)) which minimizes the sum of the distances between ((Math|''S''<sub>c</sub>)) and every other sequence ((Math|''S''<sub>i</sub>)). Then it generates a multiple alignment ((Math|M)) for the set of sequences ((Math|''S'')) so that for every ((Math|''S''<sub>i</sub>)) the alignment distance ((Math|''d''<sub>''M''</sub>(''S''<sub>c</sub>,''S''<sub>i</sub>))) equals the optimal pairwise alignment distance. The sum-of-pairs distance of the alignment computed for ((Math|''S'')) is at most twice that of the optimal multiple alignment.
* Progressive alignment method. This heuristic method to create MSA first aligns the two most related sequences, and then it progressively aligns the next two most related sequences until all sequences are aligned.

There are other methods that have their own program due to their popularity:

* [[Clustal|ClustalW]]
* [[MUSCLE (alignment software)|MUSCLE]]
* [[MAFFT]]
* MANGO
* And many more

===== MAFFT =====
Multiple alignment using fast Fourier transform (MAFFT) is a program with an algorithm based on progressive alignment, and it offers various multiple alignment strategies. First, MAFFT constructs a distance matrix based on the number of shared 6-tuples. Second, it builds the guide tree based on the previous matrix. Third, it clusters the sequences with the help of the [[fast Fourier transform]] and starts the alignment. Based on the new alignment, it reconstructs the guide tree and aligns again.

=== Phylogenetic analysis ===
((cleanup section|reason=Should be trimmed down and mostly packed up to the main article. That's how [[WP:SPINOFF]] works, right?|date=December 2023))
((main|Distance matrices in phylogeny))
To perform [[phylogenetic]] analysis, the first step is to reconstruct the phylogenetic tree: given a collection of species, the problem is to reconstruct or infer the ancestral relationships among the species, i.e., the phylogenetic tree among the species. Distance matrix methods perform this activity.

==== Distance matrix methods ====
Distance matrix methods of phylogenetic analysis explicitly rely on a measure of "genetic distance" between the sequences being classified, and therefore require multiple sequences as an input. Distance methods attempt to construct an all-to-all matrix from the sequence query set describing the distance between each sequence pair. From this is constructed a phylogenetic tree that places closely related sequences under the same [[interior node]] and whose branch lengths closely reproduce the observed distances between sequences. Distance-matrix methods may produce either rooted or unrooted trees, depending on the algorithm used to calculate them.<ref name=":1">((Cite book |last=Felsenstein |first=Joseph |title=Inferring phylogenies |publisher=Sinauer Associates |year=2003 |isbn=9780878931774))</ref> Given  ((Math|''n'')) species, the input is an ((Math|''n'' × ''n'')) distance matrix ((Math|M)) where ((Math|''M''<sub>ij</sub>)) is the mutation distance between species ((Math|''i'')) and ((Math|''j'')) . The aim is to output a tree of degree ((Math|3)) which is consistent with the distance matrix.

They are frequently used as the basis for progressive and iterative types of [[multiple sequence alignment]]. The main disadvantage of distance-matrix methods is their inability to efficiently use information about local high-variation regions that appear across multiple subtrees.<ref name=":1" /> Despite potential problems, distance methods are extremely fast, and they often produce a reasonable estimate of phylogeny. They also have certain benefits over the methods that use characters directly. Notably, distance methods allow use of data that may not be easily converted to character data, such as [[DNA-DNA hybridization]] assays.

The following are distance based methods for phylogeny reconstruction:

* Additive tree reconstruction
* [[UPGMA]]
* [[Neighbor joining]]
* [[Distance matrices in phylogeny|Fitch-Margoliash]]

===== Additive tree reconstruction =====
Additive tree reconstruction is based on additive and ultrametric distance matrices. These matrices have a special characteristic:

Consider an additive matrix ((Math|M)). For any three species ((Math|i, j, k,)) the corresponding tree is unique.<ref name=":0" /> Every ultrametric distance matrix is an additive matrix. We can observe this property for the tree below, which consists of the species ((Math|i, j, k)).
[[File:Unique_tree_additive_matrix.png|center|frameless|Phylogenetic tree from 3 species]]
The additive tree reconstruction technique starts with this tree and then adds one more species at a time, based on the distance matrix combined with the property mentioned above. For example, consider an additive matrix ((Math|M)) and 5 species ((Math|''a'', ''b'', ''c'', ''d'')) and ((Math|''e'')). First we form an additive tree for two species ((Math|''a'')) and ((Math|''b'')). Then we choose a third one, say ((Math|''c'')), and attach it to a point ((Math|''x'')) on the edge between ((Math|''a'')) and ((Math|''b'')). The edge weights are computed with the property above. Next we add the fourth species ((Math|''d'')) to one of the edges; applying the property identifies the one specific edge to which ((Math|''d'')) should be attached. Finally, we add ((Math|''e'')) following the same procedure as before.
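For three species the unique tree is a star whose three edge weights follow from the pairwise distances alone. The sketch below shows that computation in Python. The function name and the example matrix, taken from a hypothetical tree with leaf edges 1, 2, 3, 4 and an internal edge of weight 5, are assumptions made for illustration.

```python
def three_leaf_edges(m, i, j, k):
    """Edge weights from leaves i, j, k to the single internal node x
    of the unique additive tree on those three leaves."""
    d_ix = (m[i][j] + m[i][k] - m[j][k]) / 2
    d_jx = (m[i][j] + m[j][k] - m[i][k]) / 2
    d_kx = (m[i][k] + m[j][k] - m[i][j]) / 2
    return d_ix, d_jx, d_kx

# Hypothetical additive matrix for species a, b, c, d (indices 0..3).
M = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
edges_abc = three_leaf_edges(M, 0, 1, 2)
```

In the example, the third value is 8 because the path from ''c'' to the attachment point on the edge between ''a'' and ''b'' covers both the leaf edge of ''c'' (3) and the internal edge (5).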

===== UPGMA =====
The basic principle of UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is that similar species should be closer in the phylogenetic tree. Hence, it builds the tree by clustering similar sequences iteratively. The method works by building the phylogenetic tree bottom up from its leaves. Initially, we have ((Math|''n'')) leaves (or ((Math|''n'')) singleton trees), each representing a species in ((Math|''S'')). Those ((Math|''n'')) leaves are referred to as ((Math|''n'')) clusters. Then, we perform ((Math|''n''-1)) iterations. In each iteration, we identify two clusters ((Math|''C''<sub>1</sub>)) and ((Math|''C''<sub>2</sub>)) with the smallest average distance and merge them to form a bigger cluster ((Math|''C'')). If we suppose ((Math|M)) is ultrametric, for any cluster ((Math|''C'')) created by the UPGMA algorithm, ((Math|''C'')) is a valid ultrametric tree.
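The iteration described above can be sketched compactly in Python. This is an illustrative, unoptimized version: the function name, the nested-tuple tree representation and the example matrix are assumptions. Merge heights are reported as half the merged distance, which is meaningful when the input is ultrametric.

```python
def upgma(m, labels):
    """Return (tree, merges); tree is a nested tuple of labels and
    merges lists (left, right, height) in the order clusters were joined."""
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}   # id -> (subtree, size)
    dist = {(i, j): m[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id, merges = n, []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        (a, b), d_ab = min(dist.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Average distance from the merged cluster to every remaining cluster.
        new_dist = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        merges.append((tree_a, tree_b, d_ab / 2))
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges

# Hypothetical ultrametric matrix for four species.
M = [
    [0, 2, 8, 8],
    [2, 0, 8, 8],
    [8, 8, 0, 4],
    [8, 8, 4, 0],
]
tree, merges = upgma(M, ["a", "b", "c", "d"])
```

Weighting by cluster size keeps each new entry equal to the plain average over all pairs of original species, which is what "unweighted" refers to in the method's name.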

===== Neighbor joining =====
Neighbor joining is a bottom-up clustering method. It takes a distance matrix specifying the distance between each pair of sequences. The algorithm starts with a completely unresolved tree, whose topology corresponds to that of a [[star network]], and iterates over the following steps until the tree is completely resolved and all branch lengths are known:

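# Based on the current distance matrix, calculate the matrix ((mvar|Q)), where <math>Q(i,j) = (n-2)\,d(i,j) - \sum_{k=1}^{n} d(i,k) - \sum_{k=1}^{n} d(j,k)</math> for ((mvar|n)) taxa with pairwise distances ((mvar|d)).
# Find the pair of distinct taxa ((mvar|i)) and ((mvar|j)) (i.e. with ((math|''i'' ≠ ''j''))) for which ((math|''Q''(''i'', ''j''))) has its lowest value. These taxa are joined to a newly created node, which is connected to the central node.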
# Calculate the distance from each of the [[Taxon|taxa]] in the pair to this new node.
# Calculate the distance from each of the taxa outside of this pair to the new node.
# Start the algorithm again, replacing the pair of joined neighbors with the new node and using the distances calculated in the previous step.<ref>((Cite journal |last=Saitou |first=Naruya |date=1987 |title=The neighbor-joining method: A new method for reconstructing phylogenetic trees |url=https://academic.oup.com/mbe/article/4/4/406/1029664?login=false |journal=Molecular Biology and Evolution |volume=4|issue=4 |pages=406–425 |doi=10.1093/oxfordjournals.molbev.a040454 |pmid=3447015 |doi-access=free ))</ref>
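One pass through these steps can be sketched as follows, using the standard neighbor-joining criterion ''Q''(''i'', ''j'') = (''n'' − 2) ''d''(''i'', ''j'') − Σ<sub>''k''</sub> ''d''(''i'', ''k'') − Σ<sub>''k''</sub> ''d''(''j'', ''k''). The function name, the return format and the five-taxon example matrix are illustrative assumptions, and the sketch requires more than two taxa.

```python
def neighbor_joining_step(d):
    """Perform one neighbor-joining iteration on distance matrix d (n > 2).

    Returns the joined pair, its Q value, the two branch lengths to the
    new node, and the distances from the new node to the remaining taxa."""
    n = len(d)
    row_sums = [sum(row) for row in d]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    q, i, j = best
    # Branch lengths from i and j to the newly created node u.
    len_i = d[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    len_j = d[i][j] - len_i
    # Distances from u to every taxon outside the pair.
    to_new = {k: (d[i][k] + d[j][k] - d[i][j]) / 2
              for k in range(n) if k not in (i, j)}
    return (i, j), q, (len_i, len_j), to_new

# Hypothetical distances between five taxa (indices 0..4).
D = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, q_value, branches, to_new = neighbor_joining_step(D)
```

A full implementation would replace the joined pair with the new node in the matrix and repeat until the tree is resolved.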

===== Fitch-Margoliash =====
The Fitch–Margoliash method uses a weighted [[least squares]] method for clustering based on genetic distance. Closely related sequences are given more weight in the tree construction process to correct for the increased inaccuracy in measuring distances between distantly related sequences. The least-squares criterion applied to these distances is more accurate but less efficient than the neighbor-joining methods. An additional improvement that corrects for correlations between distances that arise from many closely related sequences in the data set can also be applied at increased computational cost.<ref>((Cite journal |last=Fitch |first=Walter M. |date=1967 |title=Construction of Phylogenetic Trees: A method based on mutation distances as estimated from cytochrome c sequences is of general applicability. |url=https://www.science.org/doi/10.1126/science.155.3760.279 |journal=Science |volume=155|issue=3760 |pages=279–284 |doi=10.1126/science.155.3760.279 |pmid=5334057 ))</ref>

== Data Mining and Machine Learning ==

=== Data Mining ===
A common function in data mining is applying [[cluster analysis]] to a given set of data to group items that are more similar to one another than to items in other groups. Distance matrices are heavily used in cluster analysis, since similarity can be measured with a distance metric. The distance matrix thus represents the similarity measure between all pairs of data in the set.

==== Hierarchical clustering ====
A distance matrix is necessary for traditional [[hierarchical clustering]] algorithms which are often heuristic methods employed in biological sciences such as phylogeny reconstruction. When implementing any of the hierarchical clustering algorithms in data mining, the distance matrix will contain all pair-wise distances between every point and then will begin to create clusters between two different points or clusters based entirely on distances from the distance matrix.

If N is the number of points, the complexity of hierarchical clustering is:

* Time complexity is <math>O(N^3)</math> due to the repetitive calculations done after every cluster to update the distance matrix
* Space complexity is <math>O(N^2)</math>

=== Machine Learning ===
Distance metrics are a key part of several machine learning algorithms, which are used in both [[Supervised learning|supervised]] and [[unsupervised learning]]. They are generally used to calculate the similarity between data points: this is where the distance matrix is an essential element. The use of an effective distance matrix improves the performance of the machine learning model, whether it is for classification tasks or for clustering.<ref>((Cite web |date=February 25, 2020 |title=4 types of distance metrics in machine learning |url=https://www.analyticsvidhya.com/blog/2020/02/4-types-of-distance-metrics-in-machine-learning/ ))</ref>

==== K-Nearest Neighbors ====
A distance matrix is utilized in the [[k-NN algorithm]] which is one of the slowest but simplest and most used instance-based machine learning algorithms that can be used both in classification and regression tasks. It is one of the slowest machine learning algorithms since each test sample's predicted result requires a fully computed distance matrix between the test sample and each training sample in the training set. Once the distance matrix is computed, the algorithm selects the K number of training samples that are the closest to the test sample to predict the test sample's result based on the selected set's majority (classification) or average (regression) value.

* Prediction time complexity is <math>O(k * n * d)</math>, to compute the distance between each test sample with every training sample to construct the distance matrix where:

# k = number of nearest neighbors selected 
# n = size of the training set
# d = number of dimensions being used for the data
This classification focused model predicts the label of the target based on the distance matrix between the target and each of the training samples to determine the K-number of samples that are the closest/nearest to the target.
((Photo montage|
| photo1a =DistanceMatrix_KNN.png((!))The distance matrix used to select K train samples for K-nn
| photo1b =K_nearestNeighborVisual.png((!))Machine Learning model predicting target value with K-NN
| size = 650
| border = 0
| color = transparent
))
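A brute-force version of this procedure is short enough to show in full. The sketch below is illustrative rather than optimized: the function name and the toy training set are assumptions, and Euclidean distance is used as one possible choice of metric.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k):
    """Predict the label of query by majority vote among its k nearest
    training samples, using Euclidean distance."""
    # One row of the distance matrix: query versus every training sample.
    distances = [math.dist(p, query) for p in train_points]
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional training set with two well-separated classes.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["red", "red", "red", "blue", "blue", "blue"]

near_origin = knn_classify(points, labels, (1, 1), k=3)
near_corner = knn_classify(points, labels, (5, 4), k=3)
```

For regression, the majority vote would be replaced by the average of the selected samples' target values.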

=== Computer Vision ===
A distance matrix can be used in [[Neural network|neural networks]] for 2D to 3D regression in image predicting machine learning models.

== Information retrieval ==

=== Distance matrices using Gaussian mixture distance ===

*[https://www.researchgate.net/publication/220723359_Evaluation_of_Distance_Measures_Between_Gaussian_Mixture_Models_of_MFCCs]* Gaussian mixture distance for performing accurate [[nearest neighbor search]] for information retrieval. Under an established Gaussian finite mixture model for the distribution of the data in the database, the Gaussian mixture distance is formulated based on minimizing the [[Kullback-Leibler divergence]] between the distribution of the retrieval data and the data in database. In the comparison of performance of the Gaussian mixture distance with the well-known [[Euclidean distance|Euclidean]] and [[Mahalanobis distance|Mahalanobis]] distances based on a precision performance measurement, experimental results demonstrate that the Gaussian mixture distance function is superior to the others for different types of testing data.

Another algorithm worth noting in the context of information retrieval is the [[Fish School Search]] algorithm, which uses distance matrices to model the collective behavior of fish schools. The fish update their weights with a feeding operator, and each position ((math|''x''((sub|''i''))(''t''))) is then moved relative to the barycenter of the school, ((math|''B''(''t''))), according to one of the following two equations:

Eq. A:

:<math>
x_i(t+1)=x_{i}(t)- step_{vol} rand(0,1)\frac{x_{i}(t) - B(t)}{distance(x_{i}(t),B(t))},
</math>

Eq. B:

:<math>
x_i(t+1)=x_{i}(t)+step_{vol} rand(0,1)\frac{x_{i}(t) - B(t)}{distance(x_{i}(t),B(t))},
</math>

<math>step_{vol}</math> defines the size of the maximum volume displacement performed with the distance matrix, specifically using a [[Euclidean distance]] matrix.

=== Evaluation of the similarity or dissimilarity of Cosine similarity and Distance matrices ===

[[File:SimilarityTOidistance.png|none|thumb|Conversion formula between cosine similarity and Euclidean Distance]]

*[https://www.sciencedirect.com/science/article/pii/S0020025507002630]* The [[Cosine similarity]] measure is perhaps the most frequently applied proximity measure in information retrieval; it measures the angles between documents in the search space on the basis of the cosine. Unlike cosine similarity, Euclidean distance is invariant to mean-correction. The sampling distribution of a mean is generated by repeated sampling from the same population and recording of the sample means obtained. This forms a distribution of different means, and this distribution has its own mean and variance. For data which can be negative as well as positive, the [[null distribution]] for cosine similarity is the distribution of the [[dot product]] of two independent random unit vectors. This distribution has a mean of zero and a variance of 1/n. [[Euclidean distance]], by contrast, is invariant to this correction.
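For vectors normalized to unit length, the two measures are linked by the standard identity ''d''<sup>2</sup> = 2(1 − cos θ). The Python sketch below checks it numerically on a hypothetical pair of vectors. The helper names are illustrative, and the identity shown is not necessarily the exact form used in the figure above.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical document vectors, normalized to unit length.
a = normalize([3, 4])
b = normalize([4, 3])

cos_ab = cosine_similarity(a, b)
dist_ab = math.dist(a, b)
# For unit vectors: squared Euclidean distance = 2 * (1 - cosine similarity).
identity_holds = math.isclose(dist_ab ** 2, 2 * (1 - cos_ab))
```

Because of this identity, ranking unit-length documents by increasing Euclidean distance gives the same order as ranking them by decreasing cosine similarity.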

=== Clustering Documents ===
Implementing [[hierarchical clustering]] with distance-based metrics to group similar documents requires a distance matrix. The distance matrix represents the degree of association between each pair of documents; it is used to create clusters of closely associated documents, which retrieval methods then use to find documents relevant to a user's query.

=== Isomap ===
[[Isomap]] uses distance matrices of [[geodesic distance]]s to compute lower-dimensional embeddings. This helps with collections of documents that reside in a very high-dimensional space and makes document clustering possible.

=== Neighborhood Retrieval Visualizer (NeRV) ===
NeRV is an algorithm used for both unsupervised and supervised visualization that uses distance matrices to find similar data based on the similarities shown on a display/screen.

The distance matrix needed for Unsupervised NeRV can be computed through fixed input pairwise distances.

The distance matrix needed for Supervised NeRV requires formulating a supervised distance metric to be able to compute the distance of the input in a supervised manner.

== Chemistry ==
The distance matrix is a mathematical object widely used in both graph-theoretical (topological) and geometric (topographic) versions of chemistry.<ref name=":2">((Cite journal |last=Mihalic |first=Zlatko |date=1992 |title=The distance matrix in chemistry |journal=Journal of Mathematical Chemistry |volume=11 |pages=223–258|doi=10.1007/BF01164206 |s2cid=121181446 ))</ref> The distance matrix is used in chemistry in both explicit and implicit forms.

=== Interconversion mechanisms between two permutational isomers ===
Distance matrices were used as the main approach to depict and reveal the shortest path sequence needed to determine the rearrangement between the two permutational isomers.

=== Distance Polynomials and Distance Spectra ===
Distance matrices are used explicitly to construct the distance polynomials and distance spectra of molecular structures.

=== Structure-property model ===
Distance matrices are used implicitly through the distance-based [[Wiener index]] (also called the Wiener number), which was formulated to represent the distances in chemical structures. The Wiener number is equal to half the sum of the elements of the distance matrix.
[[File:WeinerNumtoDistanceMatrix.png|thumb|Conversion formula between the Wiener number and the distance matrix|none]]
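
The half-sum relation can be checked with a short sketch. It assumes the molecule is given as a connected, unweighted graph of its carbon skeleton in adjacency-list form; the function names and the atom numbering are illustrative choices.

```python
from collections import deque

def graph_distance_matrix(adjacency):
    """Shortest path lengths (in bonds) between all pairs, via breadth-first search."""
    n = len(adjacency)
    dist = [[0] * n for _ in range(n)]
    for source in range(n):
        seen = {source}
        queue = deque([(source, 0)])
        while queue:
            vertex, d = queue.popleft()
            dist[source][vertex] = d
            for neighbor in adjacency[vertex]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, d + 1))
    return dist

def wiener_index(dist):
    """Half the sum of all entries of a symmetric distance matrix."""
    return sum(sum(row) for row in dist) // 2

# Carbon skeleton of n-hexane: a chain 0-1-2-3-4-5.
hexane = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
# Carbon skeleton of 2-methylpentane: chain 0-1-2-3-4 with atom 5 attached to atom 1.
methylpentane = [[1], [0, 2, 5], [1, 3], [2, 4], [3], [1]]

w_hexane = wiener_index(graph_distance_matrix(hexane))                # 35
w_methylpentane = wiener_index(graph_distance_matrix(methylpentane))  # 32
```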

=== Graph-theoretical Distance matrix ===
The graph-theoretical distance matrix is used in chemistry for the 2-D realization of molecular graphs, which illustrate the main structural features of a molecule in many applications.

[[File:Chem_DistanceMtrix.png|thumb|335x335px|Labeled tree representation of C<sub>6</sub>H<sub>14</sub>'s carbon skeleton based on its distance matrix]]
# Creating a labeled tree that represents the [[Skeletal formula|carbon skeleton]] of a molecule based on its distance matrix. The distance matrix is essential in this application because similar molecules can have many labeled-tree variants of their [[Skeletal formula|carbon skeleton]]. In the example, the labeled tree of the [[hexane]] (C<sub>6</sub>H<sub>14</sub>) carbon skeleton is built from its distance matrix; different carbon-skeleton variants change both the distance matrix and the labeled tree.
# Creating a labeled graph with edge weights, used in [[chemical graph theory]], that represents molecules with hetero-atoms.
# Speeding up the detection of the graph center in polycyclic graphs with the Le Verrier-Faddeev-Frame (LVFF) method, a computer-oriented method. LVFF requires a diagonalized distance matrix as input, which can be obtained with the Householder tridiagonal-QL algorithm: it takes a distance matrix and returns the diagonalized form needed by LVFF.
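
For small graphs, the graph center can also be read directly from the distance matrix as the set of vertices with the smallest eccentricity. The sketch below shows only that direct definition; it is not the LVFF method, and the function name <code>graph_center</code> and the path-graph example are assumptions made for illustration.

```python
def graph_center(dist):
    """Vertices whose largest distance to any other vertex is minimal."""
    eccentricity = [max(row) for row in dist]
    smallest = min(eccentricity)
    return [v for v, e in enumerate(eccentricity) if e == smallest]

# Distance matrix of a six-vertex path (the n-hexane carbon chain):
# the distance between atoms i and j is |i - j| bonds.
path_distances = [[abs(i - j) for j in range(6)] for i in range(6)]

center = graph_center(path_distances)
# center == [2, 3], the two middle atoms of the chain
```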

=== Geometric-Distance Matrix ===
[[File:Geometric_distance_matrix.png|thumb|338x338px|Geometric distance matrix for 2,4-dimethylhexane]]
While the graph-theoretical (2-D) distance matrix captures the constitutional features of a molecule, the molecule's three-dimensional (3-D) character is encoded in the geometric-distance matrix. The geometric-distance matrix is a different type of distance matrix, based on the graph-theoretical distance matrix of a molecule, that represents the molecule's 3-D structure.<ref name=":2" /> The geometric-distance matrix of a molecular structure ((Math|''G'')) is a real symmetric ((Math|''n'' × ''n'')) matrix defined in the same way as the 2-D matrix, except that each matrix element ((Math|''D''<sub>''ij''</sub>)) holds the shortest Cartesian distance between atoms ((Math|''i'')) and ((Math|''j'')) in ((Math|''G'')). Also known as the topographic matrix, the geometric-distance matrix can be constructed from the known geometry of the molecule. As an example, the geometric-distance matrix of the carbon skeleton of ''2,4-dimethylhexane'' is shown in the accompanying figure.
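
A minimal sketch of building such a matrix from atomic coordinates is given below. The three coordinates are made-up values chosen so the distances are easy to check by hand; they are not the geometry of any real molecule, and the function name <code>geometric_distance_matrix</code> is hypothetical.

```python
import math

def geometric_distance_matrix(coordinates):
    """Straight-line (Cartesian) distance between every pair of atoms."""
    return [[math.dist(p, q) for q in coordinates] for p in coordinates]

# Hypothetical 3-D coordinates for three atoms (arbitrary units).
atoms = [
    (0.0, 0.0, 0.0),
    (1.5, 0.0, 0.0),
    (1.5, 2.0, 0.0),
]
geometric = geometric_distance_matrix(atoms)
# geometric[0][1] is 1.5, geometric[1][2] is 2.0 and geometric[0][2] is 2.5
```

Unlike the graph-theoretical matrix, the entries are real numbers that change when the molecule's conformation changes, even if its bonding pattern stays the same.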

== Other Applications ==

=== Time Series Analysis ===
[[Dynamic time warping]] distance matrices are used with clustering and classification algorithms applied to a collection of time series.
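
A minimal sketch of such a matrix follows, assuming the textbook dynamic-programming form of dynamic time warping with the absolute difference as the local cost and no window constraint. The three series and the function names are illustrative.

```python
import math

def dtw_distance(s, t):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(s), len(t)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(s[i - 1] - t[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def dtw_distance_matrix(series):
    """Pairwise DTW distances for a collection of time series."""
    return [[dtw_distance(a, b) for b in series] for a in series]

# Hypothetical series; the second is the first with its start stretched.
series = [
    [0, 1, 2, 1, 0],
    [0, 0, 1, 2, 1, 0],
    [2, 2, 2, 2],
]
dtw_matrix = dtw_distance_matrix(series)
# dtw_matrix[0][1] == 0.0, dtw_matrix[0][2] == 6.0, dtw_matrix[1][2] == 8.0
```

The first two series differ only in timing, so their warped distance is zero even though they have different lengths; a matrix like this can then be passed to a clustering or nearest-neighbor classifier.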

==Examples==
For example, suppose these data are to be analyzed, where [[pixel]] [[Euclidean distance]] is the [[Metric (mathematics)|distance metric]].

[[Image:Clusters.svg|frame|none|Raw data]]

The distance matrix would be:

{| class="wikitable"
|-
! !! a !! b !! c !! d !! e !! f
|-
! a 
| 0 || 184 || 222 || 177 || 216 || 231
|-
! b 
| 184 || 0 || 45 || 123 || 128 || 200
|-
! c 
| 222 || 45 || 0 || 129 || 121 || 203
|-
! d 
| 177 || 123 || 129 || 0 || 46 || 83
|-
! e 
| 216 || 128 || 121 || 46 || 0 || 83
|-
! f 
| 231 || 200 || 203 || 83 || 83 || 0
|}

These data can then be viewed in graphic form as a [[heat map]].  In this image, black denotes a distance of 0 and white is maximal distance.
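
The mapping from distances to shades can be sketched as below, assuming a linear scale onto 8-bit gray levels (0 for black, 255 for white). The linear scaling and the 8-bit range are assumptions; the image may have been produced differently.

```python
labels = ["a", "b", "c", "d", "e", "f"]
distances = [
    [0, 184, 222, 177, 216, 231],
    [184, 0, 45, 123, 128, 200],
    [222, 45, 0, 129, 121, 203],
    [177, 123, 129, 0, 46, 83],
    [216, 128, 121, 46, 0, 83],
    [231, 200, 203, 83, 83, 0],
]

# Scale every distance linearly so that 0 maps to black (0)
# and the largest distance maps to white (255).
largest = max(max(row) for row in distances)
gray = [[round(255 * d / largest) for d in row] for row in distances]

# The diagonal is black, the a-f cell is white, and the close pair b-c is dark.
# gray[0][0] == 0, gray[0][5] == 255, gray[1][2] == 50
```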

[[Image:Distance matrix.PNG|frame|none|Graphical View]]

==See also==
* [[Computer Vision]]
* [[Data clustering]]
* [[Distance set]]
* [[Hollow matrix]]
* [[Min-plus matrix multiplication]]

==References==
((reflist))

((Matrix classes))

[[Category:Metric geometry]]
[[Category:Bioinformatics]]
[[Category:Matrices]]
[[Category:Graph distance]]
