A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight.^{[1]} That is, it is a spanning tree whose sum of edge weights is as small as possible.^{[2]} More generally, any edge-weighted undirected graph (not necessarily connected) has a minimum spanning forest, which is a union of the minimum spanning trees for its connected components.
There are many use cases for minimum spanning trees. One example is a telecommunications company trying to lay cable in a new neighborhood. If it is constrained to bury the cable only along certain paths (e.g. roads), then there would be a graph containing the points (e.g. houses) connected by those paths. Some of the paths might be more expensive, because they are longer, or require the cable to be buried deeper; these paths would be represented by edges with larger weights. Currency is an acceptable unit for edge weight – there is no requirement for edge lengths to obey normal rules of geometry such as the triangle inequality. A spanning tree for that graph would be a subset of those paths that has no cycles but still connects every house; there might be several spanning trees possible. A minimum spanning tree would be one with the lowest total cost, representing the least expensive path for laying the cable.
If there are n vertices in the graph, then each spanning tree has n − 1 edges.
There may be several minimum spanning trees of the same weight; in particular, if all the edge weights of a given graph are the same, then every spanning tree of that graph is minimum.
If each edge has a distinct weight then there will be only one, unique minimum spanning tree. This is true in many realistic situations, such as the telecommunications company example above, where it's unlikely any two paths have exactly the same cost. This generalizes to spanning forests as well.
Proof: Assume the contrary, that there are two different MSTs A and B. Since A and B differ despite containing the same vertices, there is at least one edge that belongs to one but not the other. Among such edges, let e_{1} be the one with least weight; this choice is unique because the edge weights are all distinct. Without loss of generality, assume e_{1} is in A. Adding e_{1} to B creates a cycle C. As a tree, B contains no cycles, therefore C must have an edge e_{2} that is not in A. Since e_{1} was chosen as the unique lowest-weight edge among those belonging to exactly one of A and B, the weight of e_{2} must be greater than the weight of e_{1}. Replacing e_{2} with e_{1} in B therefore yields a spanning tree with a smaller weight than B, contradicting the minimality of B.
More generally, if the edge weights are not all distinct then only the (multi)set of weights in minimum spanning trees is certain to be unique; it is the same for all minimum spanning trees.^{[3]}
If the weights are positive, then a minimum spanning tree is, in fact, a minimum-cost subgraph connecting all vertices, since if a subgraph contains a cycle, removing any edge along that cycle will decrease its cost and preserve connectivity.
For any cycle C in the graph, if the weight of an edge e of C is larger than any of the individual weights of all other edges of C, then this edge cannot belong to an MST.
Proof: Assume the contrary, i.e. that e belongs to an MST T_{1}. Then deleting e will break T_{1} into two subtrees with the two ends of e in different subtrees. The remainder of C reconnects the subtrees, hence there is an edge f of C with ends in different subtrees, i.e., it reconnects the subtrees into a tree T_{2} with weight less than that of T_{1}, because the weight of f is less than the weight of e.
For any cut C of the graph, if the weight of an edge e in the cutset of C is strictly smaller than the weights of all other edges of the cutset of C, then this edge belongs to all MSTs of the graph.
Proof: Assume that there is an MST T that does not contain e. Adding e to T will produce a cycle that crosses the cut once at e and crosses back at another edge e'. Deleting e' we get a spanning tree T∖{e' } ∪ {e} of strictly smaller weight than T. This contradicts the assumption that T was an MST.
By a similar argument, if more than one edge is of minimum weight across a cut, then each such edge is contained in some minimum spanning tree.
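On a small graph, the cut property can be verified exhaustively: enumerate all spanning trees, keep the minimum-weight ones, and check that the unique lightest edge crossing a chosen cut appears in every one. A minimal Python sketch (the example graph and the chosen cut are illustrative, not from the text above):

```python
from itertools import combinations

# Edges of a small weighted graph on vertices 0..3: (weight, u, v).
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
n = 4

def is_spanning_tree(subset):
    # n-1 edges that never close a cycle form a spanning tree.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for _, u, v in subset:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False            # adding this edge would close a cycle
        parent[ru] = rv
    return True

trees = [t for t in combinations(edges, n - 1) if is_spanning_tree(t)]
best = min(sum(w for w, _, _ in t) for t in trees)
msts = [t for t in trees if sum(w for w, _, _ in t) == best]

# Cut {0, 1} vs {2, 3}: the crossing edges are (2,1,2), (4,0,3), (5,0,2);
# the unique lightest one, (2,1,2), must lie in every MST.
assert all((2, 1, 2) in t for t in msts)
```

Because the example's weights are distinct, the brute-force search also confirms the uniqueness property stated earlier: exactly one MST is found.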
If the minimum-cost edge e of a graph is unique, then this edge is included in any MST.
Proof: If e were not included in the MST, removing any of the (larger-cost) edges in the cycle formed after adding e to the MST would yield a spanning tree of smaller weight.
If T is a tree of MST edges, then we can contract T into a single vertex while maintaining the invariant that the MST of the contracted graph plus T gives the MST for the graph before contraction.^{[4]}
In all of the algorithms below, m is the number of edges in the graph and n is the number of vertices.
The first algorithm for finding a minimum spanning tree was developed by the Czech scientist Otakar Borůvka in 1926 (see Borůvka's algorithm). Its purpose was an efficient electrical coverage of Moravia. The algorithm proceeds in a sequence of stages. In each stage, called a Borůvka step, it identifies a forest F consisting of the minimum-weight edge incident to each vertex in the graph G, then forms the graph G_{1} = G \ F as the input to the next step. Here G \ F denotes the graph derived from G by contracting edges in F (by the Cut property, these edges belong to the MST). Each Borůvka step takes linear time. Since the number of vertices is reduced by at least half in each step, Borůvka's algorithm takes O(m log n) time.^{[4]}
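The stage structure can be sketched in code. A minimal Python illustration (the function name and edge-list representation are ours; a connected graph with distinct weights is assumed, since ties would require a tie-breaking rule, and contraction is simulated with a union-find structure rather than by rebuilding G_{1}):

```python
# Sketch of Boruvka's algorithm: vertices are 0..n-1, edges are (weight, u, v)
# tuples with distinct weights, and the graph is assumed connected.
def boruvka_mst(n, edges):
    parent = list(range(n))              # union-find over contracted components

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    mst, total = [], 0
    num_components = n
    while num_components > 1:
        # One Boruvka step: cheapest edge leaving each current component.
        cheapest = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue                 # edge already internal to a component
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][0]:
                    cheapest[r] = (w, u, v)
        # Contract the selected forest F by uniting endpoint components.
        for w, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                mst.append((u, v))
                total += w
                num_components -= 1
    return total, mst
```

Each pass over the edge list is linear, and every step at least halves the number of components, matching the O(m log n) bound above.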
A second algorithm is Prim's algorithm, which was invented by Vojtěch Jarník in 1930 and rediscovered by Prim in 1957 and Dijkstra in 1959. Basically, it grows the MST (T) one edge at a time. Initially, T contains an arbitrary vertex. In each step, T is augmented with a least-weight edge (x,y) such that x is in T and y is not yet in T. By the Cut property, all edges added to T are in the MST. Its runtime is either O(m log n) or O(m + n log n), depending on the data structures used.
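The growth step can be sketched with a binary heap holding the candidate edges leaving T (the adjacency-list representation and function name are illustrative); with a binary heap this is the O(m log n) variant, while a Fibonacci heap would give O(m + n log n):

```python
import heapq

# Sketch of Prim's algorithm: adj maps each vertex to a list of
# (weight, neighbor) pairs for an undirected connected graph.
def prim_mst(adj, start=0):
    visited = {start}
    heap = [(w, start, v) for w, v in adj[start]]   # edges leaving T
    heapq.heapify(heap)
    total, tree_edges = 0, []
    while heap and len(visited) < len(adj):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue                # stale entry: both ends already in T
        visited.add(v)              # (u, v) has u in T, v outside: Cut property
        tree_edges.append((u, v))
        total += w
        for w2, x in adj[v]:
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return total, tree_edges
```

Stale heap entries (edges whose far endpoint has since joined T) are discarded lazily at pop time, which keeps the sketch short at the cost of a slightly larger heap.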
A third algorithm commonly in use is Kruskal's algorithm, which also takes O(m log n) time.
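Kruskal's algorithm can be sketched similarly: scan the edges in nondecreasing weight order and keep each edge that joins two different components (the function name and representation below are ours). The initial sort dominates, giving the O(m log n) bound:

```python
# Sketch of Kruskal's algorithm: vertices are 0..n-1, edges are (weight, u, v).
def kruskal_mst(n, edges):
    parent = list(range(n))              # union-find over components

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):        # nondecreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                     # edge joins two components: keep it
            parent[ru] = rv
            tree.append((u, v))
            total += w
            if len(tree) == n - 1:       # spanning tree complete
                break
    return total, tree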
A fourth algorithm, not as commonly used, is the reverse-delete algorithm, which is the reverse of Kruskal's algorithm. Its runtime is O(m log n (log log n)^{3}).
All four of these are greedy algorithms. Since they run in polynomial time, the problem of finding such trees is in FP, and related decision problems such as determining whether a particular edge is in the MST or determining if the minimum total weight exceeds a certain value are in P.
Several researchers have tried to find more computationally efficient algorithms.
In a comparison model, in which the only allowed operations on edge weights are pairwise comparisons, Karger, Klein & Tarjan (1995) found a linear time randomized algorithm based on a combination of Borůvka's algorithm and the reverse-delete algorithm.^{[5]}^{[6]}
The fastest non-randomized comparison-based algorithm with known complexity, by Bernard Chazelle, is based on the soft heap, an approximate priority queue.^{[7]}^{[8]} Its running time is O(m α(m,n)), where α is the classical functional inverse of the Ackermann function. The function α grows extremely slowly, so that for all practical purposes it may be considered a constant no greater than 4; thus Chazelle's algorithm takes very close to linear time.
If the graph is dense (i.e. m/n ≥ log log log n), then a deterministic algorithm by Fredman and Tarjan finds the MST in time O(m).^{[9]} The algorithm executes a number of phases. Each phase executes Prim's algorithm many times, each for a limited number of steps. The runtime of each phase is O(m + n). If the number of vertices before a phase is n′, the number of vertices remaining after a phase is at most n′/2^{m/n′}. Hence, at most log* n phases are needed, which gives a linear runtime for dense graphs.^{[4]}
There are other algorithms that work in linear time on dense graphs.^{[7]}^{[10]}
If the edge weights are integers represented in binary, then deterministic algorithms are known that solve the problem in O(m + n) integer operations.^{[11]} Whether the problem can be solved deterministically for a general graph in linear time by a comparisonbased algorithm remains an open question.
Given graph G where the nodes and edges are fixed but the weights are unknown, it is possible to construct a binary decision tree (DT) for calculating the MST for any permutation of weights. Each internal node of the DT contains a comparison between two edges, e.g. "Is the weight of the edge between x and y larger than the weight of the edge between w and z?". The two children of the node correspond to the two possible answers "yes" or "no". In each leaf of the DT, there is a list of edges from G that correspond to an MST. The runtime complexity of a DT is the largest number of queries required to find the MST, which is just the depth of the DT. A DT for a graph G is called optimal if it has the smallest depth of all correct DTs for G.
For every integer r, it is possible to find optimal decision trees for all graphs on r vertices by bruteforce search. This search proceeds in two steps.
A. Generating all potential DTs
B. Identifying the correct DTs. To check if a DT is correct, it should be checked on all possible permutations of the edge weights.
Hence, the total time required for finding an optimal DT for all graphs with r vertices is:^{[4]} 2^{2^{r^{2}+o(r)}}, which is less than 2^{2^{2r^{2}}}.
See also: Decision tree model 
Seth Pettie and Vijaya Ramachandran have found a provably optimal deterministic comparison-based minimum spanning tree algorithm.^{[4]} The following is a simplified description of the algorithm.
The runtime of all steps in the algorithm is O(m), except for the step of using the decision trees. The runtime of this step is unknown, but it has been proved that it is optimal – no algorithm can do better than the optimal decision tree. Thus, this algorithm has the peculiar property that it is provably optimal although its runtime complexity is unknown.
Further information: Parallel algorithms for minimum spanning trees 
Research has also considered parallel algorithms for the minimum spanning tree problem. With a linear number of processors it is possible to solve the problem in O(log n) time.^{[12]}^{[13]}
The problem can also be approached in a distributed manner. If each node is considered a computer and no node knows anything except its own connected links, one can still calculate the distributed minimum spanning tree.
Alan M. Frieze showed that given a complete graph on n vertices, with edge weights that are independent identically distributed random variables with distribution function F satisfying F′(0) > 0, then as n approaches +∞ the expected weight of the MST approaches ζ(3)/F′(0), where ζ is the Riemann zeta function (more specifically, ζ(3) is Apéry's constant). Frieze and Steele also proved convergence in probability. Svante Janson proved a central limit theorem for the weight of the MST.
For uniform random weights in [0,1], the exact expected size of the minimum spanning tree has been computed for small complete graphs.^{[14]}
Vertices  Expected size               Approximate expected size
2         1/2                         0.5
3         3/4                         0.75
4         31/35                       0.8857143
5         893/924                     0.9664502
6         278/273                     1.0183151
7         30739/29172                 1.053716
8         199462271/184848378         1.0790588
9         126510063932/115228853025   1.0979027
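These expected sizes can be checked by simulation. A minimal Python sketch (sample size, seed, and helper names are arbitrary) that estimates the expected MST weight of the complete graph on 4 vertices with independent uniform [0,1] edge weights, which should land near 31/35 ≈ 0.886:

```python
import random
from itertools import combinations

def mst_weight(n, edges):
    # Kruskal's algorithm with a simple union-find.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    total, used = 0.0, 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
            used += 1
            if used == n - 1:
                break
    return total

random.seed(0)
n, trials = 4, 20000
est = 0.0
for _ in range(trials):
    # Complete graph on n vertices with fresh uniform [0,1] edge weights.
    edges = [(random.random(), u, v) for u, v in combinations(range(n), 2)]
    est += mst_weight(n, edges)
est /= trials
print(round(est, 3))   # close to 31/35
```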
There is a fractional variant of the MST, in which each edge is allowed to appear "fractionally". Formally, a fractional spanning set of a graph (V,E) is a nonnegative function f on E such that, for every nontrivial subset W of V (i.e., W is neither empty nor equal to V), the sum of f(e) over all edges connecting a node of W with a node of V\W is at least 1. Intuitively, f(e) represents the fraction of e that is contained in the spanning set. A minimum fractional spanning set is a fractional spanning set for which the sum is as small as possible.
If the fractions f(e) are forced to be in {0,1}, then the set T of edges with f(e)=1 are a spanning set, as every node or subset of nodes is connected to the rest of the graph by at least one edge of T. Moreover, if f minimizes, then the resulting spanning set is necessarily a tree, since if it contained a cycle, then an edge could be removed without affecting the spanning condition. So the minimum fractional spanning set problem is a relaxation of the MST problem, and can also be called the fractional MST problem.
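The definition can be checked directly on a small example: on the triangle K_{3}, assigning f(e) = 1/2 to every edge satisfies each nontrivial cut constraint with total weight 3/2, whereas any integral spanning tree needs two edges. A minimal Python sketch (the example graph and helper name are illustrative):

```python
from itertools import combinations, chain

# Triangle K3 on vertices 0, 1, 2 with f(e) = 1/2 on every edge.
vertices = [0, 1, 2]
f = {frozenset(e): 0.5 for e in combinations(vertices, 2)}

def is_fractional_spanning_set(vertices, f):
    # Every nontrivial subset W needs total f-weight >= 1 across the cut (W, V\W).
    nontrivial = chain.from_iterable(
        combinations(vertices, k) for k in range(1, len(vertices)))
    for W in nontrivial:
        W = set(W)
        crossing = sum(w for e, w in f.items()
                       if len(e & W) == 1)   # exactly one endpoint inside W
        if crossing < 1:
            return False
    return True

assert is_fractional_spanning_set(vertices, f)
assert sum(f.values()) == 1.5               # beats any spanning tree's 2 edges
```

The gap between 3/2 and 2 illustrates that the fractional problem is a strict relaxation of the MST problem on this graph.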
The fractional MST problem can be solved in polynomial time using the ellipsoid method.^{[15]}^{: 248 } However, if we add a requirement that f(e) must be half-integer (that is, f(e) must be in {0, 1/2, 1}), then the problem becomes NP-hard,^{[15]}^{: 248 } since it includes as a special case the Hamiltonian cycle problem: in an n-vertex unweighted graph, a half-integer MST of weight n/2 can only be obtained by assigning weight 1/2 to each edge of a Hamiltonian cycle.
Minimum spanning trees have direct applications in the design of networks, including computer networks, telecommunications networks, transportation networks, water supply networks, and electrical grids (which they were first invented for, as mentioned above).^{[29]} They are invoked as subroutines in algorithms for other problems, including the Christofides algorithm for approximating the traveling salesman problem,^{[30]} approximating the multiterminal minimum cut problem (which is equivalent in the singleterminal case to the maximum flow problem),^{[31]} and approximating the minimumcost weighted perfect matching.^{[32]}
Other practical applications based on minimal spanning trees include: