We can take the locations as vertices of a graph, and the canals as the edges of the graph, where the edges are labelled with a weight, giving us the cost of constructing that canal. We must then find a subgraph that passes through each vertex. Clearly this subgraph must be a tree, because if it wasn't a tree it must contain a cycle, and the cycle can be broken by deleting an edge, reducing the cost, and not disconnecting the graph.
What follows are two algorithms that compute the MST, and a proof that they are correct.
Kruskal's Algorithm
This is a greedy algorithm, at each stage making the greediest
possible choice (ie. locally best) with no concern to the implications
of this choice on future choices. Such techniques generally do not
deliver optimal solutions, but in this case it does, and we give a
proof of this.
Kruskal's algorithm can be stated as follows:
0. Create a minimum spanning tree T that initially contains no edges, 1. Choose an edge e in G, where (a) e is not in T and ... (b) e is of minimum weight and ... (c) e does not create a cycle in T 2. If T does not contain all the vertices of G go to step 1.The tree T, produced by the above algorithm, is a MST. We prove this by contradiction.
Proof of Kruskal's algorithm
Let T be a minimum spanning tree of G, and assume (wrongly, as we will
show) that there exists another minimum spanning tree S, such that
weight of S (ie. W(S)) is less than weight of T (ie. W(S) < W(T))
1. Let e be the edge of least weight in T that is not in S 2. Add edge e to S (a) This creates a cycle C, and C contains e (b) Cycle C contains an edge e', where e' is not in T. Proof: This must be the case otherwise all edges in C - e are already in T, and T would also contain a cycle, and would not be a tree. (c) If we replace e' in S by e we get a spanning tree S' where (i) cost of e <= cost e' Proof: e and e' must be coincident on one vertex in common, and the above algorithm would have chosen e in preference to e' to create T. (ii) S' is now one edge closer to T than S' is to T. 3. W(S') <= W(S). We can now repeat from step 1 until S' = T 4. Process terminates with S' = T and W(T) <= W(S). This contradicts initial assumption, that there can be another MST, S< of less cost than T. Consequently Kurskal's algorithm is sound.Prim's algorithm
1. Put an arbitrary vertex into T 2. Add an edge e of minimum weight to T, where e joins a vertex in T to a vertex not in T.More specifically,
0. let U be the set of vertices not in the minimum spanning tree T, and initialise U to be the set {1} let V-U be the set of vertices not in the minimum spanning tree, and initialise V-U to be {2,3,4,...,n} let T be the set of edges in the minimum spanning tree, and initialise T to be the empty set {} 1. let e be the lowest cost edge in G that joins a vertex in U to a vertex in V-U 2. Add edge e to T 3. let w be the (second e), ie w is the vertex in V-U that is an end vertex of the edge e 4. Add vertex w to U 5. Remove vertex w from V-U 6. If the set V-U is not empty then go to step 1 otherwise deliver TThe above algorithm can be shown to be sound using a similar proof to the one above.
Complexity of Prim
Assume the graph G is complete. At each iteration, step 1 compares the
distance of each of the vertices in U to each of the vertices in V-U.
Initially there will be 1 vertex in U and n-1 vertices in V-U, leading to
n-1 looks at edge weights. Next time round there will be 2 vertices in U and
n-2 vertices in V-U, and we examine 2(n-2) edges. Next time there will
be 3 vertices in U, (n-3) vertices in V-U and 3(n-3) edges will be
examined ... and so on. So we have
Sn = 1.(n-1) + 2(n-2) + 3(n-3) ... n(n-(n-1)) = n + 2n + 3n + 4n ... - (1 + 4 + 9 + 16 ...) = n(1 + 2 + 3 + 4 ...) - (1 + 4 + 9 + 16 ...) = (n*n*n - n*n - n)/6 = O(n*n*n)Therefore, for a complete graph, the above implementation of Prim's algorithm is of cubic complexity.
Prim Demo
A scheme encoding of the above algorithm can be found in
/home/s7/pat/scheme/pub/prim.scm
When you load the file it will give a small guide to using a demo
for Prim's algorithm. This will create a complete graph of n vertices
and will compute the MST. As it goes it will draw out the edges of the
MST. I would suggest that you do not go beyond a graph with 100
vertices, as it take considerable time to create the graph.