History of Path-based DFS for Strong Components


Several versions of the path-based algorithm for strongly connected components have been proposed over the years and are surveyed below. There may be others I don't know of! For completeness note Bob Tarjan's original SC algorithm from 1972 [T]. 1

Paul Purdom [P] included an SC algorithm in his 1970 paper on computing the transitive closure. The algorithm uses an array Stack for
            "a list of nodes on the path being investigated for cycles",
and the path is grown depth-first. Cycles are contracted when they are discovered, thus maintaining the path. Since transitive closure time is superlinear, optimizing the SC algorithm wasn't necessary, and the contraction is done by simply combining adjacency lists.

In 1971 Ian Munro [M] stated the path-based SC algorithm. Contraction is done using a disjoint set-union algorithm. The paper cites the strategy of merging the smaller set into the larger, giving the algorithm a time bound of O(m+n log n). The faster set merging algorithms developed later improve the time bound, and in fact using the incremental tree set merging algorithm of [GT] gives linear running time O(m+n).

In his 1976 book on beautiful algorithms Edsger Dijkstra [D] develops the path-based algorithm from first principles. Dijkstra does not mention depth-first search explicitly (nor do Purdom or Munro) but when it comes time to structure the search he says [D,p.195] the active part of the search will contain
            "no more and no less than a single directed path"
because
            "Firstly ... there does not seem to be much advantage in introducing disconnected [ie, nonpath] maximal strong components"
and
            "Secondly, the directed path ... is easily maintained".2
The top-down development is similar to [G], and the three arrays used -- rank, cc, and knar -- are respectively I, B and S of [G]. Additionally Dijkstra uses two arrays of edges, to avoid recursion. (Both arrays are essentially provided automatically in the recursive versions below.) Dijkstra's algorithm has running time O(m+n) and is more practical than one using general set merging techniques to do the contractions.

In 1996 Joseph Cheriyan and Kurt Mehlhorn [CM; see also MN] presented a very similar algorithm (as part of a larger work on algorithms for dense graphs). The algorithm is developed in the context of the depth-first spanning tree, which by then had become synonymous with dfs. Three arrays are used -- dfsnum, unfinished, and root, where unfinished is S and the other two are variants of I and B respectively.

In 2000 I rediscovered the path-based SC algorithm [G]. (The set-merging version was described in passing in [G2, 1991 TR].)

All these independent discoveries give credence to path-based dfs as a natural approach.


References


[CM] J. Cheriyan and K. Mehlhorn, Algorithms for dense graphs and networks on the random access computer, Algorithmica 15(1996), 521-549.

[D] E.W. Dijkstra, A Discipline of Programming, Prentice Hall, NJ, 1976, Ch.25.

[G] H.N. Gabow, Path-based depth-first search for strong and biconnected components, Information Processing Letters 74(2000), 107-114.

[G2] H.N. Gabow, Applications of a poset representation to edge connectivity and graph rigidity, Proc. 32nd Annual IEEE Symp. on Foundations of Comp. Sci. (1991), 812-821; also Tech. Rept. CU-CS-545-91, Sept. 1991, 66 pages; final version ``The minimal-set poset for edge connectivity,'' 2011, 44 pages, submitted for publication.

[GT] H.N. Gabow and R.E. Tarjan, A linear-time algorithm for a special case of disjoint set union, J. Comp. and Sys. Sci 30(1985), 209-221.

[M] I. Munro, Efficient determination of the transitive closure of a directed graph, Information Processing Letters 1(1971), 56-58.

[MN] K. Mehlhorn and S. Naher, LEDA: A platform for combinatorial and geometric computing, Cambridge University Press, NY, 1999, Sec. 7.4.2-7.4.3.

[P] P. Purdom Jr., A transitive closure algorithm, BIT 10 (1970), 76-94.

[T] R.E. Tarjan, Depth-first search and linear graph algorithms, SIAM J. Computing 1 (1972), 146-160.



1 I've translated data structures into the language of [G], even though the earlier algorithms might warrant translating [G] into their data structures.

2 In Dijkstra-ese the path is "cycle in statu nascendi".