Computing Strongly Connected Components in Parallel on CUDA (full version)

by Jiří Barnat, Petr Bauch, Luboš Brim, Milan Češka, July 2010, 24 pages.

FIMU-RS-2010-10. Available as Postscript, PDF.


The problem of decomposition of a directed graph into its strongly connected components is a fundamental graph problem inherently present in many scientific and commercial applications. In this paper we show how existing parallel algorithms can be reformulated in order to be accelerated by NVIDIA CUDA technology. In particular, we design a new CUDA-aware procedure for pivot selection and we redesign the parallel algorithms in order to allow for CUDA accelerated computation. We also experimentally demonstrate that with a single GTX 280 GPU card we can easily outperform optimal serial CPU implementation, which is particularly interesting result as unlike the serial CPU case, the asymptotic complexity of the parallel algorithms is not optimal.