Skip to content
icoming edited this page Jun 12, 2014 · 22 revisions

FlashGraph is an SSD-based semi-external memory graph processing engine, and is optimized for a high-speed SSD array. It enables us to process a billion-node graph in a single machine and has performance comparable to or exceed in-memory graph engines such as PowerGraph. It also has very short loading time.

Here is some performance result of FlashGraph and PowerGraph on large graphs in a single machine. We have five graph applications: breadth-first search (BFS), triangle counting (TC), weakly connected components (WCC), scan statistics (SS), page rank (PR). Each of them has a FlashGraph implementation and a PowerGraph implementation. We use these applications to compare the performance of FlashGraph and PowerGraph on the twitter graph (42M vertices and 1.5B edges) and the subdomain graph (89M vertices and 2B edges). Figure 8 shows the runtime of the applications in both graph engines and Figure 9 shows the memory consumption of FlashGraph and PowerGraph. Although FlashGraph runs in the semi-external memory mode, it can still significantly outperform PowerGraph in most graph applications and it uses a small fraction of the memory used by PowerGraph.

runtime on twitter and subdomain

memory consumption

FlashGraph enables us to process a very large graph in a single machine. Here we demonstrate that FlashGraph can process a real-world hyperlink page graph (3.4B vertices and 129B edges) in a single machine. The table below show the performance of the graph applications as above as well as diameter estimation (DE) on the page graph.

runtime on page graph

Clone this wiki locally