About

Distributed Graph Analytics (DGA) is a compendium of graph analytics written for Bulk Synchronous Parallel (BSP) processing frameworks such as Giraph and GraphX. The analytics included are High Betweenness Set Extraction, Weakly Connected Components, PageRank, Leaf Compression, and Louvain Modularity.

Who can use this?

Anyone who has a data set and wants to do data analysis! We package analytics implemented in both Giraph and GraphX. Some knowledge of cluster computing, Java, and Linux is helpful, but it is not strictly necessary.

Why would I want this?

Tools like Gephi are nice, but they can only handle small data sets on a single machine. DGA uses Hadoop, Giraph, and GraphX to distribute the analytics across a cluster, so it can process much larger data sets in parallel.

Currently Supported Analytics

Giraph

GraphX

Not Included In GraphX

How Do I Get It?

See: How To Get DGA