Leaf Compression

Leaf Compression is an analytic that compresses a graph by iteratively pruning its leaves. A node on the periphery of the graph that has only a single connection notifies the node it is connected to and is then removed from the graph.

Algorithm

  1. If you have exactly one edge, notify the vertex at the other end of that edge and mark yourself for deletion.
  2. Process all incoming messages: for each message, add the value that was sent to you, plus 1 for the sender, to your vertex value.
  3. Remove all vertices that sent messages.
  4. Repeat until no leaves remain.
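
The steps above can be sketched on a single machine as follows. This is a minimal illustration only, not the DGA-Giraph or DGA-GraphX implementation. The function name `leaf_compress` is hypothetical, edges are treated as undirected, and every vertex value is assumed to start at 0. The handling of two leaves that point at each other (one of them is kept) is also an assumption of this sketch.

```python
def leaf_compress(edges):
    """Prune degree-1 vertices until none remain (illustrative sketch).

    Returns (adjacency, values) for the surviving vertices, where
    values[v] counts the vertices that were folded into v.
    """
    # Build an undirected adjacency map; self-loops are ignored (assumption).
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    values = {v: 0 for v in adj}  # assumed initial vertex value

    while True:
        marked = set()   # leaves marked for deletion this round
        messages = []    # (receiving vertex, value sent by the leaf)

        # Step 1: each vertex with exactly one edge notifies its neighbor.
        for v, neighbors in adj.items():
            if len(neighbors) != 1:
                continue
            (target,) = neighbors
            if target in marked:
                # Two leaves joined only to each other: keep this one
                # (assumption of this sketch).
                continue
            marked.add(v)
            messages.append((target, values[v]))

        if not marked:
            break  # Step 4: stop once no leaves remain.

        # Step 2: add the sent value plus 1 for the sender.
        for target, sent in messages:
            values[target] += sent + 1

        # Step 3: remove every vertex that sent a message.
        for v in marked:
            for n in adj.pop(v):
                adj[n].discard(v)

    return adj, {v: values[v] for v in adj}


if __name__ == "__main__":
    # A two-vertex tail (a-b) hanging off a triangle (c-d-e).
    edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "c")]
    remaining, counts = leaf_compress(edges)
    print(sorted(remaining), counts)
```

In this example the tail is pruned in two rounds: `a` is folded into `b`, then `b` into `c`, so `c` ends with a value of 2 and the triangle is left intact because none of its vertices has exactly one edge.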

What kind of data can be used?

Any kind of graph data can be used with Leaf Compression. The graph can be directed or undirected.

Leaf Compression Configuration

There are no configuration options.

How To Run

Note: First see How To Build

How To Run DGA-Giraph
    
        $ ./bin/dga-giraph lc /path/to/input /path/to/output
    
How To Run DGA-GraphX
    
        $ ./dga-graphx lc -i hdfs://url.for.namenode:port/path/to/input -o hdfs://url.for.namenode:port/path/to/output