GraphBuilder addresses a gap in the analysis of graph-based data: frameworks exist to process and analyze graphs, but data scientists have few options for actually creating large graphs from unstructured data in a form those frameworks can digest.
It comes with a graph construction library containing algorithms for parallel graph construction, transformation, and verification, all in support of graph mining. GraphBuilder can also partition and serialize large-scale graphs for ingestion by other machine learning frameworks. The library connects to Apache Hadoop and can create graphs from the big data sets available through it, which can then be analyzed further.
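To make the stages named above concrete, here is a minimal single-process sketch in Python of forming, cleaning, partitioning, and serializing a graph from raw text. It is not GraphBuilder's API and does not use Hadoop; the sample pages, the `[[Target]]` link syntax, and every function name are assumptions made up for illustration.

```python
import json
import re
import zlib
from collections import defaultdict

# Hypothetical input: raw wiki-style pages whose [[Target]] links become edges.
PAGES = {
    "A": "See [[B]] and [[C]]. Also [[B]] again, and [[A]] itself.",
    "B": "Links to [[C]].",
    "C": "No outgoing links here.",
}

# Assumed link syntax: [[Target]]
LINK = re.compile(r"\[\[([^\]]+)\]\]")


def form_edges(pages):
    """Graph formation: emit one (source, target) pair per link found in the text."""
    return [(src, dst) for src, text in pages.items() for dst in LINK.findall(text)]


def clean_edges(edges):
    """Cleaning: drop self-loops and duplicate edges, returning a sorted list."""
    return sorted({(s, d) for s, d in edges if s != d})


def partition_edges(edges, num_parts):
    """Partitioning: place each edge in a part chosen by a deterministic hash of its source."""
    parts = defaultdict(list)
    for s, d in edges:
        part_id = zlib.crc32(s.encode("utf-8")) % num_parts
        parts[part_id].append((s, d))
    return dict(parts)


def serialize(parts):
    """Serialization: one JSON string per partition, ready to be written to a file."""
    return {part_id: json.dumps(edges) for part_id, edges in parts.items()}
```

Chaining the calls as `serialize(partition_edges(clean_edges(form_edges(PAGES)), 2))` yields one JSON string per non-empty partition. Hashing on the source vertex keeps all of a vertex's outgoing edges in the same partition, which is one simple placement choice among many; a real large-scale tool would run these stages in parallel rather than in memory.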
The tool is written in Java, and Intel's examples suggest a Java programmer can build an internet-scale graph for PageRank in about 100 lines of code and a Wikipedia-sized graph for LDA in about 130. The idea grew out of earlier work at the University of Washington in Seattle, where researchers developed GraphLab, a new framework designed specifically for graph-based parallel machine learning. In many cases, GraphLab can process such graphs 20-50X faster than Hadoop MapReduce. However, it soon became clear that there was no credible way to construct the large-scale graphs that frameworks like GraphLab could digest. GraphBuilder emerged from the research that followed.
Intel says GraphBuilder differentiates itself from other products in that it not only constructs large-scale graphs quickly but also offloads many of the complexities of graph construction, including graph formation, cleaning, compression, partitioning, and serialization. Licensed under the Apache 2.0 license, GraphBuilder is currently in beta and can be downloaded from its website.