Open Source GraphBuilder From Intel To Graph Big Data Insights Like No Other


Written by Sudheer Raju | 10 December 2012

Intel's latest effort in big data innovation is the beta release of a new open source tool, GraphBuilder, which lets the development community build large graphs from volumes of unstructured data and process them further to produce actionable insights.

GraphBuilder addresses a gap in the analysis of graph-based data: frameworks exist to process and analyze graphs, but data scientists have few options for actually creating large graphs from unstructured data in a form those frameworks can digest.

It comes with a graph construction library that provides algorithms for parallel graph construction, transformation, and verification, all of which feed into graph mining. GraphBuilder can also partition and serialize large-scale graphs for ingestion by other machine learning frameworks. The library connects to Apache Hadoop and can create graphs from the big data sets stored there, which can then be analyzed further.
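As a rough illustration of what graph formation, cleaning, and partitioning involve, here is a minimal single-machine sketch in plain Java. It does not use GraphBuilder's actual API or Hadoop; the class name `EdgeListSketch`, the `page: link1 link2` input format, and the hash-by-source partitioning rule are all assumptions made for this example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of GraphBuilder.
final class EdgeListSketch {

    /**
     * Formation and cleaning: parses lines of the assumed form
     * "page: link1 link2 ..." into directed (source, target) edges,
     * skipping malformed lines, self-loops, and duplicate edges.
     */
    static Set<Map.Entry<String, String>> buildEdges(List<String> lines) {
        Set<Map.Entry<String, String>> edges = new LinkedHashSet<>();
        for (String line : lines) {
            String[] parts = line.split(":", 2);
            if (parts.length < 2) {
                continue; // no "source:" prefix, treat as malformed
            }
            String source = parts[0].trim();
            if (source.isEmpty()) {
                continue;
            }
            for (String target : parts[1].trim().split("\\s+")) {
                if (!target.isEmpty() && !target.equals(source)) {
                    // The set silently drops edges it has already seen.
                    edges.add(new AbstractMap.SimpleImmutableEntry<>(source, target));
                }
            }
        }
        return edges;
    }

    /**
     * Partitioning: assigns each edge to one of n buckets by hashing its
     * source vertex, so all out-edges of a vertex land in the same bucket.
     */
    static List<List<Map.Entry<String, String>>> partition(
            Set<Map.Entry<String, String>> edges, int n) {
        List<List<Map.Entry<String, String>>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Map.Entry<String, String> edge : edges) {
            int index = Math.floorMod(edge.getKey().hashCode(), n);
            buckets.get(index).add(edge);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("A: B C B A", "B: A", "garbage");
        Set<Map.Entry<String, String>> edges = buildEdges(raw);
        System.out.println("edges: " + edges);
        System.out.println("partitions: " + partition(edges, 2));
    }
}
```

A production tool would run these steps as distributed jobs and would also compress and serialize each partition; this sketch only shows the shape of the data flow.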

This innovative tool is written in Java, making it possible for a Java programmer to build an internet-scale graph for PageRank in about 100 lines of code and a Wikipedia-sized graph for LDA (latent Dirichlet allocation) in about 130. The idea stemmed from earlier work by researchers at the University of Washington in Seattle, who developed GraphLab, a framework designed specifically for graph-based parallel machine learning. In many cases, GraphLab can process such graphs 20-50X faster than Hadoop MapReduce. However, it soon became clear that there was no credible solution for constructing large-scale graphs that frameworks like GraphLab could digest. GraphBuilder grew out of the resulting research.
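To show what a downstream framework does with such a graph, the following is a minimal in-memory PageRank sketch in plain Java. It is not GraphBuilder or GraphLab code; the class name `PageRankSketch`, the damping factor of 0.85, and the fixed iteration count are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of GraphBuilder or GraphLab.
final class PageRankSketch {

    /**
     * Runs a fixed number of PageRank iterations over an adjacency list.
     * Rank held by vertices with no out-links is spread evenly over all
     * vertices, so the ranks keep summing to 1.
     */
    static Map<String, Double> pageRank(
            Map<String, List<String>> outLinks, double damping, int iterations) {
        // Collect every vertex, including ones that only appear as targets.
        Set<String> vertices = new TreeSet<>(outLinks.keySet());
        for (List<String> targets : outLinks.values()) {
            vertices.addAll(targets);
        }
        int n = vertices.size();

        Map<String, Double> rank = new HashMap<>();
        for (String v : vertices) {
            rank.put(v, 1.0 / n);
        }

        for (int i = 0; i < iterations; i++) {
            // Total rank currently sitting on vertices without out-links.
            double dangling = 0.0;
            for (String v : vertices) {
                if (outLinks.getOrDefault(v, Collections.emptyList()).isEmpty()) {
                    dangling += rank.get(v);
                }
            }
            // Every vertex starts with the teleport share plus its
            // share of the dangling rank.
            double base = (1.0 - damping) / n + damping * dangling / n;
            Map<String, Double> next = new HashMap<>();
            for (String v : vertices) {
                next.put(v, base);
            }
            // Each vertex splits its damped rank evenly across its out-links.
            for (String v : vertices) {
                List<String> targets = outLinks.getOrDefault(v, Collections.emptyList());
                for (String t : targets) {
                    next.merge(t, damping * rank.get(v) / targets.size(), Double::sum);
                }
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = new HashMap<>();
        links.put("A", Arrays.asList("B"));
        links.put("B", Arrays.asList("A"));
        links.put("C", Arrays.asList("A"));
        System.out.println(pageRank(links, 0.85, 100));
    }
}
```

The algorithm itself is short; at internet scale the hard part is producing the `outLinks` structure from raw crawl data, which is the construction problem the article describes.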

Intel says GraphBuilder differentiates itself from other products in that it not only constructs large-scale graphs quickly but also offloads many of the complexities of graph construction, including graph formation, cleaning, compression, partitioning, and serialization. Licensed under the Apache 2.0 license, GraphBuilder is currently in beta and can be downloaded from its website.

Sudheer Raju


Founder of ToolsJournal, a technology journal on software tools and services. Sudheer has overall accountability for the website's product development and is responsible for Sales and Marketing. With a flair for writing, Sudheer himself writes for ToolsJournal across all journal categories.

