Thursday, January 17, 2013

Why the most important part of Facebook Graph Search is "Graph"

You've heard the news: Facebook has announced a new offering called Graph Search. Use of graph technologies has been growing over the last few years, and there’s been quite the buzz around graph databases of late.

We believe that Graph Search is part of a trend that is much bigger than Facebook, and more widespread than search. Facebook is tapping into a fundamentally new way to exploit the information that exists in all the world’s databases. In this post, we will look at the Facebook announcement from a different angle, that of connected data: a growing trend that is on the verge of changing how companies large and small understand their data.

Graphs and Search: a bit of history

Web search and graphs have a long history. Throughout most of the 1990s, the technology behind web search was based on “atomic data”: it indexed each page and ranked it in isolation, based solely on its contents, and without any reference to other pages. 

But in 1998, a small startup called Google launched with a new graph-centric approach called PageRank, developed by co-founders Larry Page and Sergey Brin. PageRank changed the fundamentals of web search, and catapulted Google ahead of its competitors, who to this day have not caught up. What was novel about this algorithm is that instead of ranking pages in isolation, it achieved markedly better results by taking into account how the pages are connected.
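To illustrate what ranking by connections means, here is a minimal Python sketch of the textbook PageRank iteration. It is a simplified illustration, not Google's production algorithm: the damping factor of 0.85, the fixed iteration count, and the toy link data are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by how they are linked, not by their contents.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "hub", which links back to "a".
toy_web = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["a"],
}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
print(best, round(ranks[best], 3))
```

In this toy data the page called "hub" comes out on top purely because other pages point to it. Page "a" outranks "b" and "c" because the highly ranked hub links to it. No page's contents are examined at all.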

Connected Data as a New Source of Insight

In his keynote at last year’s GraphConnect Conference in San Francisco, social researcher James Fowler (author of the book “Connected”) shared his latest research findings, which indicate that one can learn more about someone by knowing how they interact with the people and things around them than by learning discrete facts about that person. The difference between insights gained from atomic data and the intelligence that can be discovered from connected data is vast, and calls for specialized technologies that are designed to exploit connectedness.

How Does Graph Search Work?

Graphs are inherently visual. It’s not so difficult to understand how the technology works, even if you’re not that technical. Let’s take one of Facebook’s example Graph Search queries: find all of the sushi restaurants in New York that my friends like. Below is an illustration of what the underlying graph looks like:



The data stored inside of the graph database looks exactly like the drawing. Getting the answer is a very simple matter for a graph database. You just need to formulate the question in a way that the database understands. Those who are more technically inclined can see below an example of the query that answers the question:

Sushi restaurants in New York that my friends like


// Look up the three starting nodes by index
START me=node:person(name = 'Philip'),
      location=node:location(location='New York'),
      cuisine=node:cuisine(cuisine='Sushi')

// Restaurants my friends like that are in the location and serve the cuisine
MATCH (me)-[:IS_FRIEND_OF]->(friend)-[:LIKES]->(restaurant)
      -[:LOCATED_IN]->(location),(restaurant)-[:SERVES]->(cuisine)

RETURN restaurant

Cypher Query Language Example: Sushi restaurants in New York that my friends like
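
For readers without a graph database at hand, the same pattern can be sketched in plain Python over a small in-memory graph. This is only an illustration of the traversal logic, not how Neo4j or Facebook implements it. The friends, restaurants, and helper function names below are hypothetical sample values; only the relationship types and "Philip" come from the query above.

```python
# Each relationship is a (start, type, end) triple, mirroring the drawing.
# All people and restaurants other than "Philip" are hypothetical sample data.
relationships = [
    ("Philip", "IS_FRIEND_OF", "Alice"),
    ("Philip", "IS_FRIEND_OF", "Bob"),
    ("Alice", "LIKES", "Tokyo Roll"),
    ("Bob", "LIKES", "Bay Sushi"),
    ("Bob", "LIKES", "Pasta Palace"),
    ("Tokyo Roll", "LOCATED_IN", "New York"),
    ("Bay Sushi", "LOCATED_IN", "San Francisco"),
    ("Pasta Palace", "LOCATED_IN", "New York"),
    ("Tokyo Roll", "SERVES", "Sushi"),
    ("Bay Sushi", "SERVES", "Sushi"),
    ("Pasta Palace", "SERVES", "Italian"),
]


def neighbors(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return {end for start, rel, end in relationships
            if start == node and rel == rel_type}


def restaurants_friends_like(me, location, cuisine):
    """Mirror the MATCH pattern: me -> friend -> restaurant -> location and cuisine."""
    results = set()
    for friend in neighbors(me, "IS_FRIEND_OF"):
        for restaurant in neighbors(friend, "LIKES"):
            if (location in neighbors(restaurant, "LOCATED_IN")
                    and cuisine in neighbors(restaurant, "SERVES")):
                results.add(restaurant)
    return sorted(results)


matches = restaurants_friends_like("Philip", "New York", "Sushi")
print(matches)
```

The sketch scans a flat list on every hop, which is fine for a dozen triples. A graph database is designed so that following a relationship does not require scanning unrelated data.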


Other Applications for Graphs

Thinking in graphs is natural, and contagious. The more you think in terms of connections, the more you realize that graphs are the way that we implicitly think. What is a decision tree, for example, but a graph of possibilities? The more you look, the more you start to notice that graphs are, in fact, everywhere.

Graph database users regularly use queries like the one above to answer questions, and the more you ask, the more new questions occur to you that you had never thought to ask. Graph queries can get quite elaborate: it’s entirely possible to run queries that reach across a social network to friends who are two, three, or more levels away.
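
To make the idea of "levels of friends" concrete, here is a minimal Python sketch that uses breadth-first search to find everyone within a given number of hops. The friendship data and the function name are hypothetical, and a graph database would express this as a query rather than hand-written code.

```python
from collections import deque

# Hypothetical friendship data; each person maps to their direct friends.
friends = {
    "Philip": ["Alice", "Bob"],
    "Alice": ["Carol"],
    "Bob": ["Carol", "Dave"],
    "Carol": ["Erin"],
    "Dave": [],
    "Erin": [],
}


def people_within(start, max_depth):
    """Breadth-first search returning each reachable person's distance from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_depth:
            continue  # Do not expand beyond the requested depth.
        for friend in friends.get(person, []):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]  # The starting person is not part of the result.
    return distance


network = people_within("Philip", 2)
print(network)
```

Raising `max_depth` from 2 to 3 pulls in friends of friends of friends (here, "Erin") without any change to the rest of the code.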

Opportunities for leveraging connected data extend far beyond social and search. The pattern that applies to Graph Search is also applicable to bioinformatics, fraud detection, network management, logistics, and a variety of other use cases. Neo Technology has customers in all these areas (and more!) using the Neo4j graph database to achieve new and higher levels of insight.

I’m not Facebook... How can I get this?

Technology giants such as Facebook, Google, and Twitter have all built graph technologies from the ground up to differentiate and grow their business. Building and maintaining one’s own database management system, however, is not a practical solution if you’re not Facebook.

The good news is that companies wanting functionality like Graph Search are a click away from getting the tools they need to build it. At its core, Graph Search is a database. Unlike a decade ago, one can now find commercial off-the-shelf graph databases that are proven and robust, and built from the ground up to support connected data. 

Neo4j is the most widely used graph database today. Companies have adopted it because it can be up to 1000 times faster than relational databases for working with connected data, and much easier to work with than shoehorning graphs into tables.

Neo4j is freely available as open source software, with a Community Edition available under the same open source license as MySQL, and an Enterprise Edition. Commercial subscriptions are available from Neo4j creator and sponsor Neo Technology.

Commercial users include Cisco, Adobe, Deutsche Telekom, Accenture, and many more; as well as lots of startups, including FiftyThree (makers of Paper, winner of Apple’s 2012 iPad App of the Year), Seth Godin’s Squidoo, and Justdial (one of India’s most talked-about startups).

As we move into an era where more and more companies are benefiting from understanding connected data, having the right tools available to anyone means that no one needs to get left behind. Neo4j is available for download today. Give it a try, or check out the interactive Cypher web console, to try out the Cypher graph query language immediately from your web browser.

Emil Eifrem and Philip Rathle, co-authors


1 comment:

mariewallace said...

Great post guys... I would add one additional point in response to your question "I'm not Facebook... how can I get this?".

There are two sides to getting a great Graph Search (or any graph analytics, such as social recommendations): (a) the database (which you eloquently describe), and (b) the data.

On the data side of things, I would argue that enterprise applications have data which is way more interesting for graph analytics than Facebook does. Why?

Firstly, Facebook's graph is purely a Social Graph and doesn't appear to integrate the Knowledge Graph (which is critical for good analytics).

Secondly, the quality of Facebook's data is questionable. Take the "Likes" edge as an example; in many cases this is purchased by companies and therefore is not really a signal for how much someone likes a company, but rather a signal for how much they liked the incentive they were given for clicking on the Like. This muddies any analytics applied to this data.

The enterprise has a much richer data set spread across their business and social applications, which could give them an amazing business graph on which to base their search and analytics.