Tuesday, January 31, 2012

Spring Data Neo4j Book Release: Good Relationships

Good Relationships, the Spring Data Neo4j Guide Book, is available now for download from InfoQ. Go get it and read all the details about becoming productive with Spring Data Neo4j.

But first, allow me a few words. Like any of you, I'd generally rather be writing code than documentation.  Getting through an entire book would've been impossible without the help of many fine people both prodding and contributing. And now that this duplex book has been bound into a cover, I'm very pleased with it.

Here's why you should stop reading this blog and go get Good Relationships:

Cineasts Tutorial

The book opens with a narrative tutorial about creating Cineasts.net, a full social web application for movie enthusiasts. From inspiration to complete application, we follow the normal progression of application development, introducing Neo4j concepts coupled with new application features. The result is a powerful demonstration of the possibilities enabled by Spring Data Neo4j.

Spring Data Neo4j Reference

The second part of the book provides a thorough reference to the graph facilities available in Spring Data Neo4j. It covers core graph concepts, querying and the simple annotated POJO programming model familiar to Spring developers.
There are two alternative facilities you'll use in development. One is Neo4jTemplate, which offers the convenient API of Spring templates for working with the Neo4j graph database. The other is entity repositories, which build upon the Neo4jTemplate infrastructure to perform CRUD and advanced query operations.

Spring Data Neo4j lets you extend your existing techniques of working with annotations and object mapping, now with the capabilities of Neo4j's high performance graph database.

OK, go get a copy of the book and let me know what you think.

Cheers,
Michael

Michael Hunger is the project lead of Spring Data Neo4j and the author of "Good Relationships: The Spring Data Neo4j Guide Book". As a developer he loves to work with many aspects of programming languages, learning new things every day, participating in exciting and ambitious open source projects and contributing to different programming related books. Michael is also an active editor and interviewer at InfoQ.

Thursday, January 26, 2012

We won the Rapidus award!

I was running late - meeting across time zones is a hassle. Standing in the street I could
hear the heavy rock music from the night club. Was this really the place for a big media
event in Malmö? Stepping into the dark it felt totally right though. More than 150 people
had dressed down to participate in the mingle and awards that night. Rock away!

Rapidus is an online newsletter here in Sweden. Each year they give an award to an up-and-coming company in the Malmö region. This prize might not be too well known around the globe, but the list of winners includes the likes of QlikTech (NASDAQ: QLIK). Their database report application, QlikView, is used by some 22 000 customers in 100 countries - impressive. Another example is TAT, a high-profile interface designer for mobile devices. If you have Android, check the clock widget: in the very clean face you can make out the name "Malmo". They were bought by RIM (BlackBerry) in 2010. More examples: Cellavision, Jolife, IO Interactive and Illusion Labs.
Just being nominated in this list of companies was a very pleasant surprise. But still, I didn't want to get my hopes up. It wasn't until I heard the presenter say the magic words "graph db" that I could let go of one big smile.

The jury's statement:
“...to the excellence and tireless work have developed a product that brought not only
international admiration but also already laid the foundation for commercial success. The
company's graph-based database has attracted both reputed clients and new, large investors, in
bright stiff competition with the world giants in the field. Neo Technology rocks throughout their
world, all the way from London to San Francisco and back again.”

Or in other words from one of the jury members - Björn H. Lindbäck:
“Neo Technology has maxed all our criteria on the level of innovation and business
model. And they have an economic platform of savvy venture capitalists who give us QlikTech
vibes.”

Joy.

Figure: We're the guys with flowers and smiles, but no guitars. Johan Svensson CTO and Björn Granvik Director of Engineering. 

To the right in the image above you can barely make out the quote from Werner Vogels, the CTO of Amazon:
“For anything with multiple relationships, multiple connections, Neo4j absolutely ROCKS!”
Very fitting words, given the nightclub and the hard rock setup.

Johan did a splendid thank-you-speech with the following punchline aimed at the entrepreneurs in the audience:
“We've been working on this for 10 years. Don't give up. Be persistent!”
As a graphista and old time data monkey, I could appreciate that last pun on persistence.

Emil Eifrém, our CEO, participated via link from Silicon Valley. To the audience's laughter he managed both to compare the Rapidus jury and their secrecy with the current state of Swedish politics (you have to live here) and to squeeze in infotainment around our product Neo4j. Way to go Emil!

Figure: An ecstatic CTO. I know that we both felt pure joy over the award. Johan's just better than me at hiding this.


In the mingling afterwards I got a chance to talk to one of the jury members:
“I've been following you for a month on Twitter - it's incredible what programmers say about you!”

What can I say: Community matters!

Thanks everyone!
Thank you Rapidus - we're happy and still surprised.

Thank you all Neoites!
Keep up the great work.

Björn Granvik
Director of Engineering @ Neo Technology

Wednesday, January 25, 2012

Released Neo4j 1.6 GA “Jörn Kniv”!

Three milestones later and we’re proud and happy to announce the release of Neo4j 1.6 GA.

We are excited about a host of great new features, all ready to be used. Let's get to it.

Highlights

What features have been included in this release?
  • Cloud - Public beta on Heroku of the Neo4j Add-on
  • Cypher - Supports older Cypher versions, better pattern matching, better performance, improved api
  • Web admin - Full Neo4j Shell commands, including versioned Cypher syntax.
  • Kernel - Improvements, for instance the ability to ensure that key-value pairs for entities are unique.
  • Lucene upgrade - Now version 3.5.

Also, there have been many improvements behind-the-scenes:
  • Infrastructure - Our library repositories have moved to Amazon, providing significantly faster download times.
  • Quality - High availability features better logging and operational support.
  • Process - Better handling of breaking changes in our API and how we handle deprecated features.

If you want more info on all of this - sure you do - please keep reading. Here is a run down of the major new features in Neo4j 1.6.

Heroku Public Beta

The public beta of the Neo4j Add-on for Heroku is available. We're taking a careful approach with our cloud services, evaluating the best supporting infrastructure and user experience in preparation for a general release in the coming months. Already, we've been pleased with the positive response.

Documentation on how to get started with the Heroku Neo4j Add-on can be found at the Heroku DevCenter. We’ll be posting additional guides for getting started on Heroku with Neo4j.

For pioneering adopters, we welcome you to join our Neo4j Heroku Challenge. You can win fabulous prizes while proudly blazing a path into the cloud for our community.

Latest on Cypher

Most of the work in Cypher for this release has been internal changes that are not immediately visible to an end user. The type system has been rebuilt and revamped, and a second, simpler, pattern matcher has been added. The first change makes the Cypher code base faster to work with, and the second makes your queries faster.

End user facing changes include: possibility to get all shortest paths, the COALESCE function, column aliasing, and the possibility for variable length relationships to introduce an iterable of the relationships.

Moreover, array properties have been supported in Neo4j for a long time, but until now it wasn’t possible to query on them. This release makes it possible to filter on array properties in Cypher. We have also improved aggregation performance.

Finally, there are two breaking changes - the syntax for the ALL/NONE/ANY/SINGLE predicates has changed, and the ExecutionResult is now a read-once, forward only iterable.
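
In practice, a read-once, forward-only result can be traversed a single time, so code that needs the rows twice has to copy them first. Here is a minimal sketch of that idea, using a plain java.util.Iterator as a stand-in for the query result. The class and method names are made up for illustration and are not part of the Cypher API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ReadOnceDemo {
    // Drains whatever is left in a forward-only source into a list,
    // which can then be traversed as often as needed.
    static <T> List<T> materialize(Iterator<T> rows) {
        List<T> copy = new ArrayList<>();
        while (rows.hasNext()) {
            copy.add(rows.next());
        }
        return copy;
    }

    public static void main(String[] args) {
        // Stand-in for a query result: it can only move forward.
        Iterator<String> rows = Arrays.asList("Alice", "Bob").iterator();

        List<String> kept = materialize(rows);
        System.out.println("rows kept: " + kept.size());

        // The source is now exhausted; a second pass over it sees nothing.
        System.out.println("source has more rows: " + rows.hasNext());
    }
}
```

If you only need one pass, iterate the result directly and skip the copy.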

New to Cypher? Then you should watch this updated "Introduction to Cypher" screencast by Alistair Jones:



New on the web admin
I’m quite happy to announce that the web admin interface has initial support for Cypher calls directly in the data browser. It’s so sweet to be able to query your way around the node space! And the Cypher console now supports full Neo4j Shell commands.
Moreover, Gremlin has been updated to version 1.4, with major improvements and bug fixes.

Kernel changes

This release includes a popular feature request: the ability to ensure that key-value pairs for entities are unique!

If you look up entities (nodes or relationships) using an external key, you’ll want exactly one entity to correspond to each value of the key.  For example, if you have nodes representing people, and you look these up using Social Security Number (SSN), you’ll want exactly one node for each SSN.  This is easily achieved if you load all your data sequentially, because you can add a new node each time you meet a value of the key (a new SSN).  However, up to now, it has been awkward to maintain this uniqueness when multiple processes are adding data simultaneously (via web requests for example).

Since this is a common use-case, we’ve improved the API to make it easy to enforce entity uniqueness for a given key-value pair. At the index level, we’ve added a new method putIfAbsent which ensures that only one entity will be indexed for the key-value pair, even if lots of threads are using the same key-value pair at the same time. Alternatively, if you’d prefer to work with nodes or relationships rather than with the underlying indexes, there’s a higher level API provided by UniqueFactory. This makes it easy to retrieve an entity using get-or-create semantics, i.e. it returns a matching entity if one exists, otherwise it creates one. Again, this mechanism is thread-safe, so it doesn’t matter how many threads call getOrCreate simultaneously, only one entity will be created for each key-value pair. This functionality is also exposed through the REST API, via a ?unique query parameter.
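
To get a feel for get-or-create semantics without a database at hand, here is a small analogy built only on the JDK's ConcurrentHashMap.putIfAbsent. It is not the Neo4j index or UniqueFactory API; the UniquePersonRegistry class, its Person type and the sample SSNs are hypothetical names chosen for this sketch. It shows that many concurrent callers asking for the same key end up sharing one stored entity.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

final class UniquePersonRegistry {
    // Hypothetical entity, looked up by an external key (the SSN).
    static final class Person {
        final String ssn;
        Person(String ssn) { this.ssn = ssn; }
    }

    private final ConcurrentMap<String, Person> bySsn = new ConcurrentHashMap<>();
    // Counts how many entities actually ended up stored.
    final AtomicInteger stored = new AtomicInteger();

    // Returns the entity for this key, creating and storing it if none exists yet.
    Person getOrCreate(String ssn) {
        Person existing = bySsn.get(ssn);
        if (existing != null) {
            return existing;
        }
        Person candidate = new Person(ssn);
        // putIfAbsent is atomic: exactly one caller per key sees null here.
        Person previous = bySsn.putIfAbsent(ssn, candidate);
        if (previous == null) {
            stored.incrementAndGet();
            return candidate;
        }
        // Another thread won the race; discard our candidate and use theirs.
        return previous;
    }

    public static void main(String[] args) {
        UniquePersonRegistry registry = new UniquePersonRegistry();
        // Many concurrent callers ask for the same key.
        IntStream.range(0, 1000).parallel()
                 .forEach(i -> registry.getOrCreate("123-45-6789"));
        System.out.println("entities stored: " + registry.stored.get());
    }
}
```

Losing callers may build a candidate that is thrown away, which is harmless for a plain object. Creating a real node is not free, so in Neo4j you would rely on the built-in mechanism rather than hand-rolling this.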

Lucene upgrade

Neo4j uses Apache Lucene as the default implementation for its indexing features - this allows you to find “entry points” into the graph before starting graph-based queries.  Lucene is an actively developed project in its own right, and is constantly being enhanced and improved.  In this Neo4j release, we’re taking the opportunity to upgrade to a newer stable release of Apache Lucene, so that all users get the benefits of recent enhancements in Lucene.  We’ve moved to Lucene 3.5; for details on all the changes, have a look at their changelog.

Breaking changes and deprecating

We’re introducing a new way to handle breaking changes. They will be flagged in the change logs as “BREAKING CHANGE.”

Where we do introduce a breaking change, we will continue to support the older functionality for 2 GA releases. This would typically be six months heads up and will allow you to adopt new GA releases quickly while giving plenty of time to develop against the new API. This policy applies to published and stable APIs, including Cypher.

In the same vein, we now have a deprecated feature: Cypher execution is now part of the core REST API, so the Cypher plugin is deprecated.

This policy does not cover third-party add-ons (like Gremlin from Tinkerpop) which have their own release strategy.

Looking Forward

Community member Pablo Pareja Tobes organized a poll around feature requests, which really helps us prioritize our development focus. Thanks to everyone who made their voice heard!

Here are the results:
  • Filter relationships natively by their name (supernodes issue): 20
  • Sharding and horizontal scalability: 24
  • Mandatory node types: 6
  • Node insertion with checking of unique external (get_or_create): 17
  • N-ary relationships: 12
Let's consider each of these features more closely.

Sharding

The write-scaling complement to high-availability, sharding distributes a graph across multiple machines in a cluster. We (and many others) have researched the general graph sharding problem for years. This year, we're embarking upon a pragmatic approach to sharding, providing the benefit without obsessing about academic perfection.

Supernodes

In Twitter-culture, you'd call these the "Ashton Kutcher" nodes, the nodes in a graph with an extreme number of connections. We've been working on a branch that has a promising approach for mitigating the performance challenge of traversing these supernodes.

Node types

In Neo4j, there is no schema, only structure. Relationships indicate the effective type of the connected Nodes, and Indexes imply membership in a set. Often, though, it would be helpful to know the designated type of a Node. So, we're considering the appropriate way to introduce just enough schema. If you have any thoughts or desires to share, please chime in on the issue page.

Unique indexing

Indexes provide a quick look-up for sets of Nodes or Relationships. With unique indexes, Neo4j will guarantee that only one Node is mapped to a given key-value pair, providing support for domain-specific identifiers. This new feature is available now with 1.6 GA.

N-ary relationships

Neo4j's property graph model restricts a relationship to connecting two nodes. In some domains, it is useful to consider relationships having multiple end-points. For now, we think this is best solved with domain-specific solutions.

Fixes and details

Of course, this release includes a slew of bug fixes. For details about all the fixes and additions please read the various CHANGES.txt files included in the packaging.

Also, an impressive array of community-contributed development has been included in this release. Thank you all for the good ideas and pull requests - we really appreciate them!

Go for it

Your feedback is of great value and we would love for you to join our community mailing list.
Neo4j 1.6 is ready - download now and get involved!

Björn Granvik et al
Director of Engineering @ Neo Technology

Wednesday, January 18, 2012

Neo4j - Heroku Application Template Challenge

Dear Developer Community,

Today, we challenge you to create the best Heroku-hosted demo or template applications for the Neo4j Add-on.

Every participant will get a Neo4j-Heroku t-shirt and awesome prizes will be given to the best contributions.

Throughout the next month you have the chance to provide others with ready-made applications that are educational, tested and working well. At the same time we would love to get some feedback for the Neo4j Add-on.

You are free to choose the type of application, programming language, frameworks and Neo4j-driver that you would want to work with.

To leverage Neo4j on Heroku, please add the Neo4j add-on to your application.

Please register your application after (or also during) the development at the Gensen - Heroku Template Repository.

The Gensen Repository is also running on Neo4j and will hopefully be a popular sharing place for Heroku templates for lots of different domains, use-cases, languages and add-ons in the future.

Gensen will be the place where we will look for contributions, and where others can rate and comment on your contribution.

The challenge will use the Gensen rating system and Twitter promotions to crowdsource the prize winning contributions. After the end of the challenge on Feb 13, a guest panel will review the top  applications to identify the winners.

Check out the Official Neo4j Challenge website for all the details.

Looking forward to all your contributions,

The Neo4j Community Team

Peter, Andreas, Michael

Friday, January 13, 2012

Spring Data Neo4j Webinar Follow Up

Hey everyone,

This week, we had a great turnout for our Intro to Spring Data Neo4j webinar, presented by Michael Hunger.


As promised, here are the rest of the questions that we weren't able to cover during the session:

What's the difference between @RelatedTo and @RelatedToVia?
  • @RelatedTo refers to the node-entities at the other end of the relationship
  • @RelatedToVia refers to relationships themselves (as relationship-entities)

Is there any facility for "supernode" relationship navigation?
I.e., if a Movie has 1M Ratings but I only care about obtaining the Director, is there an "automatic" IndexedRelationshipExpander?
  • You might add indexing to the relationship-entities to write the index (with @Indexed)
  • You might look into using computed properties with @GraphTraversal and TraversalBuilder or go for template.traverse() or repository.traverse() to retrieve those (and use a custom expander there)

Can I open the same Neo4j database in two different applications using Embedded?
  • No. For embedded, only one JVM can access the store files at a time
  • You would have to make your own mini-server exposing a protocol (TCP, RMI, HTTP, Websockets, JMX) on top of that db to interact with it from other processes
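
As a rough illustration of the mini-server idea, the sketch below puts a tiny HTTP endpoint in front of an in-process store, using the HTTP server bundled with the JDK. Everything here is an assumption made for the example: the MiniGraphServer class, the /node/ path and the in-memory map that stands in for the single embedded database owned by this JVM.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class MiniGraphServer {
    private static final String PREFIX = "/node/";

    // Stand-in for the embedded database; only this JVM touches it directly.
    private final Map<String, String> store = new ConcurrentHashMap<>();
    private HttpServer server;

    void put(String key, String value) { store.put(key, value); }

    // Plain in-process lookup; returns null for an unknown key.
    String lookup(String key) { return store.get(key); }

    // Starts listening on the loopback interface. Port 0 lets the OS pick a free port.
    int start(int port) {
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext(PREFIX, exchange -> {
            String path = exchange.getRequestURI().getPath();
            String key = path.startsWith(PREFIX) ? path.substring(PREFIX.length()) : "";
            String value = lookup(key);
            if (value == null) {
                exchange.sendResponseHeaders(404, -1); // -1 means no response body
            } else {
                byte[] body = value.getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            }
            exchange.close();
        });
        server.start();
        return server.getAddress().getPort();
    }

    void stop() {
        if (server != null) {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        MiniGraphServer mini = new MiniGraphServer();
        mini.put("1", "The Matrix");
        int port = mini.start(0);
        System.out.println("Other processes can now GET http://127.0.0.1:" + port + "/node/1");
    }
}
```

Other applications then talk to this one process over HTTP instead of opening the store files themselves.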

For more info on Spring Data Neo4j check out:

Introduction to Spring Data Neo4j
Presentation Slides
Presented by Michael Hunger
Recorded at SpringOne2GX

Developer Notes

Thursday, January 12, 2012

Neo4j 1.6.M03 “Jörn Kniv”

Another milestone is waiting for you - Neo4j 1.6.M03. Highlights in this release are: support for indexing unique entities, array queries in Cypher, and a Lucene update to version 3.5. It’s now available for download, and you can try it out right now on Heroku. Enjoy!

Kernel changes
Rickard Öberg
This release includes a popular feature request: the ability to ensure that key-value pairs for entities are unique!

If you look up entities (nodes or relationships) using an external key, you’ll want exactly one entity to correspond to each value of the key.  For example, if you have nodes representing people, and you look these up using Social Security Number (SSN), you’ll want exactly one node for each SSN.  This is easily achieved if you load all your data sequentially, because you can add a new node each time you meet a value of the key (a new SSN).  However, up to now, it has been awkward to maintain this uniqueness when multiple processes are adding data simultaneously (via web requests for example).

Since this is a common use-case, we’ve improved the API to make it easy to enforce entity uniqueness for a given key-value pair. At the index level, we’ve added a new method putIfAbsent which ensures that only one entity will be indexed for the key-value pair, even if lots of threads are using the same key-value pair at the same time. Alternatively, if you’d prefer to work with nodes or relationships rather than with the underlying indexes, there’s a higher level API provided by UniqueFactory. This makes it easy to retrieve an entity using get-or-create semantics, i.e. it returns a matching entity if one exists, otherwise it creates one. Again, this mechanism is thread-safe, so it doesn’t matter how many threads call getOrCreate simultaneously, only one entity will be created for each key-value pair. This functionality is also exposed through the REST API, via a ?unique query parameter.
Alistair Jones and Julian Simpson

Cypher
Array properties have been supported in Neo4j for a long time, but until now it wasn’t possible to query on them. This milestone makes it possible to filter on array properties in Cypher. We have also improved aggregation performance.

Lucene upgrade
Neo4j uses Apache Lucene for its indexing features - this allows you to find “entry points” into the graph before starting graph-based queries.  Lucene is an actively developed project in its own right, and is constantly being enhanced and improved.  In this Neo4j release, we’re taking the opportunity to upgrade to a newer stable release of Apache Lucene, so that all users get the benefits of recent enhancements in Lucene.  We’ve moved to Lucene 3.5; for details on all the changes, have a look at their changelog.

Bug fixes
This milestone includes a number of bug fixes. For more information please have a look at the “CHANGES.txt” files in each of our release packages.

Go for it
It’s available as a hosted cloud service on Heroku right now, or download the packages from our website. If you’d like to stay up to date with new features in the pipeline, and share your experience with other users, we would love for you to join our community mailing list.

Rickard Öberg, Alistair Jones and Julian Simpson
Developers @ Neo Technology

Thursday, January 5, 2012

Spring onto Heroku

Andreas Kollegger
Deploying your application into the cloud is a great way to scale from "wouldn't it be cool if.." to giving interviews to Forbes, Fast Company, and Jimmy Fallon. Heroku makes it super easy to provision everything you need, including a Neo4j Add-on. With a few simple adjustments, your Spring Data Neo4j application is ready to take that first step into the cloud.
Let's walk through the process, assuming this scenario:
  • you have an account on Heroku, and have installed the heroku tool
  • git is your friend 
  • you've developed a killer Spring MVC application 
  • and of course you're using Spring Data Neo4j 
Ready? OK, first let's look at your application.

Create a Self-Hosted Web Application 
Usually, a Spring MVC application is bundled into a war and deployed to an application server like Tomcat. But Heroku can host any kind of Java application; it just needs to know what to launch. So, we'll transform the war into a self-hosted servlet using an embedded Jetty server, then add a start-up script to launch it.
First, add the dependencies for Jetty to the pom.xml:


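<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-webapp</artifactId>
  <version>7.4.4.v20110707</version>
</dependency>
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jsp-2.1-glassfish</artifactId>
  <version>2.1.v20100127</version>
</dependency>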

Then change the scope of the servlet-api artifact from provided to compile. This library is normally provided at runtime by the application container. Since we're self-hosting, it needs to be included directly. Make sure the servlet-api dependency looks like this:

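<dependency>
  <groupId>javax.servlet</groupId>
  <artifactId>servlet-api</artifactId>
  <version>2.5</version>
  <scope>compile</scope>
</dependency>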

We could provide a complicated command-line to Heroku to launch the app. Instead, we'll simplify the command-line by using the appassembler-maven-plugin to create a launch script. Add the plugin to your pom's build/plugins section:

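<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>appassembler-maven-plugin</artifactId>
  <version>1.1.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>assemble</goal></goals>
      <configuration>
        <assembleDirectory>target</assembleDirectory>
        <extraJvmArguments>-Xmx512m</extraJvmArguments>
        <programs>
          <program>
            <mainClass>Main</mainClass>
            <name>webapp</name>
          </program>
        </programs>
      </configuration>
    </execution>
  </executions>
</plugin>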

Finally, switch the packaging from war to jar. That's it for the pom. Now that the application is ready to be self-hosted, create a simple Main to bootstrap Jetty and host the servlet. The pom snippet above assumes a src/main/java/Main.java that looks like:
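import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.webapp.WebAppContext;

public class Main {
  public static void main(String[] args) throws Exception {
    String webappDirLocation = "src/main/webapp/";

    // Heroku hands out the port through the PORT environment variable; fall back to 8080 locally.
    String webPort = System.getenv("PORT");
    if (webPort == null || webPort.isEmpty()) {
      webPort = "8080";
    }
    Server server = new Server(Integer.parseInt(webPort));

    // Serve the web application straight from the source tree, mounted at the root context.
    WebAppContext root = new WebAppContext();
    root.setContextPath("/");
    root.setDescriptor(webappDirLocation + "WEB-INF/web.xml");
    root.setResourceBase(webappDirLocation);
    // Use standard parent-first class loading so classes on the application classpath win.
    root.setParentLoaderPriority(true);

    server.setHandler(root);
    server.start();
    // Block the main thread until the server stops.
    server.join();
  }
}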
Notice the use of environment variable PORT for discovering which port to use. Heroku and the Neo4j Add-on use a number of environment variables to configure the application.
Next, we'll modify the Spring application context to use the Neo4j variables for specifying the connection to Neo4j itself.
For example, if you have src/main/resources/META-INF/spring/applicationContext-graph.xml, then modify it to look like this:

  
    
    
    

Before provisioning at Heroku, test the application locally. First make sure you've got Neo4j server running at http://localhost:7474, using default configuration. Then set the following environment variables (here, assuming bash):
export NEO4J_REST_URL=http://localhost:7474/db/data
export NEO4J_LOGIN=""
export NEO4J_PASSWORD=""
Now you can launch the app by running sh target/bin/webapp. If all went well, your application will be available just as if it were in a Tomcat container.
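
If you want to check what the application will see before launching it, the lookup of those three variables can be sketched in plain Java. This is only an illustration, assuming local defaults that match the exports above; the Neo4jEnvSettings class is a hypothetical helper and not part of Spring Data Neo4j or the Heroku tooling.

```java
import java.util.Map;

final class Neo4jEnvSettings {
    final String restUrl;
    final String login;
    final String password;

    private Neo4jEnvSettings(String restUrl, String login, String password) {
        this.restUrl = restUrl;
        this.login = login;
        this.password = password;
    }

    // Reads the three connection variables, falling back to a local default server
    // with no credentials when a variable is missing or empty.
    static Neo4jEnvSettings from(Map<String, String> env) {
        return new Neo4jEnvSettings(
            valueOrDefault(env.get("NEO4J_REST_URL"), "http://localhost:7474/db/data"),
            valueOrDefault(env.get("NEO4J_LOGIN"), ""),
            valueOrDefault(env.get("NEO4J_PASSWORD"), ""));
    }

    private static String valueOrDefault(String value, String fallback) {
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        Neo4jEnvSettings settings = from(System.getenv());
        String who = settings.login.isEmpty() ? "without credentials" : "as " + settings.login;
        System.out.println("Would connect to " + settings.restUrl + " " + who);
    }
}
```

Passing the environment in as a Map keeps the lookup easy to test with made-up values.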

Deploy to Heroku 
With a self-hosted application ready, deploying to Heroku needs a few more steps. First, create a Procfile at the top level of the project. It should contain a single line naming the process type and the command that launches the application:
web: sh target/bin/webapp
Then use git to magically deploy to Heroku:
# Initialize a local git repository, adding all the project files
git init
git add .
git commit -m "initial commit"
# Provision a Heroku stack, add the Neo4j Add-on and deploy the application

heroku create --stack cedar
heroku addons:add neo4j
git push heroku master
Note that the heroku stack must be "cedar" to support running Java. Check that the process is running by using heroku ps, which should show a "web.1" process in the "up" state.
Success! Your application is now live in the cloud.
To see the Neo4j graph you just created through Heroku, use heroku config to reveal the NEO4J_URL environment variable, which will take you to Neo4j's Webadmin.

Summary
All that typing really comes down to three steps:
  1. make your application self-hosting
  2. access Heroku environment variables for configuration
  3. deploy to the cloud
Now go upgrade your application and join us in the cloud. It's always sunny up here.

Cheers,
Andreas