Tuesday, April 13, 2010

The Neo4j REST Server - Part1: Get it going!


WARNING: This post is outdated. The Neo4j Server is now documented here and the manual section on REST is the single source for documentation of the REST API.


Introduction

As many have requested, Neo4j finally has its own standalone server mode, based on interaction via REST. The code is still very fresh and not thoroughly tested, but I thought I would write up some first documentation on it, based on the Getting Started with REST wiki page.

Installation

The first version of the distribution can be downloaded from here: zip, tar.gz. After unpacking, go to the unpacked directory and run the following (on OS X/Linux; see the wiki entry for details on Windows):
$ ./bin/neo4j-rest start
This starts the Neo4j REST server on port 9999 and puts the database files under the directory neo4j-rest-db/ (created lazily on the first request). Now, let's point our browser to http://localhost:9999 (not Internet Explorer: it doesn't send any useful Accept headers and will get JSON back, which will be fixed later) and we will see the following:



Things seem to be running! We get the HTML interface because the browser sends Accept: text/html. Setting the Accept header to application/json instead produces:
peterneubauer$ curl -H Accept:application/json -H Content-Type:application/json -v http://localhost:9999
* About to connect() to localhost port 9999 (#0)
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 9999 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (i386-apple-darwin10.2.0) libcurl/7.19.7 zlib/1.2.3
> Host: localhost:9999
> Accept:application/json
> Content-Type:application/json
<  
* Connection #0 to host localhost left intact 
* Closing connection #0 
{ 
  "index":"http://localhost:9999/index",
  "node":"http://localhost:9999/node",
  "reference node":"http://localhost:9999/node/0" 
}
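As a rough illustration of how a client might use this root document, here is a minimal Python sketch (standard library only). It parses the JSON shown above and builds, without sending, a follow-up request to the advertised reference node. The helper name json_get is made up for this example; only the URLs and keys come from the output above.

```python
import json
from urllib.request import Request

def json_get(url):
    """Build (but do not send) a GET request that asks for JSON."""
    return Request(url, headers={"Accept": "application/json"}, method="GET")

# Root document copied from the curl output above.
root_doc = json.loads("""{
  "index": "http://localhost:9999/index",
  "node": "http://localhost:9999/node",
  "reference node": "http://localhost:9999/node/0"
}""")

# Follow the advertised link instead of hard-coding /node/0.
next_request = json_get(root_doc["reference node"])
print(next_request.get_method(), next_request.full_url)
```

Passing next_request to urllib.request.urlopen would actually send it, assuming the server from this post is running locally.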
 

Now, with "200 OK" this is a good starting point. We can see full references to the interesting starting points: the reference node and the index subsystem. Let's check out the reference node:
peterneubauer$ curl -H Accept:application/json -H Content-Type:application/json -v http://localhost:9999/node/0
* About to connect() to localhost port 9999 (#0)
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 9999 (#0)
> GET /node/0 HTTP/1.1
> User-Agent: curl/7.19.7 (i386-apple-darwin10.2.0) libcurl/7.19.7 zlib/1.2.3
> Host: localhost:9999
> Accept:application/json
> Content-Type:application/json
>
{
"incoming typed relationships":"http://localhost:9999/node/0/relationships/in/{-list|&|types}",
"incoming relationships":"http://localhost:9999/node/0/relationships/in",
"all relationships":"http://localhost:9999/node/0/relationships/all",
"create relationship":"http://localhost:9999/node/0/relationships",
"data":{},
"traverse":"http://localhost:9999/node/0/traverse/{returnType}",
"property":"http://localhost:9999/node/0/properties/{key}",
"self":"http://localhost:9999/node/0",
"properties":"http://localhost:9999/node/0/properties",
"all typed relationships":"http://localhost:9999/node/0/relationships/all/{-list|&|types}",
"outgoing typed relationships":"http://localhost:9999/node/0/relationships/out/{-list|&|types}",
"outgoing relationships":"http://localhost:9999/node/0/relationships/out"
}
This gives us some information about what node 0 can do: how to get its relationships and properties, and the syntax for constructing queries to get properties, create relationships and so on.
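The placeholders in these URIs ({key}, {returnType}, {-list|&|types}) have to be filled in by the client. The following Python sketch shows one way this might be done. It assumes that {-list|&|types} means "join the values of types with &", which is how I read the template syntax; the function name expand is invented for this example and is not part of any Neo4j client.

```python
import re
from urllib.parse import quote

def expand(template, **values):
    """Fill in {name} and {-list|sep|name} placeholders in a URI template."""
    def substitute(match):
        expression = match.group(1)
        if expression.startswith("-list|"):
            # Assumed meaning: join the list items with the given separator.
            _, separator, name = expression.split("|", 2)
            return separator.join(quote(str(v), safe="") for v in values[name])
        return quote(str(values[expression]), safe="")
    return re.sub(r"\{([^{}]*)\}", substitute, template)

node = "http://localhost:9999/node/0"
print(expand(node + "/properties/{key}", key="name"))
print(expand(node + "/traverse/{returnType}", returnType="node"))
print(expand(node + "/relationships/out/{-list|&|types}", types=["KNOWS", "LOVES"]))
```

Values are percent-encoded before substitution, so a relationship type such as CODED BY ends up as CODED%20BY in the URI.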

Insert some data

According to RESTful thinking, data creation is handled by POST and updates by PUT. Let's insert a node:
peterneubauer$ curl -X POST -H Accept:application/json -v localhost:9999/node
* About to connect() to localhost port 9999 (#0)
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 9999 (#0)
> POST /node HTTP/1.1
> User-Agent: curl/7.19.7 (i386-apple-darwin10.2.0) libcurl/7.19.7 zlib/1.2.3
> Host: localhost:9999
> Accept:application/json
>
{
...
"self":"http://localhost:9999/node/1",
"data":{},
...
}
This results in a new node with the URL localhost:9999/node/1 (given by the "self" property in the JSON representation) and no properties set ("data":{}). The Neo4j REST API tries to be explicit about possible further destinations, which makes it self-describing even for new users and abstracts away the server instance, so that dealing with multiple Neo4j servers becomes easier in the future. We can see the URIs for traversing and for listing properties and relationships. PUT on properties works the same way as for nodes.
We delete the node again with
curl -X DELETE  -v localhost:9999/node/1

and get 204 - No Content back. The Node is gone and will give a 404 - Not Found if we try to GET it again.
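The create/delete cycle above could be expressed in Python roughly as follows. This is only a sketch using the standard library: it builds the requests without sending them, and the function names are invented for illustration. The status codes it interprets are just the two mentioned in the text.

```python
from urllib.request import Request

BASE = "http://localhost:9999"  # server address used throughout this post

def create_node_request():
    """POST /node with no body creates a node without properties."""
    return Request(BASE + "/node", headers={"Accept": "application/json"}, method="POST")

def delete_node_request(node_uri):
    """DELETE on the node's own URI (its "self" link) removes it."""
    return Request(node_uri, method="DELETE")

def describe_status(status):
    """Interpret the status codes mentioned in the text above."""
    meanings = {204: "No Content: the node was deleted",
                404: "Not Found: the node does not exist"}
    return meanings.get(status, "status not covered in this post")

for request in (create_node_request(), delete_node_request(BASE + "/node/1")):
    print(request.get_method(), request.full_url)
print(describe_status(204))
```

Note that urllib.request.urlopen raises an HTTPError for a 404, so a real client would catch that exception and read its code attribute rather than inspect a normal response.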

The Matrix

Now with properties encoded in JSON we can easily start to create our little Matrix example:



First we create the nodes, POSTing each node's properties as a JSON body to the node URI. (Relationships are created in a similar way, by POSTing the relationship data to the originating node, as shown further down.)
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"name":"Mr. Andersson"}' -v localhost:9999/node
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"name":"Morpheus"}' -v localhost:9999/node
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"name":"Trinity"}' -v localhost:9999/node
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"name":"Cypher"}' -v localhost:9999/node
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"name":"Agent Smith"}' -v localhost:9999/node
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"name":"The Architect"}' -v localhost:9999/node

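We get http://localhost:9999/node/1, http://localhost:9999/node/2, http://localhost:9999/node/3 and so on back as the new URIs. Now we can connect the persons. To create a relationship, we POST to the relationships URI of the originating node and send the relationship data (the target node in "to" and the relationship type in "type") along with the request: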
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"to":"http://localhost:9999/node/1","type":"ROOT"}' -v http://localhost:9999/node/0/relationships
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"to":"http://localhost:9999/node/2","type":"KNOWS"}' -v http://localhost:9999/node/1/relationships
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"to":"http://localhost:9999/node/3","type":"KNOWS"}' -v http://localhost:9999/node/2/relationships
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"to":"http://localhost:9999/node/4","type":"KNOWS"}' -v http://localhost:9999/node/2/relationships
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"to":"http://localhost:9999/node/5","type":"KNOWS"}' -v http://localhost:9999/node/4/relationships
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"to":"http://localhost:9999/node/6","type":"CODED BY"}' -v http://localhost:9999/node/5/relationships
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"to":"http://localhost:9999/node/1","type":"LOVES"}' -v http://localhost:9999/node/3/relationships
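For readers who prefer a script over repeated curl calls, the same Matrix graph could be described in Python along these lines. It is a sketch only: it builds the requests without sending them, the helper names are invented, and it assumes the six nodes receive the ids 1 to 6 in creation order, as the URIs in this post suggest.

```python
import json
from urllib.request import Request

BASE = "http://localhost:9999"
HEADERS = {"Accept": "application/json", "Content-Type": "application/json"}

NAMES = ["Mr. Andersson", "Morpheus", "Trinity", "Cypher", "Agent Smith", "The Architect"]
# (from node id, relationship type, to node id), as in the curl commands above
RELATIONSHIPS = [(0, "ROOT", 1), (1, "KNOWS", 2), (2, "KNOWS", 3), (2, "KNOWS", 4),
                 (4, "KNOWS", 5), (5, "CODED BY", 6), (3, "LOVES", 1)]

def post_json(url, payload):
    """Build a POST request carrying the payload as a JSON body."""
    return Request(url, data=json.dumps(payload).encode("utf-8"), headers=HEADERS, method="POST")

def node_uri(node_id):
    return f"{BASE}/node/{node_id}"

node_requests = [post_json(BASE + "/node", {"name": name}) for name in NAMES]
relationship_requests = [
    post_json(node_uri(start) + "/relationships", {"to": node_uri(end), "type": rel_type})
    for start, rel_type, end in RELATIONSHIPS
]

for request in relationship_requests:
    print(request.full_url, request.data.decode("utf-8"))
```

A more robust client would read the "self" URI from each creation response instead of assuming the ids, since the server decides which id a new node gets.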

Now, pointing our browser at http://localhost:9999/node/3/relationships/all will list all relationships of Trinity:



Our first traversal

To start with, the default Neo4j Traverser framework (updated to be more powerful than the current one) is supported over REST, with other implementations like Gremlin and Pipes to follow. The documentation on traversals is in the making here. There are a number of different parameters:
http://localhost:9999/node/3/traverse/node specifies a return type of "node", which returns node references. The other return types, relationship, position and path, return the corresponding information instead. The Traverser description is pluggable and has default values. A full description looks like this:
{
  "order": "depth first",
  "uniqueness": "node path",
  "relationships": [
    { "type": "KNOWS", "direction": "out" },
    { "type": "LOVES" }
  ],
  "prune evaluator": {
    "language": "javascript",
    "body": "position.node().getProperty('date')>1234567;"
  },
  "return filter": {
    "language": "builtin",
    "name": "all"
  },
  "max depth": 2
}

Note the pluggable descriptions of the "return filter" (what to include in the result) and the "prune evaluator" (where to stop traversing). Right now only JavaScript is supported for writing these more complicated constructs, but other languages are coming. Very cool. To finish, let's get all the nodes at depth 1 from Trinity via a trivial traversal:
curl -X POST -H Accept:application/json -H Content-Type:application/json -d '{"order":"breadth first"}' -v http://localhost:9999/node/3/traverse/node

This returns all nodes reachable over relationships of any type at depth one (the default) as a JSON array of node descriptions like the ones above, in this case http://localhost:9999/node/1 and http://localhost:9999/node/2.
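A client-side version of this traversal might look like the Python sketch below. The helper names are invented for this example, the request is built but not sent, and the sample response is a trimmed stand-in containing only the "self" key (real node descriptions carry more keys, as shown earlier).

```python
import json
from urllib.request import Request

def traversal_request(node_uri, return_type, description):
    """Build a POST to <node>/traverse/<returnType> with a JSON traverser description."""
    url = f"{node_uri}/traverse/{return_type}"
    headers = {"Accept": "application/json", "Content-Type": "application/json"}
    body = json.dumps(description).encode("utf-8")
    return Request(url, data=body, headers=headers, method="POST")

def self_uris(response_body):
    """Pull the "self" URI out of every node description in the JSON array."""
    return [node["self"] for node in json.loads(response_body)]

request = traversal_request("http://localhost:9999/node/3", "node", {"order": "breadth first"})

# Trimmed stand-in for the server's answer: real node descriptions carry more keys.
sample_response = '[{"self": "http://localhost:9999/node/1"}, {"self": "http://localhost:9999/node/2"}]'
print(request.full_url)
print(self_uris(sample_response))
```

Any of the other keys from the full description, such as "max depth" or "relationships", can be added to the dictionary passed as description.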

Summary

The Neo4j REST API, and with it the Neo4j REST Server, is great news for everyone who wants to use a graph database over the network, especially from PHP or .NET clients that have no good Java bindings. A first client wrapper for .NET by Magnus Mårtensson from Jayway is already underway, and a first PHP client is on Al James' GitHub.
This will also pave the way for higher-level sharding and distribution scenarios and can be used in many other ways. Stay tuned for a deeper explanation of the different traversal possibilities with Neo4j and REST in an upcoming post!