Thursday, November 21, 2013

Neo4j 2.0.0-RC1 – Final preparations

WARNING: This release is not compatible with earlier 2.0.0 milestones. See details below.

The next major version of Neo4j has been under development for almost a year now, methodically elaborated and refined into a solid foundation. Neo4j 2.0 is now feature-complete. We're pleased to announce the first Release Candidate build is available today.

With that feature-completeness in mind, let’s see what’s on offer...

Cypher Syntax - finishing touches
Two guiding principles of Cypher language design are readability and internal consistency. The basic syntax should be easy to understand, with little ambiguity about the intent of a query. This release includes a number of changes that follow those principles, resulting in much nicer-looking syntax with clearer semantics.

MATCH with properties
Cypher CREATE and MERGE clauses can have patterns with properties in them, but that syntax wasn’t previously supported in the MATCH clause. Now you can include properties in any pattern. This makes simple queries more concise; a query like this:

MATCH (a:Person) WHERE a.name = "Joe" RETURN a

Can now be written like this:

MATCH (a:Person {name:"Joe"}) RETURN a

OPTIONAL MATCH
Sometimes your data isn’t all the same – you are looking for a core pattern, but some matches have additional detail attached. We call this additional detail ‘optional’ because it isn’t required to match the core pattern (a bit like an outer join in SQL).

Previously, we expressed optional details with optional relationships, using the -[?]-> syntax. However, this sometimes proved confusing. To resolve the ambiguity, Stefan Plantikow came up with an excellent solution: separate the concerns.

Everything in a MATCH is now required, so the ? operator has been removed. For optional patterns, use the new OPTIONAL MATCH, which either returns matching data, or null if nothing is found.

For example, you can write:

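MATCH (a:Person)
OPTIONAL MATCH (a)-[:SPOUSE]->(b)
RETURN a, b

This will find all Person nodes (SPOUSE stands in for whatever relationship type your data uses). If a person has a spouse, Neo4j will find them; otherwise b will be null. Lovely!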
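The outer-join analogy can be sketched with a small Python model. This is only an illustration of the row semantics, not how Neo4j evaluates queries; the data and the names `people`, `spouse_of`, `optional_match_spouse` and `required_match_spouse` are hypothetical, and the model assumes at most one spouse per person.

```python
# Toy model: OPTIONAL MATCH keeps every person and fills in None (null)
# when the optional part is missing; a plain MATCH drops such rows.
# All names and data here are hypothetical.

people = ["Joe", "Steve", "Anna"]
spouse_of = {"Joe": "Anna", "Anna": "Joe"}  # Steve has no spouse


def optional_match_spouse(people, spouse_of):
    """Return one (a, b) row per person; b is None when no spouse exists."""
    return [(a, spouse_of.get(a)) for a in people]


def required_match_spouse(people, spouse_of):
    """Plain MATCH analogue: people without a spouse produce no row."""
    return [(a, spouse_of[a]) for a in people if a in spouse_of]


rows_optional = optional_match_spouse(people, spouse_of)
rows_required = required_match_spouse(people, spouse_of)
```

In this model Steve still appears in `rows_optional`, paired with `None`, while `rows_required` omits him entirely.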

MERGE for relationships
Cypher has built-in support for ‘get-or-create’: with a single query you can find existing data, or create it if it’s missing. Since this is a common operation for any database, we wanted to make it work very well in Cypher. To get-or-create, you use MERGE for single nodes or MERGE for relationships (but not both at the same time). MERGE for relationships replaces the old CREATE UNIQUE clause.

For example, to get-or-create a relationship between two nodes:

MATCH (a:Person {name: "Joe"}), (b:Person {name: "Steve"})
MERGE (a)-[r:KNOWS]->(b)
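As a rough illustration of what get-or-create means for a relationship, here is a minimal in-memory sketch in Python. It is not Neo4j's implementation; the `relationships` set and the `merge_relationship` helper are hypothetical names chosen for this example.

```python
# Toy model of get-or-create for a directed relationship.
# `relationships` and `merge_relationship` are hypothetical.

relationships = set()  # each entry is (start, type, end)


def merge_relationship(start, rel_type, end):
    """Return (relationship, created): reuse an existing entry or add a new one."""
    key = (start, rel_type, end)
    created = key not in relationships
    relationships.add(key)  # adding an existing key leaves the set unchanged
    return key, created


first = merge_relationship("Joe", "KNOWS", "Steve")   # creates the relationship
second = merge_relationship("Joe", "KNOWS", "Steve")  # finds the existing one
```

Running the same merge twice leaves exactly one relationship in the set, which is the property that makes get-or-create safe to repeat.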

Simpler syntax for MERGE ON MATCH and ON CREATE
When you use a MERGE clause in your query, there are two possible outcomes: Neo4j will either find all existing matching data, or create entirely new data. The special sub-clauses ON MATCH and ON CREATE allow you to distinguish between these outcomes.

We’ve simplified the syntax of the ON MATCH and ON CREATE clauses, removing the need to cite an identifier from the related MERGE pattern. Where you used to write:

MERGE (a:Person {name: "Joe"}) ON CREATE a SET a.created = {}

You can now write (dropping the 'a' in ON CREATE):

MERGE (a:Person {name: "Joe"}) ON CREATE SET a.created = {}
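The two outcomes of a MERGE can be pictured as two optional hooks, one run on creation and one run on a match. The Python sketch below is a simplified model under that assumption: it treats the label and name as a unique key, whereas a real MERGE may match several nodes. The names `nodes`, `merge_node`, `stamp_created` and `count_seen` are hypothetical.

```python
import time

# Toy model of MERGE with ON CREATE / ON MATCH hooks.
# All names here are hypothetical; (label, name) is assumed to be unique.

nodes = {}  # (label, name) -> property dict


def merge_node(label, name, on_create=None, on_match=None):
    """Find the node or create it, then run the hook for that outcome."""
    key = (label, name)
    if key in nodes:
        node = nodes[key]
        if on_match is not None:
            on_match(node)
    else:
        node = {"name": name}
        nodes[key] = node
        if on_create is not None:
            on_create(node)
    return node


def stamp_created(node):
    """ON CREATE analogue: record when the node was first created."""
    node["created"] = time.time()


def count_seen(node):
    """ON MATCH analogue: count how often an existing node was found."""
    node["seen"] = node.get("seen", 0) + 1


joe = merge_node("Person", "Joe", on_create=stamp_created, on_match=count_seen)
joe_again = merge_node("Person", "Joe", on_create=stamp_created, on_match=count_seen)
```

The first call takes the create path and sets `created`; the second call takes the match path and only increments `seen`.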

Null handling
We’ve changed the way Cypher handles null in two important ways. First, many expressions now return null for invalid arguments, for example HEAD([]) or a slice such as [][1..3]. Second, we’ve embraced ternary logic: null can be used as a “maybe” value in expressions with AND, OR and NOT, which makes it easier to compute predicates when information (like property values) is missing.
For more details, refer to the Neo4j Manual section on working with null.
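To make the "maybe" value concrete, here is a small Python sketch of three-valued logic, with `None` standing in for null. The helper names `and3`, `or3`, `not3` and `head` are hypothetical; the truth tables are the standard ternary ones, which is how we understand the behaviour described above.

```python
# Three-valued (ternary) logic sketch: True, False, and None for "unknown".
# Helper names are hypothetical.


def and3(a, b):
    """AND: a definite False wins; otherwise any unknown makes the result unknown."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """OR: a definite True wins; otherwise any unknown makes the result unknown."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a):
    """NOT: unknown stays unknown."""
    return None if a is None else (not a)


def head(items):
    """HEAD analogue: None for an empty list instead of an error."""
    return items[0] if items else None
```

For example, `and3(None, False)` is `False` because the answer does not depend on the unknown operand, while `and3(None, True)` stays `None`.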

Caution: manual upgrade between milestones
Data stores created with any previous milestone version cannot be used with 2.0.0-RC1 unless a manual upgrade is performed. This is due to incompatible changes made to the store files. Please proceed with caution, backing up your data before attempting to manually upgrade.

Manual upgrade (only from 2.0.0-M06, and after you've backed up):
  1. Cleanly shut down the old version, Neo4j 2.0.0-M06
    $ bin/neo4j stop
  2. Navigate to the database directory
    $ cd data/graph.db
  3. Delete the label scan store (this is the critical part that has a new format). It will be recreated on startup.
    $ rm -rf schema/label
  4. Return to the Neo4j home directory and start the new version, Neo4j 2.0.0-RC1
    $ cd ../..
    $ bin/neo4j start
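If you would rather script the deletion step, the following Python sketch removes the label scan store using a path built from the Neo4j home directory, so it does not depend on the current working directory. It assumes the default store location data/graph.db used in the steps above; the function name `remove_label_scan_store` and the example path are hypothetical. Run it only after a clean shutdown and a backup.

```python
import shutil
from pathlib import Path


def remove_label_scan_store(neo4j_home):
    """Delete <neo4j_home>/data/graph.db/schema/label if it exists.

    Returns True if the directory was removed, False if it was not there.
    Assumes the default store location; adjust if yours differs.
    """
    label_dir = Path(neo4j_home) / "data" / "graph.db" / "schema" / "label"
    if not label_dir.is_dir():
        return False
    shutil.rmtree(label_dir)
    return True


# Example call (hypothetical install path):
# remove_label_scan_store("/opt/neo4j")
```

Only the label directory is removed; the rest of the schema directory and the store files are left untouched.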


Of course, as always, you can safely upgrade between GA versions like 1.9.5 and the coming 2.0.0.

Breaking changes
While this release is feature complete, there are some breaking changes since milestone 6.

Breaking changes include:
  • textual status codes, which alter the error response from the transactional endpoint
  • clean-up of deprecated APIs
  • removal of reference node (use labels instead)
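Client code that inspects errors from the transactional endpoint may need adjusting for textual status codes. The Python sketch below shows one way to handle them, assuming the response body is JSON with an "errors" list whose entries carry a dotted textual "code" and a "message". The payload shape, the sample code string and the `classify_errors` helper are assumptions made for illustration; check the changelog and manual for the exact format.

```python
import json


def classify_errors(response_body):
    """Return (classification, code, message) for each error in the response.

    Assumes codes look like "Neo.<Classification>.<Category>.<Title>";
    anything else is reported with the classification "Unknown".
    """
    payload = json.loads(response_body)
    classified = []
    for error in payload.get("errors", []):
        code = str(error.get("code", ""))  # str() also tolerates numeric codes
        parts = code.split(".")
        classification = parts[1] if len(parts) == 4 else "Unknown"
        classified.append((classification, code, error.get("message", "")))
    return classified


# Hypothetical sample response, for illustration only.
sample = (
    '{"results": [], "errors": [{"code": '
    '"Neo.ClientError.Statement.InvalidSyntax", "message": "Invalid input"}]}'
)
classified = classify_errors(sample)
```

Branching on the classification segment rather than on the full code keeps client logic stable if individual codes are added later.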

For all the details, please refer to the 2.0.0.RC1 changelogs.

Next steps

Now that the Release Candidate is ready, we’d love for you to try it out. Between now and the GA release, we will only be including bug fixes. Please give us feedback about any issues you encounter by reporting problems on our Google Group and asking questions on Stack Overflow.

Andreas, on behalf of Team Neo


wouter said...

Is it now possible to automatically upgrade a 1.9 database to a 2.0 database?

Jan Van den bosch said...

Was it really necessary to downright disallow the "old" optional match?

That's a lot of queries that I have to update now.

Falmarri said...

Is this tagged in git? I don't see the tag.

Falmarri said...

Nevermind, I'm just stupid

Chris Leishman said...

Hi Jan!

It was necessary for the new 2.0 syntax, as the old ? operator relied on the presence of the START clause to determine which end of the pattern was optional. Without having a START clause, we can't work that out anymore.

However, you can always prefix your queries with 'CYPHER 1.9 ', and you'll be able to continue using the same queries as before whilst using Neo4j 2.0. Of course, you won't get any of the benefits of the newer syntax, and some underlying performance improvements, but you will then be able to update your queries over time.

Thanks for playing with Neo4j!

Paul T. said...

Ooooh, nice! I like MATCH with properties :)

Tom Zeppenfeldt said...

The devil is in the detail...

when doing exactly as described in the manual 2.0.0 M06 to 2.0.0 RC1 upgrade, things go wrong, because

$ rm -rf schema/label

should be followed by

$ cd ../..

Tom Zeppenfeldt said...

Cool , even this works

match (a)-[r:hasParent* {hierarchy:"SomeHierarchy"}]-> (b) return,

Anonymous said...

The shortestPath function is only returning 16 nodes at a time, no matter where in the graph I start. Is this a bug?

Here is the query:

START s=node(0), n=node(*)
MATCH p=shortestPath((s)-[*]->(n:label))
RETURN length(p), id(n)
ORDER BY length(p) asc

NeoMD said...

Regarding shortestPath it seems that it only returns paths of length up to 15.