Tuesday, March 19, 2013

Neo4j.org 3.0 Launch


A new year, a new look. After gathering a lot of feedback on the launch of neo4j.org after GraphConnect last year, we decided to invest more time and effort into making the site the main hub for information around Neo4j.



Goals


One major goal is to make it easier for you to get up and running with Neo4j. We hope to achieve this by providing everything in one place, from the download, set-up screencasts and step by step instructions to the rich choice of language support and the appropriate drivers.

We also want to make it easier for people who have never worked with graph databases to learn about Neo4j. So we created the infrastructure and started work on learning paths that tell a consistent story around a use case or technology involving Neo4j. Currently we feature learning paths for Java and for Cypher, but there will be many more to come. Any input on how to structure the paths and present the material is highly welcome!


Layout & Usability


Another aspect is a more consistent layout; unlike the previous site, it wasn’t done by database developers ;) Besides the main page content, each page has a number of related content items of different types. Some of those items are featured in a big box (several in a slider); others are listed below in a tiled view. That helps us provide content relevant to the page in an unobtrusive manner.


Finding interesting content quickly should be easy. The interactive graph showing important concepts on the start page, the footer quick links, the main navigation areas and the site search should all help you find relevant content. The related items per page and the learning paths will help a lot too, we hope.

Content


We now incorporate many types of content, from links, videos, books and presentations to static textual information and content pulled from our manual or GitHub repositories. We also integrate event information from Google Calendar, tweets and data from Google Spreadsheets.

Open Source Site


The whole site is hosted on GitHub, so we would love to invite you to contribute. Feel free to fork the repository and issue a pull request, be it for new pages, content (most appreciated) or layout fixes.

If you have any interesting related content to add to any page, be it a blog post, presentation, video, GitHub project or sample application you came across, please don’t hesitate to ping us via email or a GitHub issue, or to issue a pull request.

We are also trying to expand the topic pages, for instance the language pages (e.g. for Ruby), into learning paths and to add more content there. So we want to encourage everyone to contribute to these pages and perhaps even become a mentor for a content page.

Cheers Axel, Peter and Michael


Wednesday, March 6, 2013

Neo4j 1.9.M05 released - wrapping up


Hi all,

We are very proud to announce the next milestone of the Neo4j 1.9 release cycle. This time, we have been trying to introduce as few big changes as possible and instead concentrate on things that make the production environment a more pleasant experience. That means Monitoring, Cypher profiling, Java7 and High Availability were the targets for this work. Let’s look at some of them:


Java7 Support

As of Neo4j 1.9.M05, Java7 (Oracle JDK) is officially supported as the default runtime. We verified and adjusted for some of the differences, e.g. around sorting, so that Java7 is a stable runtime for Neo4j.

High Availability and clustering

There have been a number of improvements around the chattiness of the HA protocol, making the cluster communication more efficient.

Kernel, Monitoring and Server

The IndexProvider interfaces are now in line with normal kernel extensions, making the system more consistent in design.
For the Neo4j REST Server, we added support for X-Forwarded-Host and X-Forwarded-Proto headers to allow parameterising links in data for hosting behind proxy servers.
Also, the JMX information beans will now provide info on all configuration values, including the defaults not explicitly set, enabling better diagnosing and tuning of your Neo4j database.
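
To check how the REST server builds links when it sits behind a proxy, you could simulate the proxy's headers yourself. The Python sketch below only builds such a request and does not send it. The helper name proxied_request and the host graph.example.com are made up for illustration, and the URL assumes a default local server on port 7474.

```python
import urllib.request


def proxied_request(url, public_host, public_proto):
    """Build a request carrying the headers a reverse proxy would normally add.

    Hypothetical helper for illustration; not part of any Neo4j driver.
    """
    headers = {
        "X-Forwarded-Host": public_host,    # host name clients see
        "X-Forwarded-Proto": public_proto,  # scheme clients use (http or https)
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers), headers


# Assumed values: a default local server and an invented public host name.
request, sent_headers = proxied_request(
    "http://localhost:7474/db/data/", "graph.example.com", "https"
)

print(request.full_url)
for name, value in sorted(sent_headers.items()):
    print(name + ": " + value)

# With a server running, urllib.request.urlopen(request) would send the request;
# the links in the response should then use the forwarded host and scheme.
```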

Cypher


DISTINCT is now lazy and keeps the incoming ordering, making these kinds of queries much faster and more memory-efficient. Handling of iterators from index lookups and global graph operations is now lazy too, as it should have been. Thanks to Wes Freeman for spotting this.

A first version of support for profiling Cypher statements has been introduced, together with a matching PROFILE neo4j-shell command. It reports an execution plan with metrics in addition to the result output, for example:

Welcome to the Neo4j Shell! Enter 'help' for a list of commands

neo4j-sh (0)$ profile START n = node(0) MATCH (n)-[r]-(e) WITH n RETURN count(*);
+----------+
| count(*) |
+----------+
| 0        |
+----------+
1 row
0 ms

ColumnFilter(symKeys=["  INTERNAL_AGGREGATE-939275295"], returnItemNames=["count(*)"], _rows=1, _db_hits=0)
EagerAggregation(keys=[], aggregates=["(INTERNAL_AGGREGATE-939275295,CountStar)"], _rows=1, _db_hits=0)
ColumnFilter(symKeys=["e", "n", "r"], returnItemNames=["n"], _rows=0, _db_hits=0)
TraversalMatcher(trail="(n)-[r WHERE true AND true]-(e)", _rows=0, _db_hits=1)
ParameterPipe(_rows=1, _db_hits=0)
neo4j-sh (0)$


The profiling information is also available via the Java and REST APIs and the Neo4j Console.


As always, there have been many small fixes and improvements not listed here. Many thanks for all the contributions to Neo4j that addressed many of the issues - it helps a lot to get comments, feedback and patches for things that need improvement!

For the full list of changes, as always, see
https://github.com/neo4j/neo4j/blob/master/packaging/standalone/standalone-enterprise/src/main/distribution/text/enterprise/CHANGES.txt

Yours sincerely,

Peter for the Neo4j Team

Importing data into Neo4j - the spreadsheet way

I am sure that many of you are very technical people, very knowledgeable about all things Java, Dr. Who and many other things - but in case you have ever met me, you would probably have noticed that I am not. And I don’t want to be. I love technology, but have never had the talent, inclination or education to program - so I don’t. But I still want to get data into Neo4j - so how do I do that?

There are many technical tools out there (definitely look here, here and here), but I needed something simple. So my friend and colleague Michael Hunger came to the rescue and offered to help create a spreadsheet for importing data into Neo4j.

You will find the spreadsheet here, and you will find two components:

  1. an instruction sheet. I will get to that later.
  2. a data import sheet. Let’s look at that first.

The Data Import Sheet

This sheet is composed of two parts:
  • columns A, B and C: these contain the data for the Nodes of our graph, using an “id”, a “name”, and a “type”
  • columns F, G and H: these contain the data for the Relationships of our graph, having a “from-id” (where the relationship starts), a “to-id” (where the relationship ends), and a “relationship type”. Columns F and G reference the nodes and their ids in column A.

And then comes the secret sauce: how to create Cypher statements from these nodes and relationships. For this we use very simple statements that leverage the columns mentioned above, the cypher syntax and string concatenation. Look at the columns D and I:
  • cypher statements to create the nodes:

="create n={id:'"&A2&"', name:'"&B2&"', type:'"&C2&"'};"


output for row 2:


create n={id:'1', name:'Amada Emory', type:'Female'};

As you can see, it takes the id, name and type properties from columns A, B and C, and puts these into a “create” cypher statement.
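
For those who would rather script this step, the same string concatenation can be sketched in Python. This is only an illustration of what the formula in column D does; the function name node_statement is made up, and like the spreadsheet formula it does no escaping, so a value containing a single quote would produce a broken statement.

```python
def node_statement(node_id, name, node_type):
    """Mirror the column D formula: concatenate one row into a create statement.

    Hypothetical helper for illustration; values are inserted as-is, without escaping.
    """
    return (
        "create n={id:'" + node_id
        + "', name:'" + name
        + "', type:'" + node_type
        + "'};"
    )


# Row 2 of the import sheet (columns A, B and C).
rows = [("1", "Amada Emory", "Female")]

for row in rows:
    print(node_statement(*row))
# prints: create n={id:'1', name:'Amada Emory', type:'Female'};
```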

  • cypher statements to create the relationships:

="start n1=node:node_auto_index(id='"&F2&"'), n2=node:node_auto_index(id='"&G2&"')  create n1-[:"&H2&"]->n2;"

output for row 2:


start n1=node:node_auto_index(id='1'), n2=node:node_auto_index(id='11') create n1-[:MOTHER_OF]->n2;

This one is a little bit more complicated, as it uses Neo4j’s auto-index: in order to create the relationship, we first have to look up the start node and the end node from the auto-index using the id property. Then the create statement creates the relationship based on the relationship type in column H.
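
The relationship formula can be sketched in Python in the same way. Again, this only illustrates the concatenation in column I; the function name relationship_statement is made up, and the statement it builds assumes that auto-indexing on the id property is enabled, as described above.

```python
def relationship_statement(from_id, to_id, rel_type):
    """Mirror the column I formula: look up both nodes by id, then connect them.

    Hypothetical helper for illustration; values are inserted as-is, without escaping.
    """
    return (
        "start n1=node:node_auto_index(id='" + from_id + "'), "
        + "n2=node:node_auto_index(id='" + to_id + "') "
        + "create n1-[:" + rel_type + "]->n2;"
    )


# Row 2 of the import sheet (columns F, G and H).
print(relationship_statement("1", "11", "MOTHER_OF"))
# prints: start n1=node:node_auto_index(id='1'), n2=node:node_auto_index(id='11') create n1-[:MOTHER_OF]->n2;
```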

So with this, we end up with two columns containing a bunch of cypher statements. So then what?

The Instructions Sheet

In the first sheet of the spreadsheet, you will find a bunch of instructions. Basically, you need to go through the following steps:
  • download and unzip Neo4j server.
  • copy/paste the cypher statements from the Import Sheet into a text file.
  • wrap these with a neo4j transaction (begin, commit) - so that all of the statements get persisted to disk in the same transaction (or none of them in case of an error). Not important for smaller datasets, more important for larger datasets.
  • some instructions on how to enable auto-indexing on Neo4j. This is important, because as you insert data into the database, it needs to get indexed for setting up the relationships properly (see above), and for future use.
  • and some instructions on how you can pipe the text file into the neo4j shell - if necessary. For small datasets (and therefore a limited number of cypher statements) you can get by with copy/pasting the text file into the Web-UI console - but that might not always work.
  • starting the server and browsing the Web-UI
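
The copy/paste-and-wrap steps above could also be scripted. The Python sketch below assembles the text that would go into the import file: node statements first, then relationship statements, wrapped in begin and commit. The function name build_import_script is made up, and putting begin and commit each on a line of their own is an assumption about how the shell expects them.

```python
def build_import_script(node_statements, relationship_statements):
    """Wrap all statements in one transaction, nodes before relationships.

    Hypothetical helper for illustration. Nodes come first so that the
    relationship statements can find them in the auto-index.
    """
    lines = ["begin"]
    lines.extend(node_statements)
    lines.extend(relationship_statements)
    lines.append("commit")
    return "\n".join(lines) + "\n"


# One node statement and one relationship statement taken from the import sheet.
script = build_import_script(
    ["create n={id:'1', name:'Amada Emory', type:'Female'};"],
    ["start n1=node:node_auto_index(id='1'), n2=node:node_auto_index(id='11') create n1-[:MOTHER_OF]->n2;"],
)
print(script, end="")
```

Saving the printed text to a file gives you the text file mentioned in the instructions.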



And there we go: the dataset gets created, and Neo4j is ready for use. I hope this little overview was useful for you - it sure was useful for me when getting my hands dirty for the first time :) …