Wednesday, June 27, 2012

Neo4j 1.8.M05 - In the Details

Neo4j 1.8 Milestone 5 is available for immediate download, polishing up Neo4j with some nice detail work. Here are the highlighted changes...

Cypher

Cypher is Neo4j's friendly and expressive data language. Familiar seeming yet fresh, Cypher is your best option for creating and querying data.
Updates include:
  • The data browser in Neo4j’s web interface now supports multi-line Cypher queries.
  • CREATE and RELATE can now introduce path identifiers, like this:
    CREATE p=(n {name:'Miles'})-[:PLAYS]->
          (m {instrument:'Trumpet'}) 
        return p;
    
  • String literals can now contain some escape characters, like:
    CREATE (n {text:"single \' and double \" quotes"});
Also, these fixes have been incorporated: 
  • Fixes #600: Double optional with no matching relationships returns too many rows
  • Fixes #613: Missing dependencies not reported correctly for queries with RELATE/SET/DELETE
  • And some fixes in the handling of optional paths
For more details about Cypher see the excellent tutorials and reference in the Neo4j Manual.

Kernel

Neo4j's Kernel refers to the core engine that performs highly optimized database operations, all the way down to the bare metal. It's what makes Neo4j a database rather than a simple store or an abstraction layer.

Updates here include:
  • Logical log configuration now accepts more specification options, like number of days or size on disk. The keep_logical_logs configuration supports values such as "10 days", "200M size" or "12 files". Regardless of configuration, there will always be at least the latest non-empty logical log left.
  • Increased multithreaded performance, thanks to a reduced amount of synchronization while memory mapping files.
More details about Neo4j's Kernel can be found in the Neo4j Manual.
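To make the three value shapes for keep_logical_logs concrete, here is a minimal sketch in plain Java of how such strings could be read. This is not Neo4j's own parser: the class name, the Rule type and the reading of k/M/G as powers of 1024 are all assumptions made for illustration.

```java
import java.util.Locale;

/** Illustrative only: a hypothetical parser, not Neo4j's implementation. */
final class LogRetentionSketch {

    /** A parsed retention rule: an amount and the kind of limit it applies to. */
    static final class Rule {
        final long amount;   // days, bytes, or number of files
        final String kind;   // "days", "size" or "files"

        Rule(long amount, String kind) {
            this.amount = amount;
            this.kind = kind;
        }
    }

    /** Parses values shaped like "10 days", "200M size" or "12 files". */
    static Rule parse(String value) {
        String[] parts = value.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected '<amount> <kind>': " + value);
        }
        String kind = parts[1].toLowerCase(Locale.ROOT);
        if (!kind.equals("days") && !kind.equals("size") && !kind.equals("files")) {
            throw new IllegalArgumentException("Unknown kind: " + parts[1]);
        }
        return new Rule(parseAmount(parts[0]), kind);
    }

    /** Reads a number with an optional k/M/G suffix (assumed here to mean powers of 1024). */
    private static long parseAmount(String text) {
        long multiplier = 1;
        switch (Character.toUpperCase(text.charAt(text.length() - 1))) {
            case 'K': multiplier = 1024L; break;
            case 'M': multiplier = 1024L * 1024; break;
            case 'G': multiplier = 1024L * 1024 * 1024; break;
            default: break;
        }
        String digits = multiplier == 1 ? text : text.substring(0, text.length() - 1);
        return Long.parseLong(digits) * multiplier;
    }

    public static void main(String[] args) {
        for (String value : new String[] {"10 days", "200M size", "12 files"}) {
            Rule rule = parse(value);
            System.out.println(value + " -> " + rule.amount + " " + rule.kind);
        }
    }
}
```

The point of the sketch is only that each value pairs an amount with a kind of limit (age, disk size or file count); how Neo4j applies the limit is described in the manual.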

Indexing

While a graph can itself be seen as an index, Neo4j's indexing is used for direct lookup of data based on simple criteria. Typically, this is used to start a graph query using Cypher.

Changes to indexing include:
  • Removal of lucene_writers_cache_size. Now, only lucene_searcher_cache_size needs to be specified. The value will be used for both, since it doesn't make sense to have a writer without a searcher and it isn't possible to have a searcher without a related writer. One less configuration point to worry about.
  • Contention when getting an index searcher for querying has been significantly loosened, improving overall performance.
For more about indexing, look to Neo4j Manual Chapter 15.

Get Neo4j 1.8.M05

Neo4j 1.8.M05 is available for:

Cheers,
the Neo4j Team

Tuesday, June 19, 2012

Wanted: Your Help in Testing Neo4j JDBC Driver

Help us test the Cypher-JDBC-Driver

Update: The results that our awesome community provided in testing the driver are available.

As many of you know, Rickard Öberg did a lab project last December, developing a first prototype of a JDBC driver that connects to the Neo4j Server Cypher endpoint. It implements the JDBC APIs to execute Cypher statements remotely and returns the tabular results. The Driver is published on GitHub as an open source project.

The first tests of the driver covered JDBC-use-cases like:

  • LibreOffice/OpenOffice
  • IntelliJ IDEA
  • DbVisualizer
  • JDBC-ODBC bridge on Windows

JDBC as Integration Approach

Beyond simply getting it to work, it was fun to make this happen. Meanwhile, some of our customers are looking into integrating Neo4j into their BI solutions, so we suggested that they give the JDBC driver a try.

Ralf Becher from TIQView worked on integrating it with Qlikview and published quite impressive results.

With the existing feedback we worked on improving the driver and updated the following aspects:

  • Internal Refactoring and Bug fixes
  • Support for Streaming Mode of Neo4j-Server
  • Support to use it with an embedded graph database or an in-memory variant.

Next stop: Public Testing

To cover a greater range of SQL/JDBC tools than we know and use ourselves (after all, we mainly work on a NOSQL Graph Database), we would like to ask for YOUR HELP.

You certainly know, use and like some JDBC-related tools, and could try the Neo4j JDBC driver with those. We prepared a form for adding your findings, which is connected to a public Google Spreadsheet.

How to Test

Choose your tool, download the driver from the resources link and set it up to point to a Neo4j Server which has the Dataset.

You can download and set up the Server on your own, in which case the jdbc-url is jdbc:neo4j://localhost:7474. We have also prepared a Heroku instance that hosts the dataset, so it is accessible to everyone at http://jdbc.herokuapp.com, which would use the jdbc-url: jdbc:neo4j://jdbc.herokuapp.com

Some Sample Queries

 // user node
 START n=node(1) 
 RETURN n

 // number of nodes
 START n=node(*) 
 RETURN count(*)

 // user and friends
 START user=node:User(login="micha") 
 MATCH user-[:FRIEND]-friend
 RETURN user.name,ID(user),friend.name

 // other movies with these actors
 START user=node:User(login="micha") 
 MATCH user-[:RATED]-movie<-[:ACTS_IN]-actor-[:ACTS_IN]->other_movie
 RETURN other_movie.title, count(*) as occ
 ORDER BY occ DESC
 LIMIT 5
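
As a rough sketch of what such a test could look like from plain Java, the snippet below opens a connection with the local jdbc-url from this post and prints the rows of one of the sample queries. It uses only the standard java.sql API. The class and method names are made up for illustration, and it assumes that the driver jar is on the classpath and registers itself with DriverManager, which you should check against the driver's own documentation.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

/** Illustrative only: names in this class are hypothetical. */
final class CypherOverJdbcSketch {

    /** Builds a URL in the jdbc:neo4j://host:port form quoted in this post. */
    static String jdbcUrl(String hostAndPort) {
        return "jdbc:neo4j://" + hostAndPort;
    }

    /** Runs one Cypher statement and prints every row as tab-separated column=value pairs. */
    static void printRows(Connection connection, String cypher) throws SQLException {
        try (Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery(cypher)) {
            ResultSetMetaData meta = rows.getMetaData();
            int columns = meta.getColumnCount();
            while (rows.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= columns; i++) {   // JDBC columns are 1-based
                    if (i > 1) {
                        line.append('\t');
                    }
                    line.append(meta.getColumnLabel(i)).append('=').append(rows.getObject(i));
                }
                System.out.println(line);
            }
        }
    }

    public static void main(String[] args) {
        // Assumes a local Neo4j Server that hosts the test dataset.
        try (Connection connection = DriverManager.getConnection(jdbcUrl("localhost:7474"))) {
            printRows(connection, "START n=node(*) RETURN count(*)");
        } catch (SQLException e) {
            System.err.println("Could not run the query: " + e.getMessage());
        }
    }
}
```

Swapping the argument of jdbcUrl for jdbc.herokuapp.com would point the same code at the hosted dataset.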

If you want, you can also test out the integration with an embedded or in-memory Neo4j instance, e.g. by integrating it with Spring's JdbcTemplate.

Please make sure to take some notes and a screenshot. If you want to take it to the next level, please record a screencast or write a short blog post about your experience. Armed with this information, fill out the Survey Google Form and let us know what you think.

Resources

We will compile a blog post with all your contributions, update the Driver with all the necessary fixes and then make it available as part of the Neo4j distribution.

Thanks a lot for your support,

Michael, Peter, Andreas - the Neo4j Community Team

Saturday, June 16, 2012

Lab-friday - from ASCIIDOC to HTML 5 slides

Hi all!

Last lab-friday, Anders Nawroth made the neo4j manual toolchain produce deck.js HTML 5 presentations instead of the Neo4j Manual.

The interesting part isn’t really the example itself, but that the toolchain now is able to generate HTML directly, instead of the normal DocBook XML output. This can be used in any context where you want to publish (tested) code, graphs or integrated consoles etc. on a web page.

He is generating the slideshow at http://docs.neo4j.org/lab/decks/, even containing the live console (click on the "Console" button in the slide).
All this is generated from an ASCIIDOC source file.


The source can be found at https://github.com/nawroth/manual/blob/deckjs/src/main/resources/slides/example/index.txt, feel free to fork and improve. We hope to bring this into the Neo4j Manual master toolchain soon, so material can be even better reused in other formats. Also, the toolchain itself is explained in detail at http://docs.neo4j.org/chunked/snapshot/community-docs.html .

Cool stuff Anders!

Happy hacking,

Anders and Peter

Tuesday, June 12, 2012

Neo4j 1.8.M04 - Happy Paths


Neo4j 1.8 Milestone 4 is available today, offering a few new ways to help you find happy paths. To query a graph you use a traversal, which identifies paths of nodes and relationships. This release updates the capabilities of Neo4j's core Traversal Framework and introduces new ways to use paths in Cypher.

Graph Sherpa Mattias Persson

Mattias Persson works throughout the Neo4j code base, but is particularly well acquainted with the Traversal Framework, a core component of the Neo4j landscape. He's agreed to guide us on a traversal tour:
AK: So, what exactly is a Traversal?
MP: I would say that, starting from one or more given nodes in your graph, you move around to other nodes via their connected relationships in search of your answer. The traversal can be controlled in different ways, for example which relationships to traverse at any given position, ordering and so on. The general outcome is a list of paths from which the relevant information can be extracted.
AK: And the Traversal Framework, then. Is it just for describing a Traversal?
MP: Sure, it's for describing where the traversal should go, and it also provides the implementation that executes the traversal itself.
AK: Can you give an example, like how would I find the friends of my friends?
MP: So here the starting point is you, the node representing you. And you'd tell the traversal to follow KNOWS relationships or similar down to depth 2. Also every friend of friend should only be returned once (such uniqueness is by default). So in embedded code:
// 'me' is the node representing you; KNOWS is your RelationshipType
Iterable<Node> friendsOfFriends = traversal()
  .breadthFirst()
  .relationships(KNOWS)
  .evaluator(Evaluators.atDepth(2))
  .traverse(me).nodes();
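
For readers without a Neo4j instance at hand, the same idea can be sketched in plain Java over an in-memory map. This is not the Traversal Framework: the class, the method names and the sample data are invented for illustration. It walks breadth first, follows KNOWS in both directions, visits every node at most once and keeps only what is first reached at depth 2.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a plain-Java stand-in, not the Neo4j Traversal Framework. */
final class FriendsOfFriendsSketch {

    /** Records a KNOWS relationship that can be followed in both directions. */
    static void knows(Map<String, List<String>> graph, String a, String b) {
        graph.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        graph.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    /** A small made-up social graph. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> graph = new HashMap<>();
        knows(graph, "me", "anna");
        knows(graph, "me", "bob");
        knows(graph, "anna", "carl");
        knows(graph, "bob", "carl");
        knows(graph, "bob", "dana");
        knows(graph, "anna", "bob");
        return graph;
    }

    /** Returns the nodes first reached at exactly the given depth, walking breadth first. */
    static Set<String> atDepth(Map<String, List<String>> graph, String start, int depth) {
        Set<String> visited = new HashSet<>();      // global uniqueness: each node only once
        visited.add(start);
        Set<String> frontier = new LinkedHashSet<>();
        frontier.add(start);
        for (int level = 0; level < depth; level++) {
            Set<String> next = new LinkedHashSet<>();
            for (String person : frontier) {
                for (String friend : graph.getOrDefault(person, Collections.<String>emptyList())) {
                    if (visited.add(friend)) {      // true only the first time we see this node
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return frontier;
    }

    public static void main(String[] args) {
        System.out.println(atDepth(sampleGraph(), "me", 2));
    }
}
```

Because direct friends are marked as visited at depth 1, they are never reported again as friends of friends, and a person reachable through two different friends shows up only once.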
AK: OK, interesting. It's such a different way of querying, though. For people who are new to Traversals, what's your advice for how to 'get it'?
MP: Look at traversals as local, where instead of having your entire database and querying it globally by matching values, you start at a known point where your relationships become your index and lead you to what you're looking for. So you describe how the traversal will behave, where it should go and not go, and you receive callbacks about relevant data, as per your description.
AK: And what are the benefits of the new update to the Traversal Framework?
MP: There are some additions here. One is bidirectional traversals, which is essentially like describing two traversals, one from each side (each with one or more given start nodes); where they collide in the middle, they produce results in the form of paths. In most scenarios where you know both the start and end node(s), a bidirectional traversal will get you your answer with far fewer relationships traversed, i.e. a faster traversal. The reason is that the number of relationships that need to be traversed at each depth increases exponentially, so traversing half the depth from each side cuts down on that growth. The "all paths" and "all simple paths" implementations in the graph-algo collection use bidirectional traversals now. Dijkstra and A* will probably move over to that as well, and it's essentially just a small change in your traversal description to make it bidirectional.
There's also an addition to the "expander", i.e. the one responsible for deciding which relationships to follow given a position in the traversal. Previously it could only make decisions based on the node for the current position, but now it can view the whole path leading up to the current position.
There are also some minor things, like being able to get metadata about the traversal (number of relationships visited and so forth) and more convenience methods on the Path interface.
AK: Nice. That's a lot of good stuff. How will REST users be able to take advantage of these new capabilities?
MP: Well, you can soon expect Cypher to optimize queries that can take advantage of it. That's the usual thing, just keep writing queries and we'll keep making them faster.
AK: Thanks so much Mattias for all the hard work.
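
The argument Mattias makes for bidirectional traversals can be tried out without Neo4j. The following is a minimal sketch in plain Java, not the Traversal Framework or the graph-algo code: all names are hypothetical, the graph is undirected, and the 10-dimensional hypercube is just an arbitrary graph in which every node has many neighbours. It runs the same breadth-first search once from the start node only and once from both ends, counting the relationships examined.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a plain-Java stand-in, not Neo4j's implementation. */
final class BidirectionalSearchSketch {

    /** Outcome of a search: hops between the nodes and relationships examined. */
    static final class Result {
        final int distance;               // -1 when the nodes are not connected
        final long relationshipsVisited;

        Result(int distance, long relationshipsVisited) {
            this.distance = distance;
            this.relationshipsVisited = relationshipsVisited;
        }
    }

    /** One direction of the search: depth of every node seen so far plus the current frontier. */
    static final class Side {
        final Map<Integer, Integer> depth = new HashMap<>();
        List<Integer> frontier = new ArrayList<>();

        Side(int root) {
            depth.put(root, 0);
            frontier.add(root);
        }
    }

    /** Expands one level of 'from'; returns the path length if it collides with 'to', else -1. */
    static int expandLevel(Map<Integer, List<Integer>> graph, Side from, Side to, long[] visited) {
        List<Integer> next = new ArrayList<>();
        for (int node : from.frontier) {
            for (int neighbour : graph.get(node)) {
                visited[0]++;
                if (from.depth.containsKey(neighbour)) {
                    continue;                         // already reached from this side
                }
                int depth = from.depth.get(node) + 1;
                from.depth.put(neighbour, depth);
                Integer otherDepth = to.depth.get(neighbour);
                if (otherDepth != null) {
                    return depth + otherDepth;        // the two sides met
                }
                next.add(neighbour);
            }
        }
        from.frontier = next;
        return -1;
    }

    /** Breadth-first search; when 'bidirectional' is false only the start side ever expands. */
    static Result search(Map<Integer, List<Integer>> graph, int start, int end, boolean bidirectional) {
        if (start == end) {
            return new Result(0, 0);
        }
        Side a = new Side(start);
        Side b = new Side(end);
        long[] visited = {0};
        while (!a.frontier.isEmpty() && !b.frontier.isEmpty()) {
            boolean expandA = !bidirectional || a.frontier.size() <= b.frontier.size();
            int length = expandA ? expandLevel(graph, a, b, visited) : expandLevel(graph, b, a, visited);
            if (length >= 0) {
                return new Result(length, visited[0]);
            }
        }
        return new Result(-1, visited[0]);
    }

    /** Builds a hypercube: nodes are bit patterns, neighbours differ in exactly one bit. */
    static Map<Integer, List<Integer>> hypercube(int dimensions) {
        Map<Integer, List<Integer>> graph = new HashMap<>();
        for (int node = 0; node < (1 << dimensions); node++) {
            List<Integer> neighbours = new ArrayList<>();
            for (int bit = 0; bit < dimensions; bit++) {
                neighbours.add(node ^ (1 << bit));
            }
            graph.put(node, neighbours);
        }
        return graph;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> graph = hypercube(10);
        Result oneSided = search(graph, 0, 15, false);
        Result twoSided = search(graph, 0, 15, true);
        System.out.println("one-sided:     distance " + oneSided.distance
                + ", relationships visited " + oneSided.relationshipsVisited);
        System.out.println("bidirectional: distance " + twoSided.distance
                + ", relationships visited " + twoSided.relationshipsVisited);
    }
}
```

Both runs report the same distance, but the bidirectional run examines fewer relationships because each side only has to cover about half the depth. In a directed graph the side that starts at the end node would have to follow relationships in reverse, which this undirected sketch does not need to handle.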

Paths as Expressions

In Cypher, much of the work in a statement involves working with paths. Now, paths themselves can be treated as expressions. This is most immediately explained with a simple example. Prior to 1.8.M04, you could capture a path with an identifier like this:
START n=node(...), m=node(...) 
    match p=n-->()<--m 
    return collect(p) as allPaths
With paths as expressions, that can be re-written as:
START n=node(...), m=node(...) 
    return n-->()<--m as allPaths
Simply return the path that you want. There are, of course, much more fun things that can be done with this, which we'll leave to explore another time. Because the best thing to do right now is...

Try Neo4j 1.8.M04

Neo4j 1.8.M04 is available:

Cheers,
the Neo4j Team

Friday, June 8, 2012

GraphConnect: graph leadership visualized


Heyo,
It's been a while, and I have missed you all. I am extremely stoked to announce GraphConnect, a conference that is all about graphs, and the graphs you find in your day to day life.

Who:

We have some great featured speakers lined up

James Fowler, Co-Author of Connected. His book Connected discusses the importance and impact of your social graph. Here's his interview with Stephen Colbert.
Rebecca Parsons, CTO of ThoughtWorks. With more than 20 years of application development experience, Rebecca will present on today's evolving data and how it can be used in modern society. Here's a video of one of her talks at QCon London.

What:

What is this conference about? It's about connecting and celebrating the graph's new relevance to modern data. GraphConnect SF is a place where developers, technical decision makers, and thought leaders alike will convene to celebrate modern data and its expanding connectedness, through graph databases, network analysis, social applications and open-source projects. Oh, and we will also have a sweet After Party with food, drinks and a fantastic view.

GraphConnect will include Tutorials, an unConference, and some great presentations, jam packed into two days.

When:

5-6 November 2012. Going to QCon SF? Perfect, you will already be in town and can join us beforehand.

Where:

The Hyatt Regency downtown, with the After Party right across the street.


GraphConnect 2012

Hyatt Regency Embarcadero
5 Embarcadero Center,
San Francisco, California, USA 94111

After Party

Sens Restaurant
4 Embarcadero Center
Promenade Level
San Francisco, California 94111

Why:

I don't see any reason why not. A Graph Conference is extremely fitting for a company like Neo, where we value relationships just as much as the nodes themselves. Also, with Neo4j being the world's leading graph database, it makes sense that we are pioneering the graph space and its dialogue.

How:

How can you participate? We have multiple ways you can get involved, based on your love for graphs.

Register: Hop on the Early Adopter price. Community heard it first.

Sign up for GraphConnect Updates: Keep up to date with new speakers, new promos, and fun events leading up to the conference.
Click on the home page, fill in your email to the right, under the section "Keep Me Informed on GraphConnect!"

Join our Advisory Board: Want to help mold GraphConnect? We are devising a group that will be intimately involved with the preparation and structure of the conference. Again, we are nothing without our Community, and would love to get you all involved from the get-go.
Click on the home page, fill in your email to the right, and check "Yes, I would like to be part of the GraphConnect Advisory Board."

Propose your Graph: Want to be a part of GraphConnect? Have a cool project on a graph database, or some cool data visualized? Show us, we would love to hear from you.

Sponsor: Get your brand out in front of the GraphConnect community. Email us at graphconnect@neotechnology.com


-ayeeson