Navigating XML Graph using Cypher

Cypher is a neat way to manipulate a Neo4j database. It would be equally amazing if the Xml graph could be queried with Cypher as well.

Honestly, I must put credits to Michael for suggesting such a possibility here..

Well, let’s start with a simple xml file.

<library>
<author firstname="Earnest" lastname="Hemingway">
<works>
<book name="A Farewell to Arms" year="1929" />
<book name="For Whom the bell tolls" year="1940" />
<book name="The Old man and the sea" year="1951" />
</works>
<awards>
<award name = "Pulitzer Prize" category="Fiction" year="1953"></award>
<award name = "Nobel Prize" category="Literature" year="1954"></award>
</awards>
</author>
<author firstname="Victor" lastname="Hugo">
<works>
<book name="The Hunchback of Notre-Dame" year="1831" />
<book name="Les Misérables" year="1862" />
</works>
</author>
</library>

It’s a simple xml with nothing fancy in it. As explained in the previous posts here and here.. A neat neo4j graph can be made out of this…

Screenshot from 2014-07-22 21:16:46

So, let’s go about traversing this graph using Cypher.. And since we are trying to traverse an XML, let’s make a rough comparison to XPath.

Let’s fetch all the book nodes,

The Xpath to get all the book nodes, no matter where they are in the document, is

//book

For the same purpose, the Cypher query would be,

MATCH (books:book) RETURN books

This will fetch the following output for the above Graph,


bookNodes

 

Let’s now try to fetch the name of all books. The XPath will require only a slight modification,

//book/@name

The XPath will return the list as,

Attribute='name="A Farewell to Arms"'
Attribute='name="For Whom the bell tolls"'
Attribute='name="The Old man and the sea"'
Attribute='name="The Hunchback of Notre-Dame"'
Attribute='name="Les Misérables"'

The Cypher will only require a small modification. Instead of returning the entire node, fetch the ‘name’ attribute for the nodes.

MATCH (books:book) RETURN books.name

bookName Next up, let’s query the awards honoured to Earnest Hemingway,

This can be achieved via XPath as,

//author[@firstname=’Earnest’]/awards

which gives the output

<awards>
  <award name="Pulitzer Prize" category="Fiction" year="1953" />
  <award name="Nobel Prize" category="Literature" year="1954" />
</awards>

As for Cypher,

MATCH (author {firstname: “Earnest”})-[*]->(award:award) RETURN award

We try to fetch any node of the type ‘award’ connected to a node of type ‘author’ with firstname = Earnest

awards

The above examples are very much trivial, and aims to prove the possibility of using Cypher to traverse a database. I am looking for a huge and most importantly meaningful xml content which can be queried to get some useful information. Keep watching this space for more..