Neo4j is an open-source graph database that queries and manipulates data using its own Cypher Query Language and can export data to GraphML, an XML-based file format for graphs. Because OrientDB can read GraphML, it is relatively straightforward to import data from Neo4j into OrientDB. You can manage the import through the Console, Gremlin, or the Java API.
Neo4j is a registered trademark of Neo Technology, Inc. For more information on the differences between Neo4j and OrientDB, see OrientDB vs. Neo4j.
To export data from Neo4j into GraphML, you need to install the Neo4j Shell Tools plugin. Once you have this package installed, you can use the export-graphml utility to export the database.
Change into the Neo4j home directory:
Download the Neo4j Shell Tools:
curl http://dist.neo4j.org/jexp/shell/neo4j-shell-tools_2.3.2.zip -o neo4j-shell-tools.zip
Unzip the neo4j-shell-tools.zip file into the lib directory:
unzip neo4j-shell-tools.zip -d lib
Restart the Neo4j Server. In the event that it's not running, start it.
Once you have restarted Neo4j with the Neo4j Shell Tools installed, launch the Neo4j Shell tool, located in the bin directory:
./bin/neo4j-shell
Welcome to the Neo4j Shell! Enter 'help' for a list of commands
NOTE: Remote Neo4j graph database service 'shell' at port 1337
neo4j-sh (0)$
Export the database into GraphML:
export-graphml -t -o /tmp/out.graphml
Wrote to GraphML-file /tmp/out.graphml 0. 100%: nodes = 302 rels = 834 properties = 4221 time 59 sec total 59 sec
This exports the database to the path /tmp/out.graphml.
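Before importing, it can be useful to confirm that the exported file is well-formed and contains roughly the number of nodes and relationships that export-graphml reported. The following is a minimal sketch using only the JDK's StAX parser. The class name GraphMLCounter is hypothetical, and the default path /tmp/out.graphml simply mirrors the export command above. It counts the node and edge elements that the GraphML format defines, nothing more.

```java
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Neo4j or OrientDB.
final class GraphMLCounter {

    /** Returns {nodeCount, edgeCount} for the GraphML document read from input. */
    static long[] count(Reader input) {
        long nodes = 0;
        long edges = 0;
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // The export needs no DTD, so DTD processing is switched off.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader xml = factory.createXMLStreamReader(input);
            while (xml.hasNext()) {
                if (xml.next() == XMLStreamConstants.START_ELEMENT) {
                    String name = xml.getLocalName();
                    if ("node".equals(name)) {
                        nodes++;
                    } else if ("edge".equals(name)) {
                        edges++;
                    }
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Not a well-formed GraphML document", e);
        }
        return new long[] {nodes, edges};
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the export was written to /tmp/out.graphml, as in the text.
        String path = args.length > 0 ? args[0] : "/tmp/out.graphml";
        try (Reader reader = Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)) {
            long[] counts = count(reader);
            System.out.println("nodes = " + counts[0] + ", edges = " + counts[1]);
        }
    }
}
```

If the counts differ from the figures in the export-graphml output, the file was probably truncated or overwritten, and it is worth re-running the export before importing.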
There are three methods for importing the GraphML file into OrientDB: through the Console, through Gremlin, or through the Java API.
For more recent versions of OrientDB, you can import data from GraphML through the OrientDB Console. If you have version 2.0 or greater, this is the recommended method given that it can automatically translate the Neo4j labels into classes.
Log into the OrientDB Console.
In OrientDB, create a database to receive the import:
CREATE DATABASE PLOCAL:/tmp/db/test
Creating database [plocal:/tmp/db/test] using the storage type [plocal]...
Database created successfully.
Current database is: plocal:/tmp/db/test
Import the data from the GraphML file:
IMPORT DATABASE /tmp/out.graphml
Importing GRAPHML database database from /tmp/out.graphml...
Transaction 8 has been committed in 12ms
This imports the Neo4j database into the OrientDB database at plocal:/tmp/db/test.
For older versions of OrientDB, you can import data from GraphML through the Gremlin Console. If you have version 1.7 or earlier, this is the method to use. It is not recommended on more recent versions, because it ignores the labels declared in Neo4j. In this case, everything imports as the base vertex and edge classes (that is, V and E). This means that, after importing through Gremlin, you need to refactor your graph elements to fit a more structured schema.
To import the GraphML file into OrientDB, complete the following steps:
Launch the Gremlin Console:
$ORIENTDB_HOME/bin/gremlin.sh
         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
From the Gremlin Console, create a new graph, specifying the path to your graph database (here, /tmp/db/test):
g = new OrientGraph("plocal:/tmp/db/test");
==>orientgraph[plocal:/tmp/db/test]
Load the GraphML file into the graph object (that is, g):
Exit the Gremlin Console:
This imports the GraphML file into your OrientDB database.
The OrientDB Console calls the Java API. Using the Java API directly gives you greater control over the import process. For instance:
new OGraphMLReader(new OrientGraph("plocal:/temp/bettergraph")).inputGraph("/temp/neo4j.graphml");
This line imports the GraphML file into OrientDB.
Beginning in version 2.1, OrientDB allows you to modify the import process through custom strategies for vertex and edge attributes. It supports the following strategies:
com.orientechnologies.orient.graph.graphml.OIgnoreGraphMLImportStrategy: Defines attributes to ignore.
com.orientechnologies.orient.graph.graphml.ORenameGraphMLImportStrategy: Defines attributes to rename.
Ignore the vertex attribute __type__:
new OGraphMLReader(new OrientGraph("plocal:/temp/bettergraph")).defineVertexAttributeStrategy("__type__", new OIgnoreGraphMLImportStrategy()).inputGraph("/temp/neo4j.graphml");
Ignore the edge attribute weight:
new OGraphMLReader(new OrientGraph("plocal:/temp/bettergraph")).defineEdgeAttributeStrategy("weight", new OIgnoreGraphMLImportStrategy()).inputGraph("/temp/neo4j.graphml");
Rename the vertex attribute __type__ to type:
new OGraphMLReader(new OrientGraph("plocal:/temp/bettergraph")).defineVertexAttributeStrategy("__type__", new ORenameGraphMLImportStrategy("type")).inputGraph("/temp/neo4j.graphml");
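To decide which attributes to ignore or rename, it helps to know which attribute names the exported file declares. The GraphML format declares attributes in key elements, each with a for scope (such as node or edge) and an attr.name. The sketch below lists those declarations using only the JDK's StAX parser. The class name GraphMLKeys is hypothetical, and whether the Neo4j export declares a key for every property is an assumption you should check against your own file.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Neo4j or OrientDB.
final class GraphMLKeys {

    /** Returns entries such as "node:name" or "edge:weight", one per key declaration. */
    static List<String> declaredAttributes(Reader input) {
        List<String> result = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader xml = factory.createXMLStreamReader(input);
            while (xml.hasNext()) {
                if (xml.next() == XMLStreamConstants.START_ELEMENT
                        && "key".equals(xml.getLocalName())) {
                    String scope = xml.getAttributeValue(null, "for");
                    String name = xml.getAttributeValue(null, "attr.name");
                    if (name == null) {
                        // attr.name is optional in GraphML; fall back to the key id.
                        name = xml.getAttributeValue(null, "id");
                    }
                    // GraphML treats a missing "for" as applying to all elements.
                    result.add((scope == null ? "all" : scope) + ":" + name);
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Not a well-formed GraphML document", e);
        }
        return result;
    }
}
```

An entry such as node:__type__ in the result would correspond to the vertex attribute targeted by defineVertexAttributeStrategy in the examples above, and an entry such as edge:weight to the one targeted by defineEdgeAttributeStrategy.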
In the event that you experience memory issues while attempting to import from Neo4j, you might consider reducing the batch size. By default, the batch size is set to 1000. A smaller value causes OrientDB to process the import in smaller units.
Import with adjusted batch size through the Console:
IMPORT DATABASE /tmp/out.graphml batchSize=100
Import with adjusted batch size through the Java API:
new OGraphMLReader(new OrientGraph("plocal:/temp/bettergraph")).setBatchSize(100).inputGraph("/temp/neo4j.graphml");
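To make the effect of the batch size concrete, here is a toy sketch in plain Java. It is not OrientDB's importer: the class BatchSketch and its partition method are hypothetical and only illustrate the general idea that a smaller batch size means fewer elements are held per unit of work, at the cost of more units.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; OrientDB's internal batching may differ.
final class BatchSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> elements = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            elements.add(i);
        }
        System.out.println(partition(elements, 1000).size()); // prints 3
        System.out.println(partition(elements, 100).size());  // prints 25
    }
}
```

With 2500 elements, a batch size of 1000 yields three units of up to 1000 elements each, while a batch size of 100 yields 25 units of 100 elements, so less data is buffered at any one time.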
By default, OrientDB assigns its own IDs to imported vertices. If you want to preserve the original vertex IDs from Neo4j, use the storeVertexIds option.
Import with the original vertex IDs through the Console:
IMPORT DATABASE /tmp/out.graphml storeVertexIds=true
Import with the original vertex IDs through the Java API:
new OGraphMLReader(new OrientGraph("plocal:/temp/bettergraph")).setStoreVertexIds(true).inputGraph("/temp/neo4j.graphml");