Optimising Graph Data with a Multi-Model Engine

“Once we began using OrientDB and working with graphs, our algorithms were 2.5 times faster!”
– Andrey Kozak, CEO, New.sc

What are Graph Databases?

Graph databases are NoSQL databases that use a graph data model composed of Vertices and Edges.

A Vertex is an entity such as a person, place, object, or relevant piece of data, and an Edge represents the relationship between two Vertices.
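
As a rough illustration of this model, here is a minimal Python sketch of vertices and edges. The `Vertex`, `Edge` and `connect` names are hypothetical and chosen only for this example; they are not OrientDB's API.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """An entity such as a person or place, with free-form properties."""
    name: str
    properties: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list)


@dataclass
class Edge:
    """A labelled, directed relationship between two vertices."""
    label: str
    source: Vertex
    target: Vertex


def connect(source, label, target):
    """Create an edge and register it on the source vertex (hypothetical helper)."""
    edge = Edge(label, source, target)
    source.out_edges.append(edge)
    return edge


alice = Vertex("Alice", {"kind": "person"})
rome = Vertex("Rome", {"kind": "place"})
connect(alice, "LivesIn", rome)

# Follow Alice's outgoing edges to the vertices they point at.
destinations = [(e.label, e.target.name) for e in alice.out_edges]
print(destinations)  # [('LivesIn', 'Rome')]
```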

Much as we naturally form links between related pieces of data, graph databases make relationships first-class citizens.

 

Standards


TinkerPop


Apache TinkerPop, whose Gremlin language is the open source standard for graph databases, provides a shared set of basic interfaces that abstract the concepts of Graph, Vertex, Edge and Property. Both the OrientDB Community and Enterprise editions are compatible with TinkerPop. As official partners, the TinkerPop and OrientDB teams worked together to build the OrientDB implementation of Blueprints.

SQL/NoSQL


OrientDB focuses on standards and is a NoSQL Multi-model database that supports SQL. Why? Not only is SQL more readable and concise than most map reduce scripts, but you also shouldn’t have to learn a new query language in order to utilise graphs. With an SQL-based query language extended to support trees and graphs, OrientDB makes the move familiar for those coming to the NoSQL world for the first time.

JDBC


The custom JDBC driver for OrientDB enables connection to a remote server using the standard and consolidated way of interacting with databases in the Java world. Simply add a dependency inside your project and you are ready to connect. Our JDBC driver is compatible with most tools that support the JDBC standard. Take a look at our Integration page for a complete list of tools and drivers.

 

No need for Relational database JOINs.

Advantages of Native Graph Databases

Unlike relational databases, a graph database doesn’t make use of foreign keys and JOIN operations. Instead, all relationships are natively stored within vertices (as documents in OrientDB), resulting in deep traversal capabilities, increased flexibility and agility.
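
To make the contrast concrete, the following Python sketch compares resolving a relationship through a foreign-key table with following direct links stored on each record. It is a toy in-memory illustration with made-up data and hypothetical names, not a description of OrientDB's storage engine.

```python
# Relational style: relationships live in a separate table and are
# resolved by scanning for matching foreign keys (the work a JOIN does).
people = {1: "Ann", 2: "Bob", 3: "Cid"}
friendships = [(1, 2), (2, 3)]  # (person_id, friend_id) rows


def friends_via_join(person_id):
    """Look up friends by matching keys across two tables."""
    return [people[f] for p, f in friendships if p == person_id]


# Graph style: each record holds direct references to its neighbours.
class Person:
    def __init__(self, name):
        self.name = name
        self.friends = []  # direct links to other Person objects


ann, bob, cid = Person("Ann"), Person("Bob"), Person("Cid")
ann.friends.append(bob)
bob.friends.append(cid)


def friends_of_friends(person):
    """Two-hop traversal that only follows stored links."""
    return [fof.name for f in person.friends for fof in f.friends]


print(friends_via_join(1))      # ['Bob']
print(friends_of_friends(ann))  # ['Cid']
```

In the second form each additional hop is one more link to follow, with no key matching against a relationship table.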

OrientDB connects Documents by using fast, direct links from the graph database world.

Native graph databases are equipped to handle rapidly scaling data.

Modern day applications such as recommendation engines, social media, fraud detection, forensic analysis and medical research all rely on graphs to process highly connected data.

Native graph databases that apply index-free adjacency show reduced latency on CRUD operations.

 

Where Graph Databases Fall Short

One of the main issues with graph databases is that most use their own custom query language, which makes query optimisation more difficult.

Though their use is growing, graph databases are generally used to store highly connected data and are rarely utilised as the primary database.

This increases reliance on polyglot persistence, that is, the use of multiple systems to handle different data types.

Graph databases are optimised for relationships, but are generally ill-equipped to handle data aggregation such as sum operations, or complex data such as collections, embedded objects, currencies or dates.

 

Applying Multiple Data Models to One System

True Multi-model databases do more than incorporate several data models within their core engine: they also facilitate integration with other database systems, help standardise differences between query languages and remove restrictions imposed by data model standards.

The overall strategy of Multi-model databases is to act as a drop-in replacement for a relational, graph or document database. However, they can also work alongside existing databases and synchronise data first. Real world scenarios rarely present opportunities to simply substitute one database for another; these changes occur gradually. Multi-model databases acknowledge these challenges and help make that process easier.

This process enables inefficient systems to be gradually removed without compromising the data integrity of production environments. Multi-model databases such as OrientDB allow relational and other graph data to be either synchronised or migrated into OrientDB. Although OrientDB supports embedding documents, the ability to connect them removes duplicate information. Systems such as OrientDB exploit the advantages of graph databases, but add the transaction support, query optimisation, SQL familiarity, and data integrity needed to run stable, secure environments and process massive data sets.

 

Polyglot persistence makes use of multiple systems, impacting performance and costs.

Graph databases vs Polyglot Persistence

Though graph databases allow large amounts of highly connected data to be retrieved quickly, most production environments are made up of multiple systems.

These systems lack common standards, making data synchronisation costly and inefficient.

This causes most popular graph databases (such as Neo4j*) to be utilised as data stores for large data sets and adds another layer of complexity. Production environments generally utilise graph databases simply to resolve complex relationships, with remaining data still residing on other databases.

OrientDB does away with multiple systems.

Therefore, while a graph database might store recommendations for an application, financial data is still stored in a relational database and product data is typically stored in a document database. Multi-model databases, on the other hand, allow all data to be stored in a single system, not only optimising performance but also reducing licensing and operational costs.

*All trademarks are the property of their respective owners.

 

An independent benchmark shows that OrientDB is 10X faster than Neo4j* on graph operations across all workloads.

Multi-model Databases for Optimised Graph Performance

By exploiting multiple data models and facilitating the integration of multiple systems, OrientDB optimises graph data and enables applications to harness graph database speeds with transactional data for modern day use cases.

The Spatial Module combines spatial awareness and graph data.

Though OrientDB can be used as a single system, it is also equipped to simplify integration with multiple systems. This makes transitioning to modern day applications easier, without impacting production environments.

Harnessing graph data relationships with Document metadata for better RAM use and improved caching, true Multi-model databases also eliminate restrictions imposed by data models and database vendors. Furthermore, OrientDB’s Reactive Model optimises resources by pushing query results when changes occur instead of regularly polling the database.

An independent benchmark by Tokyo Institute of Technology* and IBM Research* shows that OrientDB is ten times faster than Neo4j* on graph operations across all workloads.
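
The difference between pushing results and polling for them can be sketched in a few lines of Python. This is a generic subscriber pattern over a toy in-memory store; the `Collection` class and its methods are hypothetical and do not represent OrientDB's client API.

```python
class Collection:
    """A toy in-memory record store that notifies subscribers on insert."""

    def __init__(self):
        self.records = []
        self._subscribers = []

    def subscribe(self, predicate, callback):
        """Register a standing query: callback fires for matching inserts."""
        self._subscribers.append((predicate, callback))

    def insert(self, record):
        self.records.append(record)
        # Push the change to interested subscribers instead of
        # waiting for them to re-run their query on a timer.
        for predicate, callback in self._subscribers:
            if predicate(record):
                callback(record)


orders = Collection()
large_orders = []
orders.subscribe(lambda r: r["total"] > 100, large_orders.append)

orders.insert({"id": 1, "total": 40})
orders.insert({"id": 2, "total": 250})
print(large_orders)  # [{'id': 2, 'total': 250}]
```

The subscriber does no work between changes, which is the resource saving a push model aims for compared with repeated polling.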

The absence of schema restrictions lets users work in schema-full, schema-less, or schema-mixed modes. Users decide which constraints are set and when schemas are enforced.
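
One way to picture the three modes is as different levels of validation applied to the same record. The Python sketch below is a simplified illustration under our own assumptions (every declared property is treated as required, and the `validate` function is hypothetical); it is not how OrientDB implements schema checking.

```python
MODES = ("schema-full", "schema-less", "schema-mixed")


def validate(record, schema, mode):
    """Check a record against a schema under one of three modes.

    schema maps a property name to the Python type it must have.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "schema-less":
        return True  # anything goes
    declared_ok = all(
        name in record and isinstance(record[name], expected)
        for name, expected in schema.items()
    )
    if mode == "schema-mixed":
        return declared_ok  # undeclared extra properties are allowed
    # schema-full: declared properties must match and no extras are allowed
    return declared_ok and set(record) <= set(schema)


schema = {"name": str, "age": int}
record = {"name": "Ann", "age": 30, "nickname": "A"}

results = {mode: validate(record, schema, mode) for mode in MODES}
print(results)
# {'schema-full': False, 'schema-less': True, 'schema-mixed': True}
```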

 

Multi-model Database Use Cases

By combining the power of graphs with the flexibility of documents, Multi-model databases such as OrientDB are suitable for virtually any use case. Some of the uses include:

  • Recommendation engines: Graphs naturally lend themselves to quickly forming links and analysing large amounts of data. Recommendation engines are one of the main examples of how graphs can find relationships between datasets to provide the best matches.
  • Banking and Financial applications: Relational databases are simply not capable of exploring relationships quickly enough to uncover crimes such as fraud rings and identity theft or to protect sensitive banking data. For that, you need a graph database that can navigate connections in real time to discover patterns, match data and stop fraud before it happens.
  • Biological applications: Universities, governments and pharmaceutical companies are turning to graph databases to create innovative applications, study DNA sequencing and help discover new treatments for diseases.
  • Online Retail applications: In today’s world of rapidly expanding data, staying ahead of the curve means providing the fastest and most efficient way to handle online purchases. Multi-model databases exploit the advantages of graphs to quickly find links between data, but also maximise application efficiency by handling financial, product, user session, and search engine data.
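
As a small illustration of the recommendation use case, the Python sketch below walks a user-to-item purchase graph: from a user to the items they bought, on to other users who bought those items, and then to what else those users bought. The data and the `recommend` function are invented for this example.

```python
from collections import Counter

# Toy purchase graph: each user is linked to the items they bought.
purchases = {
    "ann": {"book", "lamp"},
    "bob": {"book", "desk"},
    "cid": {"book", "desk", "chair"},
}


def recommend(user):
    """Rank items bought by users who share at least one purchase."""
    owned = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user or not (owned & items):
            continue  # skip the user themselves and unrelated shoppers
        for item in items - owned:
            scores[item] += 1  # one vote per connected shopper
    return [item for item, _ in scores.most_common()]


print(recommend("ann"))  # ['desk', 'chair']
```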

Take a look at our case studies, use cases and success stories to find out more about specific OrientDB applications and how Multi-model databases are being used for everything from traffic management to forensic analysis and fraud prevention.

If you are currently using a relational database and are interested in converting your data to a graph, be sure to try out OrientDB Teleporter. Download your free 45-day OrientDB Enterprise trial and synchronise all your relational data with OrientDB in a few clicks.

Companies all over the world use OrientDB to power their applications. Some of these include:

Sky, Fox Sports (AU), Comcast, Warner Music Group, KPMG, Accenture