London, February 6th 2014
We are glad to announce the new engine to manage relationships in graph database. According to data loading benchmark it can be up to 15 times faster than current implementation!
The new architecture is based on new data structure SB-Tree and optimized for usage not only for embedded, but for remote storage too. To achieve such kind of optimization we have introduced new data type LINKBAG, it represents set of RIDs, but allows duplication of values, also it does not implement Collection interface. LINKBAG has two binary presentations, in form of modified B-Tree and in form of collection managed as embedded in document, but collection is deserialized only on demand, in case of iteration for example.
Below the comparison on load speed on importing Wikipedia page structure (without page content) which consist of 130 millions of vertexes and more than 1 billion of edges.
To prevent duplication of vertexes we used unique index by page key. Data were taken from http://downloads.dbpedia.org/3.6/en/page_links_en.nt.bz2. Load test was ran on PC with 24 Gb RAM, 7500 RPM HDD, Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz.
Test 1 consists on loading data in Transactional mode, blue line is OrientDB 1.6.x, the red line is OrientDB 1.7-rc1 with the new LINKBAG structure.
On the X axis there is the amount of imported pages, on the Y axis the time which was spent to import these pages.
Orient Technologies LTD