In-Depth Benchmarking of Graph Database Systems with the Linked Data Benchmark Council (LDBC) Social Network Benchmark (SNB)

Florin Rusu; Zhiyi Huang

arxiv: 1907.07405 · v1 · pith:BJCGOENXnew · submitted 2019-07-17 · 💻 cs.DB

Pith reviewed 2026-05-24 20:01 UTC · model grok-4.3

classification 💻 cs.DB

keywords: graph databases, benchmarking, Neo4j, TigerGraph, LDBC SNB, performance evaluation, scalability, social network data

The pith

TigerGraph outperforms Neo4j by two or more orders of magnitude on most LDBC SNB queries and alone scales to the largest datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper delivers the first complete run of the LDBC Social Network Benchmark across all three query categories on two native graph systems. It measures execution time for every one of the 46 queries at four scale factors (SF-1, SF-10, SF-100, and SF-1000), plus bulk loading time and storage footprint. TigerGraph is faster on the majority of queries, with the margin reaching two or more orders of magnitude (100x) on certain interactive complex and business intelligence queries, and it is the only system that processes the full SF-1000 workload. Neo4j is generally faster at bulk loading up to SF-100. Anyone selecting a graph database for social-network-style analytics could treat these numbers as direct evidence of relative capability, subject to the vendor-tuning caveat discussed below.

Core claim

TigerGraph consistently outperforms Neo4j on the majority of the 46 LDBC SNB queries, reaching two or more orders of magnitude on certain interactive complex and business intelligence queries. The gap widens with data size: only TigerGraph completes the entire SF-1000 workload, while Neo4j finishes just 12 of the 25 business intelligence queries in reasonable time. Neo4j remains generally faster at bulk loading up to SF-100. Both platforms were tuned with active vendor participation, and the authors release code, scripts, and configuration files for reproducibility.
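To make the "orders of magnitude" phrasing concrete, here is a minimal Python sketch of how a speedup ratio between two execution times maps onto a base-10 order of magnitude. The timings are hypothetical placeholders chosen for illustration, not measurements from the paper, and the function names are my own.

```python
import math


def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """Return how many times faster the fast run is than the slow run."""
    if slow_seconds <= 0 or fast_seconds <= 0:
        raise ValueError("execution times must be positive")
    return slow_seconds / fast_seconds


def orders_of_magnitude(ratio: float) -> float:
    """Express a speedup ratio as a base-10 order of magnitude."""
    return math.log10(ratio)


# Hypothetical timings in seconds (assumption, NOT taken from the paper).
neo4j_seconds = 250.0
tigergraph_seconds = 2.5

ratio = speedup(neo4j_seconds, tigergraph_seconds)
magnitude = orders_of_magnitude(ratio)
print(f"speedup: {ratio:.0f}x, orders of magnitude: {magnitude:.1f}")
```

A 100x ratio corresponds to exactly two orders of magnitude, which is the threshold the abstract quotes for certain interactive complex and business intelligence queries.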

What carries the argument

Full LDBC SNB benchmark implementation (interactive short, interactive complex, and business intelligence query sets) executed on Neo4j and TigerGraph across scale factors SF-1 to SF-1000.
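The size of that experiment grid can be sketched as follows. This is an illustrative enumeration, not the authors' harness: the 46-query total and the 25 business intelligence queries come from the abstract, while the split of the remaining 21 into 7 interactive short and 14 interactive complex queries is taken from the LDBC SNB specification as I understand it and should be treated as an assumption here. The sketch also ignores the hardware dimension (on premise and cloud) mentioned in the abstract.

```python
from itertools import product

# Query counts per workload. BI = 25 and the total of 46 are stated in the
# abstract; the IS = 7 / IC = 14 split is an assumption based on the LDBC
# SNB specification, not on this page.
WORKLOADS = {"IS": 7, "IC": 14, "BI": 25}
SCALE_FACTORS = [1, 10, 100, 1000]
SYSTEMS = ["Neo4j", "TigerGraph"]


def run_matrix():
    """Yield one (system, scale factor, query id) tuple per benchmark run."""
    for system, scale_factor in product(SYSTEMS, SCALE_FACTORS):
        for workload, count in WORKLOADS.items():
            for number in range(1, count + 1):
                yield system, f"SF-{scale_factor}", f"{workload}{number}"


runs = list(run_matrix())
print("queries per system and scale factor:", sum(WORKLOADS.values()))
print("total query runs:", len(runs))
```

Under these assumptions the grid has 46 queries per system and scale factor, for 368 query runs in total before repetitions or hardware variants are counted.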

If this is right

TigerGraph can be expected to handle social-network workloads at SF-1000 scale where Neo4j cannot complete all queries.
The relative advantage of TigerGraph increases as dataset size grows from SF-100 to SF-1000.
Neo4j retains an edge in bulk-loading time for datasets up to SF-100.
Public release of the tuned configurations and query implementations enables direct reproduction or extension by other users.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Removing vendor assistance from the tuning process could alter the observed speed ratios and therefore merits a follow-up neutral study.
The same benchmark suite could be applied to additional graph engines to produce a broader ranking beyond the two systems tested here.
Query patterns in the LDBC SNB may appear in domains other than social networks, so the relative rankings could inform choices in fraud detection or recommendation workloads as well.

Load-bearing premise

Active involvement of the vendors in tuning their platforms produces representative and unbiased performance numbers for each system.

What would settle it

An independent execution of the same benchmark using only publicly available default configurations or neutral tuning that yields substantially smaller performance gaps or reverses the ranking.

Figures

Figures reproduced from arXiv: 1907.07405 by Florin Rusu, Zhiyi Huang.

**Figure 2.** Loading data size split into actual data size and indexes size. Raw corresponds to the size of …

**Figure 3.** Loading time split into ingestion time and indexing time. The numbers inside the bars represent …

**Figure 4.** Execution time in milliseconds (msec) for interactive short (IS) queries over scale factor 1 (a), 10 …

**Figure 5.** Execution time (sec) for interactive complex (IC) queries over scale factor 1 (a), 10 (b), 100 (c), …

**Figure 6.** Execution time (sec) for business intelligence (BI) queries over scale factor 1 (a), 10 (b), 100 (c), …

Original abstract

In this study, we present the first results of a complete implementation of the LDBC SNB benchmark (interactive short, interactive complex, and business intelligence) in two native graph database systems: Neo4j and TigerGraph. In addition to thoroughly evaluating the performance of all of the 46 queries in the benchmark on four scale factors (SF-1, SF-10, SF-100, and SF-1000) and three computing architectures (on premise and in the cloud), we also measure the bulk loading time and storage size. Our results show that TigerGraph is consistently outperforming Neo4j on the majority of the queries, by two or more orders of magnitude (100X factor) on certain interactive complex and business intelligence queries. The gap increases with the size of the data since only TigerGraph is able to scale to SF-1000; Neo4j finishes only 12 of the 25 business intelligence queries in reasonable time. Nonetheless, Neo4j is generally faster at bulk loading graph data up to SF-100. A key to our study is the active involvement of the vendors in the tuning of their platforms. In order to encourage reproducibility, we make all the code, scripts, and configuration parameters publicly available online.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

First complete LDBC SNB run on both Neo4j and TigerGraph with public code, but vendor tuning leaves the large performance gaps hard to attribute cleanly to the systems.

The paper's real contribution is running every one of the 46 LDBC SNB queries (short, complex, and BI) on Neo4j and TigerGraph, across four scale factors, with all scripts and configs released. That has not been done before, and the public artifacts make the numbers checkable. They also report load times and storage sizes, which adds practical detail for anyone sizing a deployment.

The headline finding is that TigerGraph finishes most queries faster, sometimes by two orders of magnitude, and reaches SF-1000 while Neo4j times out on many BI queries; Neo4j is quicker at loading up to SF-100. Those are concrete data points worth having on record.

The authors credit the vendors for tuning help, which they call central to the study, and they publish the resulting configurations. That is more transparent than many benchmark papers. The main limitation is that the tuning process itself is not documented with time budgets, iteration counts, or any third-party check that the effort was comparable. When one system shows 100x gains on complex queries, it matters whether the difference comes from the engine or from unequal optimization work. The paper presents the numbers as system comparisons, so this gap in the method is load-bearing.

The work is aimed at practitioners choosing a graph store and at groups that run or extend the LDBC benchmark. It is not trying to advance theory or new algorithms. The empirical execution and the released code are solid enough that a referee should see it; the tuning discussion would need tightening, but that is fixable. I would send it out for review rather than desk-reject.

Referee Report

1 major / 1 minor

Summary. The manuscript reports the first complete implementation of the LDBC SNB benchmark (interactive short, interactive complex, and business intelligence queries) on Neo4j and TigerGraph. It measures performance, bulk loading time, and storage across SF-1 to SF-1000 on on-premise and cloud hardware for all 46 queries. The central empirical claims are that TigerGraph outperforms Neo4j on the majority of queries (by two or more orders of magnitude on certain complex and BI queries), scales to SF-1000 while Neo4j completes only 12 of 25 BI queries, and that Neo4j is generally faster at loading up to SF-100. Vendor involvement in tuning is described as a key methodological feature, with all code, scripts, and configurations released publicly.

Significance. If the measured gaps can be attributed to intrinsic engine differences, the work supplies a useful, large-scale empirical comparison on a standard community benchmark. The public release of configurations and scripts is a clear strength that enables reproducibility and independent verification.

major comments (1)

[Abstract] The headline claims (TigerGraph outperforming Neo4j by up to 100X and scaling to SF-1000 while Neo4j does not) are presented as direct comparisons between the two systems. However, these results rest on configurations obtained via 'active involvement of the vendors in the tuning of their platforms,' with no described protocol, time budget, or verification mechanism ensuring equivalent tuning effort. This is load-bearing for the attribution of performance differences to the engines rather than unequal optimization investment.

minor comments (1)

The manuscript would benefit from an explicit summary table (or figure) listing, for each scale factor, the number of queries completed by each system within the timeout; this would make the scaling claims immediately verifiable without scanning individual result tables.
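A summary of the kind requested could be derived mechanically from per-query results. The sketch below is one hypothetical way to do it in Python: the record layout, the `completion_summary` helper, the sample rows, and the one-hour timeout are all assumptions made for illustration, since this page does not describe the format of the released result files or the timeout the authors applied.

```python
from collections import defaultdict


def completion_summary(results, timeout_sec):
    """Count queries finishing within the timeout per (scale factor, system).

    `results` is an iterable of (system, scale_factor, query_id, seconds)
    tuples, where `seconds` is None when the query never finished.
    Returns {(scale_factor, system): (completed, total)}.
    """
    counts = defaultdict(lambda: [0, 0])  # [completed, total]
    for system, scale_factor, _query_id, seconds in results:
        entry = counts[(scale_factor, system)]
        entry[1] += 1
        if seconds is not None and seconds <= timeout_sec:
            entry[0] += 1
    return {key: tuple(value) for key, value in counts.items()}


# Hypothetical records for illustration only (NOT data from the paper).
sample = [
    ("Neo4j", "SF-1000", "BI1", 120.0),
    ("Neo4j", "SF-1000", "BI2", None),
    ("TigerGraph", "SF-1000", "BI1", 3.0),
    ("TigerGraph", "SF-1000", "BI2", 9.5),
]

summary = completion_summary(sample, timeout_sec=3600)
for (scale_factor, system), (done, total) in sorted(summary.items()):
    print(f"{scale_factor:8} {system:11} {done}/{total}")
```

Keying the counts by scale factor first keeps the rows for each data size adjacent, which is the comparison the scaling claim depends on.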

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful review and for highlighting the strengths in significance and reproducibility. We address the single major comment below and are willing to revise the manuscript accordingly.

Referee: [Abstract] The headline claims (TigerGraph outperforming Neo4j by up to 100X and scaling to SF-1000 while Neo4j does not) are presented as direct comparisons between the two systems. However, these results rest on configurations obtained via 'active involvement of the vendors in the tuning of their platforms,' with no described protocol, time budget, or verification mechanism ensuring equivalent tuning effort. This is load-bearing for the attribution of performance differences to the engines rather than unequal optimization investment.

Authors: We agree that the manuscript does not describe a formal protocol, time budget, or verification mechanism for the vendor tuning process, and that this detail would help readers assess whether performance gaps reflect engine differences. The paper already stresses public release of all configurations, scripts, and code to support reproducibility and independent verification. In revision we will add a dedicated methods subsection describing the tuning interactions, any time or resource constraints applied, and steps taken to ensure both vendors received comparable opportunity. The abstract claims will be qualified to note that results reflect expert-tuned configurations obtained via vendor involvement.

Revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical benchmark measurements

The paper reports direct runtime, loading, and storage measurements obtained by executing the externally-defined LDBC SNB query workload on Neo4j and TigerGraph after vendor-assisted configuration. No equations, fitted parameters, predictions, or derivations appear anywhere in the text; the central claims are observational comparisons against a fixed external benchmark specification. The vendor-tuning detail is a methodological choice whose fairness can be debated, but it does not create any self-referential reduction of the reported numbers to the paper's own inputs. Consequently the derivation chain is empty and the circularity score is 0.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the representativeness of the LDBC SNB workload and the fairness of vendor-tuned configurations; no free parameters, new entities, or mathematical derivations are introduced.

axioms (2)

Domain assumption: The LDBC SNB benchmark queries and data generator accurately model real-world social-network workloads.
The study treats the benchmark as a faithful proxy without additional validation reported in the abstract.
Domain assumption: Vendor-tuned configurations represent the best achievable and comparable performance for each system.
Active vendor involvement is presented as a strength, yet the abstract does not quantify controls against benchmark-specific over-optimization.
