Property Graph Exchange Format

Hirokazu Chiba; Ryota Yamanaka; Shota Matsumoto

arxiv: 1907.03936 · v1 · pith:FMARB6DBnew · submitted 2019-07-09 · 💻 cs.DB

Pith reviewed 2026-05-25 00:16 UTC · model grok-4.3

classification 💻 cs.DB

keywords property graph · serialization format · interoperability · graph database · data exchange · data model

The pith

A redefined property graph model with associated serialization formats enables interoperable data exchange across different graph database implementations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper redefines the property graph model to incorporate variations found in existing database systems. It then introduces serialization formats based on this unified model. These formats aim to be general enough to handle differences while remaining intuitive for users creating and maintaining graph data. Practical converters were built to translate the new format into existing ones, allowing data to be loaded into various graph databases. The work positions the model as a foundation for interoperable platforms that support creating, exchanging, and using property graph data.

Core claim

The authors redefine the property graph model by incorporating differences from existing models and propose interoperable serialization formats. The model is independent of specific implementations and provides a basis for interoperable management of property graph data. The proposed serialization is both general and intuitive, which makes it useful for creating and maintaining graph data. Converters from the serialization into existing formats were implemented, so that the converted data can be loaded into various graph databases.

What carries the argument

The redefined property graph model that captures differences across implementations, paired with the proposed serialization formats that serve as the exchange mechanism.

If this is right

Graph data created in the neutral format can be loaded into multiple existing databases without custom per-system adjustments.
Maintenance tasks such as updating or sharing graph datasets become feasible across different vendor implementations.
An interoperable platform emerges that supports creating, exchanging, and utilizing property graph data from a common base.
Database vendors can adopt the format to improve compatibility without redesigning their core models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Standardized migration tools could be developed around the serialization to automate transfers between graph stores.
Application developers might build cross-database query layers that rely on this exchange format as an intermediate representation.
The model could be extended in future work to include features like temporal properties or hyperedges if they appear in new implementations.

Load-bearing premise

Differences across existing property graph implementations can be captured in a single redefined model without losing essential features or requiring major changes to current database behavior.

What would settle it

Loading data serialized in the proposed format via converters into multiple distinct graph databases and verifying that all vertices, edges, properties, and labels are preserved without alteration or loss.
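
The check described above can be sketched as a round-trip comparison. This is a minimal illustration, not the authors' procedure: the dictionary layout (`nodes`, `edges`, `labels`, `props`, `directed`) and the function names are assumptions made for this example, and the "reloaded" graph is simulated rather than read back from a real database.

```python
from collections import Counter
import copy

def _freeze_props(props):
    # A property key may carry several values; compare them as a sorted tuple.
    # repr() keeps 1, 1.0 and "1" distinct, so type changes count as differences.
    return tuple(sorted((key, tuple(sorted(map(repr, values))))
                        for key, values in props.items()))

def canonical(graph):
    """Reduce a graph to order-independent multisets of nodes and edges."""
    nodes = Counter(
        (repr(n["id"]), tuple(sorted(n["labels"])), _freeze_props(n["props"]))
        for n in graph["nodes"]
    )
    edges = Counter()
    for e in graph["edges"]:
        ends = (repr(e["src"]), repr(e["dst"]))
        if not e["directed"]:
            ends = tuple(sorted(ends))  # endpoint order is irrelevant when undirected
        key = (ends, e["directed"], tuple(sorted(e["labels"])), _freeze_props(e["props"]))
        edges[key] += 1  # a Counter keeps parallel edges apart
    return nodes, edges

def round_trip_diff(original, reloaded):
    """Return what was lost and what appeared after export and reload."""
    n0, e0 = canonical(original)
    n1, e1 = canonical(reloaded)
    return {
        "lost_nodes": n0 - n1, "extra_nodes": n1 - n0,
        "lost_edges": e0 - e1, "extra_edges": e1 - e0,
    }

original = {
    "nodes": [
        {"id": 1, "labels": ["person"], "props": {"name": ["Alice"]}},
        {"id": 2, "labels": ["person", "student"], "props": {"name": ["Bob"]}},
    ],
    "edges": [
        {"src": 1, "dst": 2, "directed": False, "labels": ["knows"], "props": {}},
        {"src": 1, "dst": 2, "directed": True, "labels": ["likes"], "props": {"since": [2015]}},
    ],
}

# Simulated reload from a hypothetical target that keeps one label per node
# and returns the undirected edge with its endpoints swapped.
reloaded = copy.deepcopy(original)
reloaded["nodes"][1]["labels"] = ["person"]
reloaded["edges"][0]["src"], reloaded["edges"][0]["dst"] = 2, 1

diff = round_trip_diff(original, reloaded)
```

Here the dropped label surfaces as one lost node and one extra node, while the swapped endpoints of the undirected edge are not reported as a difference. Comparing multisets rather than sets is a deliberate choice so that parallel edges with identical labels and properties are still counted.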

Original abstract

Recently, a variety of database implementations adopting the property graph model have emerged. However, interoperable management of graph data on these implementations is challenging due to the differences in data models and formats. Here, we redefine the property graph model incorporating the differences in the existing models and propose interoperable serialization formats for property graphs. The model is independent of specific implementations and provides a basis of interoperable management of property graph data. The proposed serialization is not only general but also intuitive, thus it is useful for creating and maintaining graph data. To demonstrate the practical use of our model and serialization, we implemented converters from our serialization into existing formats, which can then be loaded into various graph databases. This work provides a basis of an interoperable platform for creating, exchanging, and utilizing property graph data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper redefines the property graph model to bridge implementation differences and adds serialization formats plus converters, but the unification details look thin without more evidence on trade-offs.

The paper's core move is to redefine the property graph model so it explicitly folds in variations seen across existing graph database implementations, then pair that with new serialization formats and converters that turn the new format into ones the target systems can load. The authors report building those converters as a practical check.

This targets a real pain point: moving graph data between different engines without losing structure or requiring big rewrites on either side. The implementation step gives it more weight than a pure proposal would have. It is the kind of applied work that can matter for people who actually maintain or migrate graph datasets. The serialization is pitched as intuitive, which could help adoption if it holds up.

The citation pattern looks normal for a standards-style paper; they reference prior models and focus on filling the interoperability gap those left open. No obvious circularity or invented entities that lack grounding.

The main soft spot is exactly the one the stress-test note flags. The abstract claims the model unifies differences while staying independent of any one implementation and avoiding major changes to current database behavior, yet it gives no concrete rules or examples for how conflicts on property typing, multi-edges, labeling, or directionality get resolved. If the full model makes a choice that one or more target systems do not support natively, the converters will either drop information or the “no major changes” claim will not hold. Without seeing the model definition and the converter mappings in detail, it is difficult to judge whether the unification is lossless or just a lowest-common-denominator compromise.

This paper is for the graph-database practitioner or standards group that needs a workable exchange format today. A reader already wrestling with data movement across engines could extract usable ideas from the formats and the converter approach even if the model needs tightening.

It deserves a serious referee because the problem is concrete, the authors have shipped working converters, and the topic sits inside an active subfield. I would send it to peer review with the expectation that reviewers will press on the model rules and any information loss in the converters.

Referee Report

3 major / 1 minor

Summary. The paper redefines the property graph model to incorporate differences across existing implementations, proposes interoperable serialization formats, and demonstrates the approach via implemented converters that allow loading into various graph databases. The model is presented as implementation-independent and useful for creating and maintaining graph data.

Significance. If the unification succeeds without feature loss or forced DB changes, the work could establish a practical basis for property graph interoperability. The implementation of converters provides concrete evidence of usability and is a strength; however, the absence of formal definitions, comparison tables, or validation metrics limits the ability to assess whether the central claim holds.

major comments (3)

[Abstract] The claim that the model 'incorporates the differences in the existing models' and that converters demonstrate practical use is stated without any equations, formal specification, validation data, or description of how unification across differences (property typing, multi-edges, labeling, directionality) was achieved.
[§§3–4] The model rules appear to require post-hoc choices on observed differences; it is not shown that these choices are native to all target systems or that the converters avoid information loss while satisfying the claim of 'no major changes to current database behavior'.
No section or table presents empirical validation of the converters (e.g., test cases covering the differences, data-preservation metrics, or compatibility results across the target databases).

minor comments (1)

The serialization format is described as 'intuitive' but lacks an explicit grammar, BNF, or complete example set that would allow independent re-implementation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for clearer support of the unification claims and empirical evidence. We address each major comment below, indicating where revisions will be made to strengthen the manuscript.

Referee: [Abstract] The claim that the model 'incorporates the differences in the existing models' and that converters demonstrate practical use is stated without any equations, formal specification, validation data, or description of how unification across differences (property typing, multi-edges, labeling, directionality) was achieved.

Authors: The abstract is intentionally concise as a summary. The incorporation of differences (e.g., support for multi-edges, optional directionality, and varied labeling/property typing) is achieved in the model definition by adopting a superset of observed features from implementations such as Neo4j and others, as detailed in Sections 3 and 4. The converters in Section 5 demonstrate practical use by enabling import without requiring database modifications. We will revise the abstract to briefly reference these sections and the specific differences addressed.
revision: partial

Referee: [§§3–4] The model rules appear to require post-hoc choices on observed differences; it is not shown that these choices are native to all target systems or that the converters avoid information loss while satisfying the claim of 'no major changes to current database behavior'.

Authors: Sections 3 and 4 derive the model rules from a comparative analysis of existing property graph systems to capture commonalities while accommodating variations. The choices (such as allowing multiple edge labels or map-based properties) are mappable to native representations in target systems like Neo4j and JanusGraph. Converters are implemented to perform these mappings without information loss for supported features and without altering the target databases' core behavior or requiring schema changes. We will add clarifying text in Section 4 on the per-difference mapping strategy to make this explicit.
revision: partial

Referee: No section or table presents empirical validation of the converters (e.g., test cases covering the differences, data-preservation metrics, or compatibility results across the target databases).

Authors: Section 5 describes the implemented converters and their use with multiple graph databases to show practicality. We agree that explicit test cases and metrics would strengthen the evidence of no information loss. We will add a table or subsection summarizing the test cases (covering multi-edges, labeling, property types, and directionality), the target databases evaluated, and qualitative results on data preservation and compatibility.
revision: yes
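
A test suite of the kind promised in the last response is only informative if its sample graphs actually contain the variations in question. The following Python sketch reports which variations a sample graph exercises. It is an illustration under stated assumptions: the feature names, the dictionary layout, and the definition of a multi-edge (more than one edge over the same endpoints, respecting direction for directed edges) are all chosen for this example and do not come from the paper.

```python
from collections import Counter

def features_exercised(nodes, edges):
    """Report which model variations a sample graph actually contains."""
    found = set()
    for element in list(nodes) + list(edges):
        # Labeling: more than one label on a single node or edge.
        if len(element["labels"]) > 1:
            found.add("multiple_labels")
        for values in element["props"].values():
            # Property typing: record multi-valued keys and each value type seen.
            if len(values) > 1:
                found.add("multi_valued_property")
            for value in values:
                found.add("property_type:" + type(value).__name__)
    pair_counts = Counter()
    for e in edges:
        # Directionality: note which kinds of edge occur.
        found.add("directed_edge" if e["directed"] else "undirected_edge")
        # Directed edges are keyed by ordered endpoints, undirected by unordered.
        key = (e["src"], e["dst"]) if e["directed"] else frozenset((e["src"], e["dst"]))
        pair_counts[key] += 1
    if any(count > 1 for count in pair_counts.values()):
        found.add("multi_edge")
    return found

# Hypothetical checklist of variations a converter test suite should cover.
REQUIRED = {"multiple_labels", "multi_valued_property",
            "directed_edge", "undirected_edge", "multi_edge"}

nodes = [
    {"id": "a", "labels": ["person"], "props": {"name": ["Alice"], "age": [15]}},
    {"id": "b", "labels": ["person", "student"], "props": {"nickname": ["Bob", "Bobby"]}},
]
edges = [
    {"src": "a", "dst": "b", "directed": True, "labels": ["likes"], "props": {"since": [2015]}},
    {"src": "a", "dst": "b", "directed": True, "labels": ["likes"], "props": {"engaged": [False]}},
]

found = features_exercised(nodes, edges)
missing = REQUIRED - found  # variations this sample graph does not cover
```

For this sample, `missing` contains only `undirected_edge`, which signals that another test graph is needed before directionality can be considered covered.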

Circularity Check

0 steps flagged

No circularity: proposal synthesizes observed models without self-referential reduction

The paper redefines a property graph model by incorporating observed differences across existing implementations and proposes serialization formats, with converters implemented to demonstrate loading into target databases. This is an observational synthesis and standardization effort rather than a derivation chain containing equations, fitted parameters called predictions, or load-bearing self-citations. The central claim rests on the converters providing external validation against real systems, not on any step that reduces by construction to the paper's own inputs. No self-definitional, uniqueness-imported, or ansatz-smuggled patterns appear.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim rests on the domain assumption that a single model can unify existing variations; the main addition is the new serialization format itself, which has no independent evidence outside the proposal.

axioms (1)

domain assumption: A single property graph model can incorporate the differences present in existing database implementations.
Invoked when the paper states the model redefinition provides a basis for interoperability.

invented entities (1)

Interoperable serialization format for property graphs · no independent evidence
purpose: To enable creation, exchange, and loading of graph data across different database products
Introduced as the core practical output of the work

pith-pipeline@v0.9.0 · 5653 in / 1237 out tokens · 26486 ms · 2026-05-25T00:16:35.757841+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Definition 1 (Property Graph Model) ... PG = ⟨N, Ed, Eu, S, V, P, e, ln, le, pn, pe⟩ ... multiple labels ... multiple values ... directed or undirected

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: EBNF of PG format; converters to Neo4j/PGX/Neptune
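
The Definition 1 passage quoted above names three features of the model: multiple labels, multiple values per property, and edges that are directed or undirected. A minimal Python sketch of a container with those three features follows. The class, field, and method names are hypothetical, and the sketch does not reproduce the paper's formal tuple or its serialization syntax.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Node:
    id: str
    labels: set[str] = field(default_factory=set)          # zero or more labels
    props: dict[str, list] = field(default_factory=dict)   # key -> list of values

@dataclass
class Edge:
    src: str
    dst: str
    directed: bool = True                                   # False means undirected
    labels: set[str] = field(default_factory=set)
    props: dict[str, list] = field(default_factory=dict)

@dataclass
class PropertyGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)         # a list, so parallel edges are allowed

    def add_node(self, node_id, labels=(), **props):
        # Repeated calls for the same id accumulate labels and property values.
        node = self.nodes.setdefault(node_id, Node(node_id))
        node.labels.update(labels)
        for key, value in props.items():
            node.props.setdefault(key, []).append(value)
        return node

    def add_edge(self, src, dst, directed=True, labels=(), **props):
        # Reject edges whose endpoints have not been declared as nodes.
        for endpoint in (src, dst):
            if endpoint not in self.nodes:
                raise KeyError(f"unknown node: {endpoint!r}")
        edge = Edge(src, dst, directed, set(labels), {k: [v] for k, v in props.items()})
        self.edges.append(edge)
        return edge

g = PropertyGraph()
g.add_node("101", labels=["person"], name="Alice")
g.add_node("102", labels=["person", "student"], name="Bob")
g.add_node("102", name="Robert")                            # second value for the same key
g.add_edge("101", "102", directed=False, labels=["knows"])
g.add_edge("101", "102", directed=True, labels=["likes"], since=2015)
```

Storing edges in a list rather than keying them by endpoint pair is what allows two edges between the same nodes to coexist, and keeping a per-edge `directed` flag lets both kinds of edge live in one graph.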

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.