Extracting Basic Graph Patterns from Triple Pattern Fragment Logs

Desmontils Emmanuel; Molli Pascal; Nassopoulos Georges; Serrano-Alvarado Patricia

arxiv: 1906.08574 · v1 · pith:AUFXFXCNnew · submitted 2019-06-20 · 💻 cs.DB

Pith reviewed 2026-05-25 18:54 UTC · model grok-4.3

keywords Triple Pattern Fragments · SPARQL query logs · Basic Graph Patterns · Linked Data · query reconstruction · server logs

The pith

LIFT reconstructs Basic Graph Patterns from logs of single-triple requests to TPF servers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an algorithm called LIFT that groups individual triple pattern evaluations recorded in TPF server logs back into the original multi-triple Basic Graph Patterns. TPF servers only ever see and answer one triple at a time, so providers currently have no direct view of the structure of the queries users submit. LIFT relies on the sequence and timing of those single-triple requests to perform the grouping. Experiments reported in the paper indicate that the extracted patterns match the originals with good precision and recall while producing limited noise. If the method holds, TPF hosts would gain the ability to inspect query shapes without altering the TPF protocol itself.

Core claim

LIFT extracts BGPs of executed queries from TPF server logs with good precision and good recall while generating limited noise.

What carries the argument

The LIFT algorithm, which processes the ordered sequence and timing of single-triple log entries to group them into the original multi-triple Basic Graph Patterns.
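As a rough illustration of what grouping by order and timing can look like, the sketch below splits each client's requests into groups whenever the gap between two consecutive requests exceeds a threshold. This is a plain time-window heuristic swapped in for illustration, not the LIFT algorithm itself. The `LogEntry` fields, the `max_gap` parameter, and the function name are assumptions that do not come from the paper.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple

TriplePattern = Tuple[str, str, str]  # (subject, predicate, object)


class LogEntry(NamedTuple):
    """Hypothetical simplified TPF log record (field names are assumptions)."""
    timestamp: float        # seconds since some reference point
    client: str             # e.g. an IP address
    pattern: TriplePattern  # the single triple pattern that was requested


def group_by_time_gap(entries: List[LogEntry],
                      max_gap: float = 1.0) -> List[List[TriplePattern]]:
    """Group each client's requests; a gap larger than max_gap starts a new group."""
    # Bucket the entries per client, in chronological order.
    by_client: Dict[str, List[LogEntry]] = defaultdict(list)
    for entry in sorted(entries, key=lambda e: e.timestamp):
        by_client[entry.client].append(entry)

    groups: List[List[TriplePattern]] = []
    for client_entries in by_client.values():
        current = [client_entries[0].pattern]
        for prev, cur in zip(client_entries, client_entries[1:]):
            if cur.timestamp - prev.timestamp > max_gap:
                # Long pause: close the current group and open a new one.
                groups.append(current)
                current = []
            current.append(cur.pattern)
        groups.append(current)
    return groups
```

A heuristic this simple would merge two queries that the same client issues back to back, which is one reason the premise about ordering and timing deserves scrutiny.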

If this is right

TPF data providers obtain visibility into the structure of the queries their servers actually execute.
Query-log analysis becomes feasible for TPF without requiring clients to send full SPARQL queries.
Noise introduced by the reconstruction process remains low enough that downstream analyses of query shapes remain reliable.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same log-based grouping technique could be tested on other fragment interfaces that break queries into smaller requests.
If reconstruction accuracy varies with query shape, future work could classify which BGP structures are easiest or hardest to recover.
Reconstructed BGPs could feed into workload-aware caching or index decisions at the TPF server without exposing raw client queries.

Load-bearing premise

The ordering and timing information present in TPF server logs is sufficient for an algorithm to correctly group and reconstruct the original multi-triple Basic Graph Patterns.

What would settle it

Run a controlled test in which known multi-triple BGPs are submitted to a TPF server, collect the resulting log, apply LIFT, and measure how often the output BGPs exactly match the submitted ones.
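The scoring step of such a test can be sketched as follows. The sketch assumes that a BGP is represented as a set of triple patterns and that precision and recall are computed over those sets; the paper's own metric definitions are not given on this page, so the function names and the representation are illustrative only.

```python
from typing import FrozenSet, Iterable, Set, Tuple

TriplePattern = Tuple[str, str, str]
BGP = FrozenSet[TriplePattern]


def precision_recall(extracted: Iterable[TriplePattern],
                     expected: Iterable[TriplePattern]) -> Tuple[float, float]:
    """Precision and recall of extracted triple patterns against the submitted ones."""
    extracted_set, expected_set = set(extracted), set(expected)
    true_positives = len(extracted_set & expected_set)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(expected_set) if expected_set else 0.0
    return precision, recall


def exact_match_rate(submitted: Iterable[BGP], extracted: Iterable[BGP]) -> float:
    """Fraction of submitted BGPs that appear unchanged among the extracted BGPs."""
    submitted_list = list(submitted)
    extracted_set: Set[BGP] = set(extracted)
    if not submitted_list:
        return 0.0
    hits = sum(1 for bgp in submitted_list if bgp in extracted_set)
    return hits / len(submitted_list)
```

Exact matching is the stricter measure: a reconstruction that recovers every triple pattern but adds one spurious pattern still scores a recall of 1.0 while counting as a miss for `exact_match_rate`.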

Figures

Figures reproduced from arXiv: 1906.08574 by Desmontils Emmanuel, Molli Pascal, Nassopoulos Georges, Serrano-Alvarado Patricia.

**Figure 1.** Concurrent execution of queries Q1 and Q2. TPF clients decompose SPARQL queries into a sequence of triple pattern queries.

**Figure 2.** Examples of simplified TPF logs.

**Figure 3.** TPF log and CTP List produced by Algorithm 2 with

**Figure 4.** CTP List and DTP Graph produced by Algorithm 3 with

**Figure 5.** Connected components of the DTP Graph produced by the execution of

**Figure 6.** Precision and recall by query of the TPF web site.

**Figure 7.** LIFT deductions for Q7 and Q8. Prefix dbpo corresponds to dbpedia-owl.

4.2 Do LIFT results resist concurrency? We implemented a tool to shuffle several TPF logs according to different parameters. Thus, given E(Q1), ..., E(Qn), we are able to produce different significant representations of E(Q1 ∥ ... ∥ Qn). We grouped queries targeting the same dataset into a set of randomly chosen queries as shown i…

**Figure 8.** Precision and recall.

4.3 Evaluation of LIFT with real SPARQL queries. The goal of this experiment is to evaluate to what extent LIFT is able to deduce the BGPs of a large number of real user queries. We analyzed 10 hours of one day (2015-10-30) of the log of the DBpedia SPARQL endpoint from the USEWOD 2016 dataset [5]. From 380,834 HTTP requests containing SPARQL queries, we analyzed 14,259 queries that repres…

**Figure 9.** Recurrent BGPs extracted from the TPF log of USEWOD 2016.
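The caption of Figure 5 refers to the connected components of the DTP Graph. How that graph is built is not described on this page, so the sketch below only shows the generic step of splitting an undirected graph into connected components. The node labels (`tp1`, `tp2`, ...) and the idea that each component would correspond to one candidate BGP are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def connected_components(nodes: Iterable[Hashable],
                         edges: Iterable[Tuple[Hashable, Hashable]]) -> List[Set[Hashable]]:
    """Return the connected components of an undirected graph, in node order."""
    adjacency: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen: Set[Hashable] = set()
    components: List[Set[Hashable]] = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        component: Set[Hashable] = set()
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components
```

Under the stated assumption, a graph with edges `tp1`-`tp2`, `tp2`-`tp3`, and `tp4`-`tp5` would yield two groups of triple patterns, one of size three and one of size two.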

Original abstract

The Triple Pattern Fragment (TPF) approach is de-facto a new way to publish Linked Data at low cost and with high server availability. However, data providers hosting TPF servers are not able to analyze the SPARQL queries they execute because they only receive and evaluate queries with one triple pattern. In this paper, we propose LIFT: an algorithm to extract Basic Graph Patterns (BGPs) of executed queries from TPF server logs. Experiments show that LIFT extracts BGPs with good precision and good recall generating limited noise.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

LIFT gives a concrete algorithm for grouping single-triple TPF requests into BGPs using log order and timing, but the abstract supplies no numbers or test details to show it works.

The paper introduces LIFT, an algorithm that reconstructs Basic Graph Patterns from the single-triple requests recorded in TPF server logs. TPF servers only ever see one triple pattern, so providers lose visibility into the actual multi-triple queries their users issue. LIFT tries to recover those patterns from the ordering and timing already present in the logs. That is the core new piece: a targeted method for this exact setting that had not been described before.

The write-up does a clear job stating the practical problem for data providers who want usage analytics without changing the TPF protocol. The approach stays lightweight and stays within the information the logs already contain.

The main weakness is the evaluation. The abstract states that experiments produced good precision and recall with limited noise, yet it gives no numbers, no dataset description, no baseline, and no account of how concurrent clients or noisy timing were handled. Without those details the central claim cannot be checked. The assumption that order and timing alone will reliably group triples also looks plausible on paper but could fail when requests overlap or when clients issue similar patterns close together.

The paper is aimed at the small group of people running or studying TPF deployments in linked data. A reader who needs a starting point for query-pattern analysis in that narrow setting could use the algorithm description. It deserves peer review because the problem is real and the proposed solution is new in this context, even though the current version needs a proper experimental section to be convincing.

Referee Report

1 major / 0 minor

Summary. The paper proposes LIFT, an algorithm to extract Basic Graph Patterns (BGPs) from Triple Pattern Fragment (TPF) server logs. TPF servers receive and evaluate only single-triple-pattern queries, so full SPARQL queries cannot be analyzed directly. LIFT reconstructs BGPs by grouping single-triple requests using ordering and timing information present in the logs. The abstract states that experiments demonstrate good precision and recall while generating limited noise.

Significance. If the reconstruction claims hold, the work would allow TPF data providers to recover query-structure information from existing logs without protocol changes. This addresses a practical gap in Linked Data publishing by enabling workload analysis, caching improvements, and server optimization that are currently unavailable.

major comments (1)

[Abstract] The central claim that LIFT extracts BGPs 'with good precision and good recall generating limited noise' is unsupported because the abstract supplies no quantitative metrics, no dataset descriptions, no baselines, and no experimental protocol. This absence makes the soundness of the reconstruction approach impossible to assess from the provided text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for identifying this issue with the abstract. We agree that the current wording leaves the central experimental claims without sufficient supporting detail for readers to evaluate them directly from the abstract alone.

Referee: [Abstract] The central claim that LIFT extracts BGPs 'with good precision and good recall generating limited noise' is unsupported because the abstract supplies no quantitative metrics, no dataset descriptions, no baselines, and no experimental protocol. This absence makes the soundness of the reconstruction approach impossible to assess from the provided text.

Authors: We accept the referee's point. While the full paper contains the requested experimental details (datasets, metrics, baselines, and protocol), the abstract does not. In the revised manuscript we will expand the abstract to include concrete precision and recall figures, the number and nature of the evaluation datasets, and a brief statement of the experimental protocol so that the central claim is directly supported by numbers rather than qualitative phrasing.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper presents LIFT as an algorithm that groups single-triple requests from TPF logs into BGPs using ordering and timing data. No equations, fitted parameters, predictions, or self-citations appear in the abstract or description that would reduce any claim to its own inputs by construction. The central claim rests on experimental validation of precision/recall rather than any self-referential derivation or uniqueness theorem. This is a standard algorithmic contribution without detectable circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no equations, parameters, or background assumptions that can be audited.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1] W. Beek, L. Rietveld, H. R. Bazoobandi, J. Wielemaker, and S. Schlobach. LOD Laundromat: A Uniform Way of Publishing Other People’s Dirty Data. In ISWC Conference, 2014

[2] M. A. Gallego, J. D. Fernández, M. A. Martínez-Prieto, and P. de la Fuente. An Empirical Study of Real-World SPARQL Queries. In USEWOD Workshop, 2011

[3] J. Han, M. Kamber, and J. Pei. Data Mining: Concepts and Techniques. Elsevier, 2011

[4] J. Lorey and F. Naumann. Detecting SPARQL Query Templates for Data Prefetching. In ESWC Conference, 2013

[5] L.-R. Markus, A. Saud, B. Bettina, and H. Laura. USEWOD Research Dataset.

[6] http://dx.doi.org/10.5258/SOTON/385344

[7] K. Möller, M. Hausenblas, R. Cyganiak, G. Grimnes, and S. Handschuh. Learning from Linked Open Data Usage: Patterns & Metrics. In WebSci10: Extending the Frontiers of Society On-Line, 2010

[8] C. H. Mooney and J. F. Roddick. Sequential Pattern Mining–Approaches and Algorithms. ACM Computing Surveys (CSUR), 45(2):19, 2013

[9] M. Morsey, J. Lehmann, S. Auer, and A.-C. N. Ngomo. DBpedia SPARQL Benchmark–Performance Assessment with Real Queries on Real Data. In ISWC Conference, 2011

[10] G. Nassopoulos, P. Serrano-Alvarado, P. Molli, and E. Desmontils. FETA: Federated QuEry TrAcking for Linked Data. In DEXA Conference, 2016

[11] F. Picalausa and S. Vansummeren. What are Real SPARQL Queries Like? In SWIM Workshop, 2011

[12] A. Raghuveer. Characterizing Machine Agent Behavior through SPARQL Query Mining. In USEWOD Workshop, 2012

[13] L. Rietveld, R. Hoekstra, et al. Man vs. Machine: Differences in SPARQL queries. In USEWOD Workshop, 2014

[14] M. V. Sande, R. Verborgh, J. V. Herwegen, E. Mannens, and R. V. de Walle. Opportunistic Linked Data Querying Through Approximate Membership Metadata. In ISWC Conference, 2015

[15] R. Verborgh, E. Mannens, and R. Van de Walle. Initial Usage Analysis of DBpedia’s Triple Pattern Fragments. In USEWOD Workshop, 2015

[16] R. Verborgh, M. Vander Sande, O. Hartig, J. Van Herwegen, L. De Vocht, B. De Meester, G. Haesendonck, and P. Colpaert. Triple Pattern Fragments: a Low-cost Knowledge Graph Interface for the Web. Journal of Web Semantics, 37–38, Mar. 2016
