Rule Applicability on RDF Triplestore Schemas
Pith reviewed 2026-05-25 10:04 UTC · model grok-4.3
The pith
A query rewriting method decides rule applicability on RDF triplestore schemas and produces output schemas.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Given a triplestore schema and a set of rules, rule applicability can be decided and an output schema produced by rewriting the rules into queries that check for matches within the schema constraints, avoiding the need to construct a full canonical instance of the schema.
What carries the argument
The query rewriting technique that transforms rule application checks into schema queries to determine applicability without materializing data instances.
Load-bearing premise
The triplestore schema is assumed to exactly capture the family of RDF graphs to which the rules might be applied.
What would settle it
A counterexample schema and rule set where the rewriting method and the critical instance method produce different applicability results or different output schemas.
Figures
read the original abstract
Rule-based systems play a critical role in health and safety, where policies created by experts are usually formalised as rules. When dealing with increasingly large and dynamic sources of data, as in the case of Internet of Things (IoT) applications, it becomes important not only to efficiently apply rules, but also to reason about their applicability on datasets confined by a certain schema. In this paper we define the notion of a triplestore schema which models a set of RDF graphs. Given a set of rules and such a schema as input we propose a method to determine rule applicability and produce output schemas. Output schemas model the graphs that would be obtained by running the rules on the graph models of the input schema. We present two approaches: one based on computing a canonical (critical) instance of the schema, and a novel approach based on query rewriting. We provide theoretical, complexity and evaluation results that show the superior efficiency of our rewriting approach.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper defines a triplestore schema as a structure that models a set of RDF graphs. Given a set of rules and such a schema, it proposes two methods to decide rule applicability and produce an output schema modeling the graphs after rule application: (1) computation of a canonical (critical) instance of the schema, and (2) a novel query-rewriting approach. Theoretical results, complexity analysis, and experimental evaluation are claimed to establish that the rewriting method is more efficient.
Significance. If the soundness and complexity claims hold, the work would be significant for rule-based policy systems in IoT and safety-critical domains by enabling applicability checks and schema evolution without materializing full instances. The dual-method presentation (canonical instance plus rewriting) and the explicit comparison of their efficiency constitute a clear contribution if the derivations are gap-free.
major comments (2)
- [Abstract, §1] Abstract and §1: the central efficiency claim rests on the rewriting approach being sound and complete for the given schema; however, the manuscript provides no mechanism, proof obligation, or discussion showing how a schema designer can guarantee that the input schema exactly captures (neither over- nor under-approximates) the intended family of RDF graphs. Any mismatch renders both the applicability decision and the produced output schema incorrect for the real data.
- [§4] §4 (or the section presenting the rewriting reduction): the claim that the rewriting approach yields a parameter-free decision procedure is load-bearing for the superiority result, yet the reduction appears to inherit the same schema-fidelity precondition as the canonical-instance method; without an explicit statement of the preconditions under which the rewriting is guaranteed to be both sound and complete, the complexity comparison cannot be evaluated.
minor comments (2)
- [§2] Notation for the triplestore schema (e.g., how constraints on predicates and objects are encoded) should be introduced with a small running example before the formal definitions.
- [Evaluation] The experimental section should report the precise schema sizes, rule counts, and query-rewriting times rather than aggregated speed-up factors alone.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address the concerns about schema fidelity and the preconditions for the rewriting method point by point.
read point-by-point responses
-
Referee: [Abstract, §1] Abstract and §1: the central efficiency claim rests on the rewriting approach being sound and complete for the given schema; however, the manuscript provides no mechanism, proof obligation, or discussion showing how a schema designer can guarantee that the input schema exactly captures (neither over- nor under-approximates) the intended family of RDF graphs. Any mismatch renders both the applicability decision and the produced output schema incorrect for the real data.
Authors: The triplestore schema is defined as an input that models a set of RDF graphs, and both methods (canonical instance and rewriting) are proven sound and complete with respect to this schema. The paper focuses on the problem of rule applicability checking given such a schema, rather than on how to construct or verify the schema itself. This is analogous to how query rewriting in databases assumes a given schema. We agree that a discussion of this modeling assumption would strengthen the paper and will add a paragraph in §1 clarifying that the results hold relative to the provided schema, and that fidelity to real-world data is the designer's responsibility. No automated mechanism is provided because it would require access to the actual instance or additional oracles, which is outside the paper's scope. revision: partial
-
Referee: [§4] §4 (or the section presenting the rewriting reduction): the claim that the rewriting approach yields a parameter-free decision procedure is load-bearing for the superiority result, yet the reduction appears to inherit the same schema-fidelity precondition as the canonical-instance method; without an explicit statement of the preconditions under which the rewriting is guaranteed to be both sound and complete, the complexity comparison cannot be evaluated.
Authors: We will revise the presentation in §4 to explicitly state the preconditions: the input must be a valid triplestore schema (as defined in §3), and the rules are standard Datalog-style rules over RDF. Under these conditions, the rewriting is sound and complete, and parameter-free in the sense that it does not require choosing additional parameters such as instance sizes or approximations. The complexity results compare the two methods under identical assumptions on the input. We believe this clarification will address the concern without altering the core claims. revision: yes
Circularity Check
No circularity: algorithmic constructions are self-contained
full rationale
The paper defines triplestore schemas as modeling sets of RDF graphs and presents two direct algorithmic methods (canonical instance and query rewriting) to decide rule applicability and produce output schemas. These are supported by theoretical results, complexity analysis, and evaluation, with no reduction of any central claim to a fitted parameter, self-citation chain, or definitional equivalence. The input schema assumption is stated explicitly but does not create a circular derivation; the methods operate on the given schema without smuggling in the target result by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption A triplestore schema can be defined that models a set of RDF graphs
- domain assumption Rule applicability can be decided by testing on a canonical instance or by query rewriting
invented entities (1)
-
triplestore schema
no independent evidence
Forward citations
Cited by 1 Pith paper
-
A Blueprint for AI-Driven Software Quality: Integrating LLMs with Established Standards
Survey mapping LLM applications in software quality assurance to established standards including ISO/IEC 12207, ISO 25010, CMMI, and TMM, with case studies, challenges, and future directions.
Reference graph
Works this paper leans on
-
[1]
Addison-Wesley Longman Publishing Co., Inc
Abiteboul, S., Hull, R., Vianu, V .: Foundations of databases: the logical level. Addison-Wesley Longman Publishing Co., Inc. (1995)
work page 1995
-
[2]
CoRR abs/1801.09061 (2018), http://arxiv
Bassiliades, N.: SWRL2SPIN: A tool for transforming SWRL rule bases in OWL ontologies to object-oriented SPIN rules. CoRR abs/1801.09061 (2018), http://arxiv. org/abs/1801.09061
-
[3]
Beimel, D., Peleg, M.: Editorial: Using owl and swrl to represent and reason with situation-based access control policies. Data Knowl. Eng. 70(6), 596–615 (2011)
work page 2011
-
[4]
In: Proceedings of the 36th ACM SIGMOD- SIGACT-SIGAI Symposium on Principles of Database Systems
Benedikt, M., Konstantinidis, G., Mecca, G., Motik, B., Papotti, P., Santoro, D., Tsamoura, E.: Benchmarking the chase. In: Proceedings of the 36th ACM SIGMOD- SIGACT-SIGAI Symposium on Principles of Database Systems. pp. 37–52. ACM (2017)
work page 2017
-
[5]
Semantic Web 8(3), 471–487 (2017)
Calvanese, D., Cogrel, B., Komla-Ebri, S., Kontchakov, R., Lanti, D., Rezk, M., Rodriguez-Muro, M., Xiao, G.: Ontop: Answering SPARQL queries over relational databases. Semantic Web 8(3), 471–487 (2017)
work page 2017
-
[6]
IEEE Transactions on Knowledge and Data Engineering 1(1), 146–166 (1989)
Ceri, S., Gottlob, G., Tanca, L.: What you always wanted to know about datalog (and never dared to ask). IEEE Transactions on Knowledge and Data Engineering 1(1), 146–166 (1989)
work page 1989
-
[7]
Chaochaisit, W., Sakamura, K., Koshizuka, N., Bessho, M.: CSV-X: A Linked Data Enabled Schema Lan- guage, Model, and Processing Engine for Non-Uniform CSV. 2016 IEEE Int. Conf. on Internet of Things (iThings) and IEEE Green Computing and Communica- tions (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (Smart- Data) pp...
work page 2016
-
[8]
W3C Recom- mendation, W3C (2014), http://www.w3.org/TR/2014/ REC-rdf11-concepts-20140225/
Cyganiak, R., Wood, D., Markus Lanthaler, G.: RDF 1.1 Concepts and Abstract Syntax. W3C Recom- mendation, W3C (2014), http://www.w3.org/TR/2014/ REC-rdf11-concepts-20140225/
work page 2014
-
[9]
ACM Computing Surveys (CSUR) 33(3), 374– 425 (2001)
Dantsin, E., Eiter, T., Gottlob, G., V oronkov, A.: Complexity and expressive power of logic program- ming. ACM Computing Surveys (CSUR) 33(3), 374– 425 (2001)
work page 2001
-
[10]
International Journal of Prod- uct Lifecycle Management 7(1), 75–93 (2014), pMID: 65458
Fortineau, V ., Fiorentini, X., Paviot, T., Louis-Sidney, L., Lamouri, S.: Expressing formal rules within ontology-based models using SWRL: an application to the nuclear industry. International Journal of Prod- uct Lifecycle Management 7(1), 75–93 (2014), pMID: 65458
work page 2014
-
[11]
In: International Semantic Web Conference
Glimm, B., Kazakov, Y ., Liebig, T., Tran, T.K., Vialard, V .: Abstraction refinement for ontology materialization. In: International Semantic Web Conference. pp. 180–
-
[12]
In: 26th International Conference on World Wide Web Companion
Gyrard, A., Serrano, M., Jares, J.B., Datta, S.K., Ali, M.I.: Sensor-based Linked Open Rules (S-LOR): An Automated Rule Discovery Approach for IoT Applica- tions and Its Use in Smart Cities. In: 26th International Conference on World Wide Web Companion. pp. 1153–
-
[13]
W3C Recommendation, W3C (2013), https://www.w3
Harris, S., Seaborne, A.: SPARQL 1.1 Query Language. W3C Recommendation, W3C (2013), https://www.w3. org/TR/sparql11-query/
work page 2013
-
[14]
Hecham, A., Bisquert, P., Croitoru, M.: On the Chase for All Provenance Paths with Existential Rules. In: Rules and Reasoning. pp. 135–150. Springer Interna- tional Publishing (2017)
work page 2017
-
[15]
W3C Member Submission, W3C (2004), https://www.w3.org/Submission/SWRL/
Horrocks, I., Patel-Schneider, P.F., Boley, H., Tabet, S., Grosof, B., Dean, M., et al.: SWRL: A se- mantic web rule language combining OWL and RuleML. W3C Member Submission, W3C (2004), https://www.w3.org/Submission/SWRL/
work page 2004
-
[16]
International Labour Organization: Act No. 6331 on Occupational Health and Safety (2012), https://www.ilo.org/dyn/natlex/natlex4.detail?p lang= fr&p isn=92011
work page 2012
-
[17]
W3C Member Submission, W3C (2011), http://www.w3.org/Submission/spin-sparql/
Knublauch, H.: SPIN - SPARQL Syntax. W3C Member Submission, W3C (2011), http://www.w3.org/Submission/spin-sparql/
work page 2011
-
[18]
W3C Recommendation, W3C (2017), https://www.w3.org/TR/shacl/
Knublauch, H., Kontokostas, D.: Shapes constraint lan- guage (SHACL). W3C Recommendation, W3C (2017), https://www.w3.org/TR/shacl/
work page 2017
-
[19]
W3C Recommendation, W3C (2017), https:// www.w3.org/TR/2017/REC-vocab-ssn-20171019/
Lefranc ¸ois, M., Cox, S., Taylor, K., Haller, A., Janow- icz, K., Phuoc, D.L.: Semantic Sensor Network On- tology. W3C Recommendation, W3C (2017), https:// www.w3.org/TR/2017/REC-vocab-ssn-20171019/
work page 2017
- [20]
-
[21]
IEEE Communications Surveys Tu- torials 16(1), 414–454 (2014)
Perera, C., Zaslavsky, A., Christen, P., Georgakopou- los, D.: Context Aware Computing for The Internet of Things: A Survey. IEEE Communications Surveys Tu- torials 16(1), 414–454 (2014)
work page 2014
-
[22]
ACM Transactions on Database Systems 34(3), 16:1–16:45 (2009)
P´erez, J., Arenas, M., Gutierrez, C.: Semantics and Complexity of SPARQL. ACM Transactions on Database Systems 34(3), 16:1–16:45 (2009)
work page 2009
-
[23]
In: Proceedings of the 10th Inter- national Conference on Semantic Systems
Prud’hommeaux, E., Labra Gayo, J.E., Solbrig, H.: Shape Expressions: An RDF Validation and Transfor- mation Language. In: Proceedings of the 10th Inter- national Conference on Semantic Systems. pp. 32–40. SEM ’14, ACM (2014)
work page 2014
-
[24]
Building Blocks for IoT Analytics p
Serrano, M., Gyrard, A.: A Review of Tools for IoT Se- mantics and Data Streaming Analytics. Building Blocks for IoT Analytics p. 139 (2016)
work page 2016
-
[25]
In: Proceedings of the Fourteenth Annual ACM Symposium on Theory of Computing
Vardi, M.Y .: The Complexity of Relational Query Lan- guages (Extended Abstract). In: Proceedings of the Fourteenth Annual ACM Symposium on Theory of Computing. pp. 137–146. STOC ’82, ACM, New York, NY , USA (1982) A Proof of Theorem 1 Theorem 1 For all rules r : A→ C and triplestore schemas S, I(score(S, r)) = I(critical(S, r)). Proof: Since the mappings...
work page 1982
-
[26]
If tA[i] and tS A[i] are both constants, then sincetI A∈ m(A), it must be true that tA[i] = tI A[i]. Moreover, since tS A must be a model oftI A, it follows thattS A[i] = tI A[i] and therefore tS A[i] = tI A[i]. We set element tq[i] to be the constant tA[i], which matches the constant tS A[i]
-
[27]
We set element tq[i] to be the constant :λ, which matches the constant tS A[i]
If tA[i] is a constant but tS A[i] is a variable, then we know that tS A[i] = :λ. We set element tq[i] to be the constant :λ, which matches the constant tS A[i]
-
[28]
Therefore mapping m must contain binding ?v→ x
If tA[i] is a variable ?v and tS A[i] is a constant x, then we know that tI A[i] = tS A[i] = x. Therefore mapping m must contain binding ?v→ x. We set element tq[i] to be the variable ?v, so that when we complete the construction of q, m∗ will contain mapping ?v→ x
-
[29]
If tA[i] and tS A[i] are both variables, it must be that tS A[i] = :λ. If it exists a triple t′ A ∈ A and a modelling triple tS′ A of m(t′ A), and position j such that t′ A[i] is variable ?v and tS′ A [i] is a constant, then we set element tq[i] to be the constant :λ (tS A[i] will be :λ in our sandobox graph). Note that even though we don’t use variablet′...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.