pith. sign in

arxiv: 1907.00855 · v1 · pith:NBVYEOSBnew · submitted 2019-07-01 · 💻 cs.PL

Type Checking Program Code using SHACL (Extended Version)

Pith reviewed 2026-05-25 11:23 UTC · model grok-4.3

classification 💻 cs.PL
keywords SHACLtype checkingRDFlambda calculustype safetyquery processinggraph constraints
0
0 comments X

The pith

Deriving SHACL shapes from queries lets a lambda calculus type-check RDF code and prove it avoids runtime errors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that shapes can be extracted from queries over RDF data and then combined with existing data shapes to act as types. These combined shapes are embedded in a lambda calculus that comes with a formal type safety proof. A programmer therefore writes simpler code that assumes the presence and cardinality of attributes guaranteed by the shapes. The resulting programs are guaranteed not to fail at runtime on graphs that satisfy the shapes, apart from the usual case of empty lists. This approach separates the expression of data constraints from the logic of the processing code.

Core claim

TyCuS derives SHACL shapes from queries and integrates data shapes and query shapes as types into a lambda-calculus. The system supplies formal rules for this integration together with a proof of type safety. A programmer can therefore process RDF data with simplified, type-checked code that will not encounter run-time errors on valid graphs, except for the cases type checking cannot prevent such as accessing empty lists.

What carries the argument

TyCuS, the type system that extracts SHACL shapes from queries, merges them with data shapes, and uses the result as types inside a lambda calculus to guarantee safety.

If this is right

  • Programmers can omit explicit checks for missing or multi-valued attributes whenever the shapes guarantee their presence and cardinality.
  • Type safety is preserved under the actual semantics of query evaluation on RDF graphs.
  • The same code works without modification on any graph that satisfies the combined shapes.
  • The only remaining runtime exceptions are those that shapes cannot rule out, such as empty-list accesses.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The derivation of shapes from queries could be applied to other constraint languages that operate on graph data.
  • Embedding the same mechanism inside existing languages that process RDF might allow static checking of real query code.
  • Separating constraints from processing logic may reduce the amount of defensive code needed when data schemas evolve.

Load-bearing premise

Shapes derived from queries can be soundly combined with data shapes to form a type system whose safety proof holds for actual RDF graphs and query evaluation semantics.

What would settle it

A concrete program, query-derived shape, data shape, and valid RDF graph where the program passes the type checker yet still produces a runtime error during query evaluation.

Figures

Figures reproduced from arXiv: 1907.00855 by Claudia Schon, Martin Leinberger, Philipp Seifer, Ralf L\"ammel, Steffen Staab.

Figure 1
Figure 1. Figure 1: Sample RDF data graph G1. 1 print (x. studiesAt ) The Shapes Constraint Language (SHACL) is a recent W3C recommendation [13] set out to allow for formulating integrity constraints. By now, a proposal for its formal se￾mantics has been formulated by the research community [7] and SHACL shape graphs can be used to validate given data graphs. [13] itself states that: SHACL shape graphs [...] may be used for a… view at source ↗
Figure 2
Figure 2. Figure 2: SHACL constraints for RDF data graph G1. Listing 1: Program that may produce a run-time error. 1 map (fun x -> x .? X. age ) ( query { 2 SELECT ?X WHERE { ?X rdf:type ex:Student.} }) Contributions We propose a type checking procedure based on SHACL shapes being used as types. We assume that a program queries an—possibly evolving—RDF data graph that is validated by a SHACL shape graph. Our contributions are… view at source ↗
Figure 3
Figure 3. Figure 3: Syntax of PCQs. Semantics Evaluating a query q, follows standard semantics. We use r(G) to denote the evaluation of a property path expression r on a RDF graph G, which consists of all 1 As we use plain RDF, we do not differentiate between distinguished and existential variables [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Illustration of a problematic, recursive case. [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Evaluation rules of constraints. To illustrate constraint evaluation, consider the representation of “UniversityShape”, φUniversity = (≥1 studiesAt−.sStudent∧ ≥1 locatedIn.⊤) again. For G1, an assign￾ment σ1 may for example assign sStudent to the node bob (sStudent ∈ σ1(bob)). The evaluation of φUniversity for the node b1 using σ1 then looks as follows: (1)JφUniversity K b1,G1,σ1 = (2) and (4) (2)J≥1 studi… view at source ↗
Figure 6
Figure 6. Figure 6: Inference rules for inferring a set of shapes from the [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Basic example for multiple faithful assignments. [PITH_FULL_IMAGE:figures/full_fig_p012_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Abstract syntax of λSHACL. letrec x : T1 = t1 in t2 def = let x = fix (λx : T1.t1) in t2 [PITH_FULL_IMAGE:figures/full_fig_p013_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Syntactical abbreviations. When evaluating queries, evaluation rules turn query results into lists of records whereas answer variables are used as record labels. Lastly, in case of a projection for a graph node, the label is interpreted as a property path and the graph is traversed accordingly. Even though not explicitly mentioned in the syntax, we sometimes add parenthesis to terms for clarification. Valu… view at source ↗
Figure 10
Figure 10. Figure 10: Reduction rules of λSHACL. Least upper bound For a few constructs, e.g., if-then-else expressions, require the least upper bound of two types T1 and T2 has to be constructed through an operator lub (see [PITH_FULL_IMAGE:figures/full_fig_p015_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Reduction rules for lists of a λ-calculus. requires a new shape name for which we assume a function genName. As a new shape is constructed, lub does not only return a type, but also a set of shapes. The remaining cases are standard. For lists (T1 list and T2 list), the least upper bounds of the base types lub(T1, T2, S) is constructed. Likewise, for two functions T11 → T12 and T21 → T22, the greatest lowe… view at source ↗
Figure 12
Figure 12. Figure 12: Least upper bound of two types. which is being returned as newly created shapes. The actual type of a query then com￾prises a list of records. Each record contains one label per answer variable whereas the type of each label is the respective shape name for the query variable. Likewise, pro￾jections on graph nodes (T-NODEPROJ) create a new shape name s ′ using a function genName based on the old shape nam… view at source ↗
Figure 13
Figure 13. Figure 13: Typing rules for λSHACL [PITH_FULL_IMAGE:figures/full_fig_p025_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Subtyping rules. . . . S, Γ ⊢ t1 : t ′ 1, T1 list, S1 S, Γ ⊢ head t1 : head t ′ 1, T1, S1 (T-HEAD) S, Γ ⊢ t1 : t ′ 1, s, S1 S ∪ S ′ ∪ {stmp, =1 l.⊤, ∅} ⊢ s <: stmp genName(s) = s ′ S, Γ ⊢ t1.l : head t ′ 1.l, s′ , S1 ∪ {s ′ , ≥1 l − .s, ∅} (T-NPROJ-1) S, Γ ⊢ t1 : t ′ 1, s, S1 S ∪ S ′ ∪ {stmp, =1 l.⊤, ∅} 6⊢ s <: stmp S ∪ S ′ ∪ {stmp, ≥1 l.⊤, ∅} ⊢ s <: stmp genName(s) = s ′ S, Γ ⊢ t1.l : t ′ 1.l, s′ list, {… view at source ↗
Figure 15
Figure 15. Figure 15: Type system with type elaboration (excerpt). [PITH_FULL_IMAGE:figures/full_fig_p026_15.png] view at source ↗
read the original abstract

It is a strength of graph-based data formats, like RDF, that they are very flexible with representing data. To avoid run-time errors, program code that processes highly-flexible data representations exhibits the difficulty that it must always include the most general case, in which attributes might be set-valued or possibly not available. The Shapes Constraint Language (SHACL) has been devised to enforce constraints on otherwise random data structures. We present our approach, Type checking using SHACL (TyCuS), for type checking code that queries RDF data graphs validated by a SHACL shape graph. To this end, we derive SHACL shapes from queries and integrate data shapes and query shapes as types into a $\lambda$-calculus. We provide the formal underpinnings and a proof of type safety for TyCuS. A programmer can use our method in order to process RDF data with simplified, type checked code that will not encounter run-time errors (with usual exceptions as type checking cannot prevent accessing empty lists).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper introduces TyCuS for type-checking code that queries RDF graphs validated by SHACL shapes. It derives SHACL shapes from queries, integrates these with data shapes as types into a λ-calculus, and claims to provide formal underpinnings together with a proof of type safety. The goal is to allow simplified code that avoids runtime errors when processing flexible RDF data (except for cases such as empty lists).

Significance. If the type-safety proof is valid and the query-shape derivation is shown to preserve the semantics of actual RDF graph matching and SPARQL evaluation, the work would supply a useful formal bridge between SHACL validation and typed programming languages. The explicit construction of a typed λ-calculus and the stated proof are strengths that could support safer RDF processing.

major comments (1)
  1. [section on integration into lambda calculus] Section on integration into lambda calculus: the type-safety proof is presented for the abstract TyCuS λ-calculus, yet no separate soundness lemma is described that relates query-derived shapes back to the operational semantics of SPARQL query evaluation on concrete RDF graphs; without this link the safety guarantee does not transfer to the intended practical use case.
minor comments (1)
  1. [Abstract] Abstract: the parenthetical remark on 'usual exceptions' for empty lists could be expanded to clarify the precise boundary of the safety guarantee.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and the identification of this gap. We address the single major comment below.

read point-by-point responses
  1. Referee: Section on integration into lambda calculus: the type-safety proof is presented for the abstract TyCuS λ-calculus, yet no separate soundness lemma is described that relates query-derived shapes back to the operational semantics of SPARQL query evaluation on concrete RDF graphs; without this link the safety guarantee does not transfer to the intended practical use case.

    Authors: We agree that the current presentation does not contain an explicit soundness lemma relating the query-derived shapes to the operational semantics of SPARQL. The derivation of shapes from queries is defined so that satisfaction of the derived shape corresponds to the query result, but we have not stated and proved the required preservation property with respect to concrete SPARQL evaluation. In the revised manuscript we will add a soundness lemma (in the section on integration into the λ-calculus) that establishes: if a SPARQL query Q evaluated on a graph G yields a result that satisfies the shape derived from Q, then the corresponding TyCuS type is preserved under the operational semantics of the calculus. This will make the transfer of the type-safety guarantee explicit. revision: yes

Circularity Check

0 steps flagged

Independent formal type-safety construction; no reductions to inputs by definition

full rationale

The paper defines a lambda-calculus embedding of SHACL shapes (both data shapes and query-derived shapes) and proves type safety inside that calculus. No equations equate a derived quantity to a fitted parameter, no self-citation supplies a uniqueness theorem that forces the central result, and no ansatz is smuggled via prior work. The construction is self-contained as a syntactic type system whose safety proof is internal to the presented calculus; any gap between the abstract model and concrete RDF/SPARQL semantics is a soundness question, not a circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on standard lambda-calculus metatheory and the assumption that SHACL shape derivation preserves query semantics; no free parameters or new entities are introduced.

axioms (1)
  • standard math Standard type safety properties of the simply-typed lambda calculus hold when extended with SHACL-derived types.
    Invoked when the paper states it integrates shapes as types and proves safety.

pith-pipeline@v0.9.0 · 5706 in / 1066 out tokens · 21370 ms · 2026-05-25T11:23:09.673743+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

  1. [1]

    (eds.): The Description Logic Handbook : Theory, Implementation, and Ap- plications

    Baader, F., et al. (eds.): The Description Logic Handbook : Theory, Implementation, and Ap- plications. Cambridge University Press (2003)

  2. [2]

    In: Proc

    Bierman, G.M., et al.: The Essence of Data Access in C omega. In: Proc. ECOOP 2005. pp. 287–311 (2005)

  3. [3]

    In: Proc

    Bischof, S., et al.: Schema-Agnostic Query Rewriting in S PARQL 1.1. In: Proc. ISWC 2014. pp. 584–600. LNCS, Springer (2014)

  4. [4]

    In: Proc

    Boneva, I., et al.: Semantics and V alidation of Shapes Sch emas for RDF. In: Proc. ISWC

  5. [5]

    10587, pp

    LNCS, vol. 10587, pp. 104–120. Springer (2017)

  6. [6]

    In: Proc

    Carroll, J.J., et al.: Jena: implementing the semantic we b recommendations. In: Proc. WWW

  7. [7]

    pp. 74–83. ACM (2004) 24 Martin Leinberger, Philipp Seifer, Claudia Schon, Ralf L ämmel, Steffen Staab

  8. [8]

    Ciobanu, G., et al.: Minimal type inference for Linked Dat a consumers. J. Log. Algebr. Meth. Program. 84(4), 485–504 (2015)

  9. [9]

    In: Proc

    Corman, J., et al.: Semantics and V alidation of Recursive SHACL. In: Proc. ISWC 2018. pp. 318–336. LNCS, Springer (2018)

  10. [10]

    Science of Computer Pro- gramming 89, Part A, 2 – 22 (2014)

    Horne, R., et al.: A verified algebra for read-write Linked Data. Science of Computer Pro- gramming 89, Part A, 2 – 22 (2014)

  11. [11]

    Semantic Web 2(1), 11–21 (2011)

    Horridge, M., et al.: The OWL API: A Java API for OWL ontolog ies. Semantic Web 2(1), 11–21 (2011)

  12. [12]

    ACM Transactions on Programming Languages and Systems 23(3), 396–450 (May 2001)

    Igarashi, A., et al.: Featherweight Java: A Minimal Core Calculus for Java and GJ. ACM Transactions on Programming Languages and Systems 23(3), 396–450 (May 2001)

  13. [13]

    In: Proc

    Käfer, T., et al.: Rule-based Programming of User Agents for Linked Data. In: Proc. Linked Data on the Web. CEUR Workshop Proceedings, CEUR-WS.org (20 18)

  14. [14]

    In: Proc

    Kalyanpur, A., et al.: Automatic Mapping of OWL Ontologi es into Java. In: Proc. Software Engineering & Knowledge Engineering (SEKE) 2004. pp. 98–10 3 (2004)

  15. [15]

    W3C Recommendation (2017), https://www.w3.org/TR/shacl/

    Knublauch, H., et al.: Shapes Constraint Language (SHAC L). W3C Recommendation (2017), https://www.w3.org/TR/shacl/

  16. [16]

    In: Proc

    Leinberger, M., et al.: Semantic Web Application Develo pment with LITEQ. In: Proc. ISWC. pp. 212–227. LNCS, Springer (2014)

  17. [17]

    In: Proc

    Leinberger, M., et al.: The Essence of Functional Progra mming on Semantic Data. In: Proc. European Symp. on Programming. pp. 750–776. LNCS, Springer (2017)

  18. [18]

    I n: Proc

    Motik, B., et al.: Adding Integrity Constraints to OWL. I n: Proc. OWLED 2007. CEUR Workshop Proceedings, vol. 258. CEUR-WS.org (2007)

  19. [19]

    In: Proc

    Paar, A., et al.: Zhi# - OWL Aware Compilation. In: Proc. o f ESWC. pp. 315–329. LNCS, Springer (2011)

  20. [20]

    In: Proc

    Picalausa, F., et al.: A Structural Approach to Indexing Triples. In: Proc. ESWC 2012. pp. 406–421. LNCS, Springer (2012)

  21. [21]

    The MIT P ress (2002)

    Pierce, B.C.: Types and Programming Languages. The MIT P ress (2002)

  22. [22]

    Prud’hommeaux, E., et al.: SPARQL Query Language for RDF . W3C Rec. (Nov 2013), https://www.w3.org/TR/rdf-sparql-query/

  23. [23]

    In: Proc

    Prud’hommeaux, E., et al.: Shape expressions: an RDF val idation and transformation lan- guage. In: Proc. SEMANTICS 2014. pp. 32–40. ACM (2014)

  24. [24]

    Programming Journal 3(3), 13 (2019)

    Seifer, P ., et al.: Semantic Query Integration With Reas on. Programming Journal 3(3), 13 (2019)

  25. [25]

    In: Proc

    Tao, J., et al.: Integrity Constraints in OWL. In: Proc. A AAI 2010. AAAI Press (2010) Type Checking Program Code using SHACL (Extended V ersion) 2 5 S∪ SP , Γ ⊢ t1 : T1, S 1 S, Γ ⊢ SP , t 1 : T1, S 1 (T-PROGRAM) S, Γ ⊢ t1 : T1, S 1 S∪ S1, (Γ, x : T1)⊢ t2 : T2, S 2 S, Γ ⊢ let x = t1 in t2 : T2, S 1∪ S2 (T-LET) S, Γ ⊢ t1 : T1→ T1, S 1 S, Γ ⊢ fix t1 : T1, S ...