A Pythonic Functional Approach for Semantic Data Harmonisation in the ILIAD Project
Pith reviewed 2026-05-15 18:50 UTC · model grok-4.3
The pith
Python functions encode ocean ontology patterns so data scientists generate valid RDF through simple calls.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a Pythonic functional approach, implemented as libraries organised at three levels of abstraction, encodes the design patterns of the Ocean Information Model so that correct RDF for data harmonisation can be produced by ordinary function calls rather than by writing specialised mapping syntax or mastering semantic web details.
What carries the argument
The argument rests on three-tier Python function libraries that encode Ocean Information Model (OIM) ontology design patterns: low-level functions expose raw RDF/OWL syntax, mid-level functions package reusable patterns, and high-level functions orchestrate complete harmonisation tasks.
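A minimal sketch of what such layering could look like. The ILIAD libraries are not released, so every function name here, the `EX` namespace, and the tuple-of-strings triple representation are illustrative assumptions. SOSA terms stand in for an observation pattern; whether the OIM pattern has this shape is also an assumption.

```python
# Illustrative only: triples are plain tuples of N-Triples-style strings,
# standing in for whatever RDF library the real functions use.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SOSA = "http://www.w3.org/ns/sosa/"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
EX = "http://example.org/"  # hypothetical base namespace


# Low level: expose raw RDF constructs.
def iri(namespace, local_name):
    return f"<{namespace}{local_name}>"


def literal(value, datatype):
    return f'"{value}"^^<{datatype}>'


def triple(s, p, o):
    return (s, p, o)


# Mid level: package one reusable pattern (a SOSA-style observation).
def observation_pattern(obs, feature, prop, value):
    return [
        triple(obs, f"<{RDF_TYPE}>", iri(SOSA, "Observation")),
        triple(obs, iri(SOSA, "hasFeatureOfInterest"), feature),
        triple(obs, iri(SOSA, "observedProperty"), prop),
        triple(obs, iri(SOSA, "hasSimpleResult"), literal(value, XSD_DOUBLE)),
    ]


# High level: a domain-facing call that hides IRIs and patterns.
def sea_temperature(site_id, reading_id, celsius):
    return observation_pattern(
        obs=iri(EX, f"observation/{reading_id}"),
        feature=iri(EX, f"site/{site_id}"),
        prop=iri(EX, "property/seaTemperature"),
        value=celsius,
    )


triples = sea_temperature("farm-7", "r001", 8.4)
```

The caller supplies only domain values (a site, a reading, a temperature); the namespaces and the triple structure are fixed one and two layers down, which is where any encoding error would also be fixed in place.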
If this is right
- Data scientists can perform harmonisation tasks without learning RML, OTTR, or semantic web syntax.
- High-level functions compose mid-level pattern functions to handle entire domain workflows.
- The generated RDF directly supports interoperable digital twins of the ocean.
- The same library structure applies to the ILIAD aquaculture pilot and other environmental data sets.
Where Pith is reading between the lines
- The same multi-level library pattern could be replicated for other modular ontology families outside ocean data.
- Integration into standard data-science notebooks might reduce transcription errors that arise when moving between mapping tools and Python analysis code.
- Equivalence tests against existing RML mappings on the same datasets would provide an independent check of semantic fidelity.
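The equivalence test in the last point could start from something as simple as a set difference over triples. The sketch below assumes both pipelines can export their output as `(subject, predicate, object)` tuples; the sample data and the `diff_triples` helper are hypothetical.

```python
def diff_triples(generated, reference):
    """Return triples missing from, and unexpected in, the generated set."""
    generated, reference = set(generated), set(reference)
    return {
        "missing": sorted(reference - generated),
        "unexpected": sorted(generated - reference),
    }


# Hypothetical outputs of the Python functions and of an RML mapping
# run over the same input record.
from_python = [
    ("ex:obs1", "rdf:type", "sosa:Observation"),
    ("ex:obs1", "sosa:hasSimpleResult", '"8.4"'),
]
from_rml = [
    ("ex:obs1", "rdf:type", "sosa:Observation"),
    ("ex:obs1", "sosa:hasSimpleResult", '"8.40"'),
]

report = diff_triples(from_python, from_rml)
equivalent = not report["missing"] and not report["unexpected"]
```

This comparison is purely syntactic: `"8.4"` and `"8.40"` are reported as different even though they denote the same numeric value, and graphs containing blank nodes would need an isomorphism check (RDFLib's `rdflib.compare` module offers one) rather than a set difference.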
Load-bearing premise
The Python functions must correctly and completely encode the OIM design patterns so that every generated RDF graph remains semantically valid for all intended use cases.
What would settle it
Apply the high-level functions to a sample ILIAD aquaculture dataset, then check whether the resulting RDF validates against the OIM ontologies and supports expected digital-twin queries; mismatch or validation failure would falsify the claim.
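A hand-rolled stand-in for the validation step, shown only to make the test concrete. A real check would use SHACL shapes or an OWL reasoner against the OIM ontologies; the "exactly one value per property" rule below is an assumed constraint, not one taken from the OIM or SOSA.

```python
from collections import Counter

# Assumed constraint: every observation carries exactly one of each property.
REQUIRED = (
    "sosa:hasFeatureOfInterest",
    "sosa:observedProperty",
    "sosa:hasSimpleResult",
)


def check_observations(triples):
    """Return violation messages; an empty list means the check passed."""
    observations = {
        s for s, p, o in triples
        if p == "rdf:type" and o == "sosa:Observation"
    }
    counts = Counter((s, p) for s, p, _ in triples)
    violations = []
    for obs in sorted(observations):
        for prop in REQUIRED:
            n = counts[(obs, prop)]
            if n != 1:
                violations.append(f"{obs}: expected exactly 1 {prop}, found {n}")
    return violations


graph = [
    ("ex:obs1", "rdf:type", "sosa:Observation"),
    ("ex:obs1", "sosa:hasFeatureOfInterest", "ex:site7"),
    ("ex:obs1", "sosa:observedProperty", "ex:seaTemperature"),
    # sosa:hasSimpleResult is deliberately omitted to trigger a violation.
]
violations = check_observations(graph)
```

A non-empty `violations` list on pilot data would count as the validation failure described above.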
Original abstract
Semantic data harmonisation is a central requirement in the ILIAD project, where heterogeneous environmental data must be harmonised according to the Ocean Information Model (OIM), a modular family of ontologies for enabling the implementation of interoperable Digital Twins of the Ocean. Existing approaches to Semantic Data Harmonisation, such as RML and OTTR, offer valuable abstractions but require extensive knowledge of the technical intricacies of the OIM and the Semantic Web standards, including namespaces, IRIs, OWL constructors, and ontology design patterns. Furthermore, RML and OTTR oblige practitioners to learn specialised syntaxes and dedicated tooling. Data scientists in ILIAD have found these approaches overly cumbersome and have therefore expressed the need for a solution that abstracts away these technical details while remaining seamlessly integrated into their Python-based environments. To address these requirements, we have developed a Pythonic functional approach to semantic data harmonisation that enables users to produce correct RDF through simple function calls. The functions, structured as Python libraries, encode the design patterns of the OIM and are organised across multiple levels of abstraction. Low-level functions directly expose OWL and RDF syntax, mid-level functions encapsulate ontology design patterns, and high-level domain-specific functions orchestrate data harmonisation tasks by invoking mid-level functions. According to feedback from ILIAD data scientists, this approach satisfies their requirements and substantially enhances their ability to participate in harmonisation activities. In this paper, we present the details of our Pythonic functional approach to semantic data harmonisation and demonstrate its applicability within the ILIAD Aquaculture pilot.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a layered Python library for semantic data harmonisation in the ILIAD project. Low-level functions expose OWL/RDF constructors, mid-level functions encapsulate OIM ontology design patterns, and high-level domain-specific functions allow data scientists to generate RDF via simple calls. The authors claim this abstracts away Semantic Web technicalities, integrates with Python workflows, and meets ILIAD requirements based on feedback from project data scientists, with a demonstration in the Aquaculture pilot.
Significance. If the functions correctly encode OIM patterns and produce valid RDF, the work could meaningfully lower barriers for domain experts to contribute to semantic interoperability in environmental data projects like Digital Twins of the Ocean, offering a practical alternative to RML/OTTR for Python-centric teams.
major comments (2)
- [Abstract and demonstration section] The central claim that the layered functions produce semantically valid RDF for all intended use cases rests solely on unspecified qualitative feedback from ILIAD data scientists. No explicit input-to-triple mappings, OWL reasoner results, SHACL validation outputs, test suites, or quantitative comparison of generated RDF against ground truth are reported anywhere in the manuscript, which is load-bearing for the correctness and applicability assertions.
- [Implementation and pilot demonstration] The libraries are described as encoding OIM design patterns but are not released, and the Aquaculture pilot demonstration provides no concrete examples of function calls, generated triples, or verification steps, leaving the mid- and high-level abstractions untested in the text.
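The kind of input-to-triple evidence the referee asks for could be as small as a table of golden cases. In this sketch `harmonise_row` is a hypothetical high-level function, and the record fields and prefixed names are invented for illustration; none of it comes from the paper.

```python
def harmonise_row(row):
    """Hypothetical high-level function: one input record to a list of triples."""
    obs = f"ex:obs/{row['id']}"
    return [
        (obs, "rdf:type", "sosa:Observation"),
        (obs, "sosa:hasFeatureOfInterest", f"ex:site/{row['site']}"),
        (obs, "sosa:hasSimpleResult", f'"{row["temp_c"]}"^^xsd:double'),
    ]


# Each case pairs an input record with the exact triples it should produce.
CASES = [
    (
        {"id": "r001", "site": "farm-7", "temp_c": 8.4},
        {
            ("ex:obs/r001", "rdf:type", "sosa:Observation"),
            ("ex:obs/r001", "sosa:hasFeatureOfInterest", "ex:site/farm-7"),
            ("ex:obs/r001", "sosa:hasSimpleResult", '"8.4"^^xsd:double'),
        },
    ),
]


def run_cases(cases):
    """Return the inputs whose generated triples differ from the expected set."""
    return [row for row, expected in cases if set(harmonise_row(row)) != expected]


failures = run_cases(CASES)
```

Publishing a table like `CASES` alongside the pilot description would let readers check the mid- and high-level abstractions without access to the unreleased libraries.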
minor comments (2)
- [Approach description] The paper would benefit from a table or figure explicitly mapping high-level function signatures to the OIM patterns they invoke.
- [Low-level functions] No discussion of edge cases, error handling, or namespace/IRI management in the low-level functions is provided.
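To illustrate the second point, a low-level IRI helper might validate its namespace and escape local names before minting an IRI. `IriMinter` is a hypothetical class written for this sketch; it is neither part of the ILIAD libraries nor RDFLib's `Namespace`.

```python
from urllib.parse import quote


class IriMinter:
    """Hypothetical helper that mints IRIs and rejects unusable input."""

    def __init__(self, base):
        valid_scheme = base.startswith(("http://", "https://"))
        valid_ending = base.endswith(("/", "#"))
        if not (valid_scheme and valid_ending):
            raise ValueError(
                f"namespace must be an http(s) IRI ending in '/' or '#': {base!r}"
            )
        self.base = base

    def term(self, local_name):
        local_name = str(local_name).strip()
        if not local_name:
            raise ValueError("local name must not be empty")
        # Percent-encode everything that is not an unreserved character,
        # including '/' so a local name cannot add path segments.
        return self.base + quote(local_name, safe="")


sites = IriMinter("http://example.org/site/")
site_iri = sites.term("Farm 7/North")
# site_iri == "http://example.org/site/Farm%207%2FNorth"
```

Failing early on a malformed namespace or an empty identifier keeps bad IRIs out of the generated graph, where they would be much harder to trace back to the offending input row.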
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address the major points below and agree to strengthen the demonstration with concrete examples and mappings in a revision.
- Referee: [Abstract and demonstration section] The central claim that the layered functions produce semantically valid RDF for all intended use cases rests solely on unspecified qualitative feedback from ILIAD data scientists. No explicit input-to-triple mappings, OWL reasoner results, SHACL validation outputs, test suites, or quantitative comparison of generated RDF against ground truth are reported anywhere in the manuscript, which is load-bearing for the correctness and applicability assertions.
Authors: We acknowledge that the manuscript supports its central claim primarily through qualitative feedback from ILIAD data scientists rather than explicit technical validations such as input-to-triple mappings or SHACL outputs. This feedback directly reflects the project's requirements and usability for domain experts. In revision we will add representative input-to-triple mappings and verification steps drawn from the Aquaculture pilot to make the correctness claims more concrete and testable.
Revision: yes
- Referee: [Implementation and pilot demonstration] The libraries are described as encoding OIM design patterns but are not released, and the Aquaculture pilot demonstration provides no concrete examples of function calls, generated triples, or verification steps, leaving the mid- and high-level abstractions untested in the text.
Authors: The manuscript focuses on the methodological and functional approach rather than serving as a software release note; the libraries remain internal to the ILIAD project at present. We agree that the pilot section lacks concrete illustrations. In the revised version we will insert explicit examples of high-level and mid-level function calls, the resulting triples, and the verification steps performed against OIM patterns, allowing readers to evaluate the abstractions directly from the text.
Revision: yes
Circularity Check
No circularity: software design description without derivational steps
The paper describes a multi-level Python library that encodes OIM ontology design patterns into functions for RDF generation. No equations, fitted parameters, predictions, or uniqueness theorems appear. Core claims rest on qualitative user feedback rather than any self-referential derivation or self-citation chain. The work is a self-contained implementation report whose correctness assertions are external to any internal reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: OIM ontology design patterns can be encoded in Python functions without semantic loss or incorrect RDF output.
invented entities (1)
- High-level domain-specific harmonisation functions: no independent evidence