{"paper":{"title":"Relational Database Data Lineage Ontology","license":"http://creativecommons.org/licenses/by/4.0/","headline":"A new ontology adds structural, semantic and transformation details to relational database lineage, improving knowledge-graph link prediction.","cross_cats":[],"primary_cat":"cs.DB","authors_text":"Jakub Dutkiewicz, Pawe{\\l} Misiorek, Robert Wrembel","submitted_at":"2026-05-15T15:29:56Z","abstract_excerpt":"Modeling data lineage in relational databases remains a challenging problem, particularly in scenarios involving incomplete or missing dependencies between database objects. In this paper, we propose a novel ontology for relational database data lineage, designed to provide a richer and more expressive semantic representation supporting discovering the lineage links by means of knowledge graphs (KGs). Building upon our previous work on KG-based lineage discovery, the proposed ontology extends the earlier model with additional concepts capturing structural, semantic, and transformation-level ch"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Experimental results demonstrate that the application of the enriched semantic model leads to improvements in lineage link prediction performance, as measured by AUC and Hits@10 metrics.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The evaluation assumes that any measured improvement in the graph neural network link prediction task is caused by the added ontology concepts rather than differences in graph construction, training details, or dataset characteristics.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"The authors extend a prior ontology for relational database data lineage with new concepts and report improved AUC and Hits@10 scores in graph neural network link prediction on 
knowledge graphs.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A new ontology adds structural, semantic and transformation details to relational database lineage, improving knowledge-graph link prediction.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"380f363d305df1ae4cc6296dc9ef402fc83da92de6d9e98c1d97f15119ac4809"},"source":{"id":"2605.16068","kind":"arxiv","version":1},"verdict":{"id":"912b4bbe-968d-4c5a-a137-37ca0aaaec82","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T18:38:53.248529Z","strongest_claim":"Experimental results demonstrate that the application of the enriched semantic model leads to improvements in lineage link prediction performance, as measured by AUC and Hits@10 metrics.","one_line_summary":"The authors extend a prior ontology for relational database data lineage with new concepts and report improved AUC and Hits@10 scores in graph neural network link prediction on knowledge graphs.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The evaluation assumes that any measured improvement in the graph neural network link prediction task is caused by the added ontology concepts rather than differences in graph construction, training details, or dataset characteristics.","pith_extraction_headline":"A new ontology adds structural, semantic and transformation details to relational database lineage, improving knowledge-graph link 
prediction."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.16068/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"doi_title_agreement","ran_at":"2026-05-19T19:01:18.979178Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T18:51:54.855803Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-19T17:33:41.544929Z","status":"skipped","version":"1.0.0","findings_count":0},{"name":"claim_evidence","ran_at":"2026-05-19T16:41:55.515193Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"c79482ff0e3733d82763052cd9e35801c8301df936c98cf2a79fa40aee3e2627"},"references":{"count":38,"sample":[{"doi":"","year":2026,"title":"https://www.ibm.com/topics/data- lineage (Accessed Apr, 2026)","work_id":"f99e1a23-e149-4725-8fd6-28ca25312e63","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"The Journal of Supercomputing80(3) (2023)","work_id":"d6f59a80-6701-4f3c-935b-e38ce0356623","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2013,"title":"Bordes, A., Usunier, N., Garcia-Durán, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Int. Conf. 
on Neural Information Processing Systems (NIPS) Volume","work_id":"606812bf-ce13-42e6-bd7a-ab24f4ed9d9d","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"In: AIDB Workshop @ VLDB","work_id":"af1140ff-95f6-45e8-b6f4-224899ea7ca2","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2020,"title":"VLDB Endowment 14(4) (2020) 14 J","work_id":"e9aeceb4-8bc2-49af-88b7-93e06723e0a8","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":38,"snapshot_sha256":"bc763c3404edf0f998c81705dd612c7b0654ba6e6c35a97677343a9951938fc6","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"8c398d02313ee3dd6a4ff282d4012131178140521f1ae283d695f3244c58a72a"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}