Efficiently Computing Provenance Graphs for Queries with Negation

Bertram Ludaescher; Boris Glavic; Seokki Lee; Sven Koehler

arxiv: 1701.05699 · v1 · pith:YIRJQN2Enew · submitted 2017-01-20 · 💻 cs.DB

keywords: provenance, approach, queries, compute, negation, question, questions, answer

Explaining why an answer is in the result of a query or why it is missing from the result is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. Both types of questions, i.e., why and why-not provenance, have been studied extensively. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Our approach is based on a rewriting of Datalog rules (called firing rules) that captures successful rule derivations within the context of a Datalog query. We extend this rewriting to support negation and to capture failed derivations that explain missing answers. Given a (why or why-not) provenance question, we compute an explanation, i.e., the part of the provenance that is relevant to answer the question. We introduce optimizations that prune parts of a provenance graph early on if we can determine that they will not be part of the explanation for a given question. We present an implementation that runs on top of a relational database using SQL to compute explanations. Our experiments demonstrate that our approach scales to large instances and significantly outperforms an earlier approach which instantiates the full provenance to compute explanations.
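
The distinction the abstract draws between successful derivations (why) and failed derivations (why-not) for a rule with negation can be illustrated with a small sketch. This is not the paper's implementation: the paper rewrites Datalog rules and evaluates them in SQL with pruning, whereas the code below simply enumerates every variable binding in memory. The rule `only2hop(X, Y) :- hop(X, Z), hop(Z, Y), not hop(X, Y)`, the `hop` facts, and the function names `derivations`, `why`, and `why_not` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical edge relation; these facts are not taken from the paper.
hop = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}

# Active domain: every constant that appears in some fact.
domain = sorted({value for edge in hop for value in edge})


def derivations():
    """Enumerate all bindings (X, Z, Y) of the illustrative rule
        only2hop(X, Y) :- hop(X, Z), hop(Z, Y), not hop(X, Y)
    and record, per binding, whether each of the three goals holds."""
    result = []
    for x, z, y in product(domain, repeat=3):
        goals = ((x, z) in hop, (z, y) in hop, (x, y) not in hop)
        result.append(((x, z, y), goals))
    return result


def why(x, y):
    """Successful derivations of only2hop(x, y): every goal holds."""
    return [b for b, goals in derivations()
            if b[0] == x and b[2] == y and all(goals)]


def why_not(x, y):
    """Failed derivations of only2hop(x, y), with the per-goal outcome
    showing which goals caused the failure."""
    return [(b, goals) for b, goals in derivations()
            if b[0] == x and b[2] == y and not all(goals)]


if __name__ == "__main__":
    print("why only2hop(b, d):", why("b", "d"))
    for binding, goals in why_not("a", "c"):
        print("why-not only2hop(a, c):", binding, goals)
```

In this toy instance, `only2hop(a, c)` is missing because every binding of `Z` fails at least one goal; for `Z = b` both positive goals hold and only the negated goal `not hop(a, c)` fails. Enumerating all bindings grows with the cube of the domain size here, which is the kind of cost that question-specific pruning, as described in the abstract, is meant to avoid.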

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Provenance for Large-scale Datalog
cs.PL · 2019-07 · unverdicted · novelty 6.0

New provenance lattice with proof annotations and minimal-height query mechanism for debugging large-scale Datalog analyses, implemented in Souffle with 1.27x average overhead.