Towards a Theory of Data-Diff: Optimal Synthesis of Succinct Data Modification Scripts
This paper addresses the Data-Diff problem: given a dataset and a subsequent version of the dataset, find the shortest sequence of operations that transforms the dataset to the subsequent version, under a restricted family of operations. We consider operations similar to SQL UPDATE, each with a condition (WHERE) that matches a subset of tuples and a modifier (SET) that makes changes to those matched tuples. We characterize the problem based on different constraints on the attributes and the allowed conditions and modifiers, providing complexity classification and algorithms in each case.
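To make the problem statement concrete, here is a minimal sketch of one restricted instance. It is not the paper's algorithm: it is a brute-force breadth-first search, shown only to illustrate what "shortest sequence of UPDATE-like operations" means. The restrictions (conditions are single-attribute equality tests, modifiers assign a constant to one attribute) and every name here (`apply_update`, `shortest_script`, `max_len`) are assumptions made for illustration.

```python
from collections import deque

# A dataset is a tuple of rows; each row is a tuple of attribute values.
# An operation is (cond_attr, cond_val, set_attr, set_val), read as:
#   UPDATE ... SET set_attr = set_val WHERE cond_attr = cond_val
# This operation family is an assumption for illustration only.


def apply_update(rows, op):
    """Apply one UPDATE-like operation and return the new dataset."""
    cond_attr, cond_val, set_attr, set_val = op
    out = []
    for row in rows:
        if row[cond_attr] == cond_val:
            row = row[:set_attr] + (set_val,) + row[set_attr + 1:]
        out.append(row)
    return tuple(out)


def shortest_script(source, target, max_len=3):
    """Exhaustive BFS for a shortest script of at most max_len operations.

    Returns a list of operations, or None if no script is found.
    Exponential in general; only suitable for tiny examples.
    """
    source, target = tuple(source), tuple(target)
    n_attrs = len(source[0])
    # Candidate constants: values seen in either version, per attribute.
    values = [
        sorted({row[a] for row in source + target}, key=repr)
        for a in range(n_attrs)
    ]
    ops = [
        (ca, cv, sa, sv)
        for ca in range(n_attrs)
        for cv in values[ca]
        for sa in range(n_attrs)
        for sv in values[sa]
    ]
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        state, script = queue.popleft()
        if state == target:
            return script
        if len(script) == max_len:
            continue
        for op in ops:
            nxt = apply_update(state, op)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, script + [op]))
    return None


source = (("a", 1), ("a", 2), ("b", 1))
target = (("a", 9), ("a", 9), ("b", 1))
script = shortest_script(source, target)
```

Here a single operation, "set attribute 1 to 9 where attribute 0 equals `"a"`", is enough, whereas a per-row diff would need two changes. Because the search is breadth-first, the first script that reaches the target is a shortest one within this operation family and length bound.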