arxiv: 1801.06258 · v1 · pith:TEUGPGAK · submitted 2018-01-19 · cs.DB

Towards a Theory of Data-Diff: Optimal Synthesis of Succinct Data Modification Scripts

classification: cs.DB
keywords: dataset, operations, data-diff, problem, subsequent, tuples, version, addresses

abstract

This paper addresses the Data-Diff problem: given a dataset and a subsequent version of the dataset, find the shortest sequence of operations that transforms the dataset to the subsequent version, under a restricted family of operations. We consider operations similar to SQL UPDATE, each with a condition (WHERE) that matches a subset of tuples and a modifier (SET) that makes changes to those matched tuples. We characterize the problem based on different constraints on the attributes and the allowed conditions and modifiers, providing complexity classification and algorithms in each case.
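To make the problem statement concrete, here is a minimal Python sketch of one very restricted instance. It is not the paper's algorithm: it is an exhaustive search, and it assumes rows are aligned by position between the two versions, that every condition has the form `attribute = constant`, and that every modifier assigns a constant to a single attribute. The names `apply_op`, `apply_script`, and `shortest_script` are hypothetical and chosen only for illustration.

```python
from itertools import product

# A row is a dict from attribute name to value; a dataset is a list of rows.
# An operation is a 4-tuple (where_attr, where_val, set_attr, set_val), read as
#   UPDATE t SET set_attr = set_val WHERE where_attr = where_val
# (assumed restricted form; the paper studies several families of operations).

def apply_op(rows, op):
    """Return a new dataset with one UPDATE-like operation applied."""
    where_attr, where_val, set_attr, set_val = op
    return [
        {**row, set_attr: set_val} if row[where_attr] == where_val else dict(row)
        for row in rows
    ]

def apply_script(rows, script):
    """Apply a sequence of operations in order."""
    for op in script:
        rows = apply_op(rows, op)
    return rows

def shortest_script(source, target, max_len=3):
    """Brute-force search for a shortest script (exponential; toy inputs only)."""
    attrs = list(source[0])
    # Candidate constants: every value seen in either version, per attribute.
    values = {a: sorted({row[a] for row in source + target}) for a in attrs}
    candidates = [
        (wa, wv, sa, sv)
        for wa in attrs for wv in values[wa]
        for sa in attrs for sv in values[sa]
    ]
    # Try scripts in order of increasing length, so the first hit is shortest.
    for length in range(max_len + 1):
        for script in product(candidates, repeat=length):
            if apply_script(source, script) == target:
                return list(script)
    return None  # no script of length <= max_len exists in this family

source = [
    {"dept": "A", "bonus": 0},
    {"dept": "A", "bonus": 0},
    {"dept": "B", "bonus": 0},
]
target = [
    {"dept": "A", "bonus": 10},
    {"dept": "A", "bonus": 10},
    {"dept": "B", "bonus": 0},
]

script = shortest_script(source, target)
print(script)  # [('dept', 'A', 'bonus', 10)]
```

In this toy case a single operation (`SET bonus = 10 WHERE dept = 'A'`) explains two changed tuples, which is the sense in which a script can be more succinct than a tuple-by-tuple diff. The search space grows exponentially with script length, so the sketch only illustrates the problem definition, not a practical method.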
