A new catalog classifying 35 data error types into missing, incorrect, and redundant categories for tabular data, with definitions and examples to improve data quality management.
Maupong, Dimane Mpoeleng, Thabo Semong, Banyatsang Mphago, and Oteng Tabona
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary: unverdicted 2
representative citing papers
PuckTrick library adds controlled imperfections to synthetic data and shows that models trained on the resulting contaminated data outperform those trained on clean synthetic data in financial dataset experiments.
citing papers explorer
- A Catalog of Data Errors
- PuckTrick: A Library for Making Synthetic Data More Realistic