CTSyn: A foundation model for cross tabular data generation
3 Pith papers cite this work. Polarity classification is still indexing.
Verdicts
Unverdicted: 3
citing papers explorer
- Breaking the Quality-Privacy Tradeoff in Tabular Data Generation via In-Context Learning
  DiffICL breaks the quality-privacy tradeoff in small-data tabular synthesis by using in-context learning on pretrained structural priors to generate data that is both higher quality and less prone to memorizing training samples.
- Evaluating Inter-Column Logical Relationships in Synthetic Tabular Data Generation
  Proposes three metrics for inter-column logical relationships in synthetic tabular data and reports that current generators often fail to preserve them on an industrial dataset.
- Repurposing Image Diffusion Models for Adversarial Synthetic Structured Data: A Case Study of Ground Truth Drift
  Off-the-shelf image diffusion models can be repurposed to create synthetic structured data capable of inducing ground truth drift in machine pipelines.