LLM tabular generators leak memorized numeric strings, allowing a no-box attack to achieve near-perfect membership inference on some state-of-the-art models.
-C., van der Schaar, M.: SynthCity: facilitating innovative use cases of synthetic data in different data modalities
5 Pith papers cite this work.
representative citing papers
GenTS is a modular benchmark library providing unified data pipelines, generative models, and evaluation metrics for time series synthesis, forecasting, and imputation, with open-source code and initial benchmarking experiments.
Adversaries can degrade synthetic data quality via small manipulations such as label flipping or feature-importance interventions, substantially harming downstream model performance and increasing statistical divergence from real data.
The DECAF synthetic data generator best balances privacy and fairness, and fairness pre-processing improves outcomes more on synthetic data than on real data, though at some cost to predictive accuracy.
CTGAN and LLMs generate synthetic student data that passes statistical and predictive utility checks for learning analytics.
citing papers explorer
Quality Degradation Attack in Synthetic Data