"TabDDPM: Modelling Tabular Data with Diffusion Models". https://arxiv.org/abs/2209.15421
5 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background 1
Citation polarities: background 1
citing papers explorer
- When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation
  LLM tabular generators leak memorized numeric strings, allowing a no-box attack to achieve near-perfect membership inference on some state-of-the-art models.
- COMPASS: A Unified Decision-Intelligence System for Navigating Performance Trade-off in HPC
  COMPASS formalizes HPC configuration questions as ML tasks on traces, quantifies recommendation trustworthiness, and delivers 65.93% lower average job turnaround time plus 80.93% lower node usage versus prior methods in simulator tests.
- Self-Reinforcing Controllable Synthesis of Rare Relational Data via Bayesian Calibration
  RDDG is an in-context learning system with dynamic guidance and automatic quality feedback that synthesizes high-fidelity relational data to improve imbalanced classification.
- On Diffusion Modeling for Anomaly Detection
  Diffusion models via DDPM work for anomaly detection but are slow; the proposed DTE method estimates the diffusion-time distribution, both analytically and with a neural network, to deliver faster inference while outperforming DDPM on ADBench in unsupervised and semi-supervised settings.
- Extending Tabular Denoising Diffusion Probabilistic Models for Time-Series Data Generation
  A temporal extension of TabDDPM generates coherent synthetic time-series sequences on the WISDM dataset that match real distributions and support downstream classification with a macro F1 of 0.64.