A comprehensive survey of synthetic tabular data generation

Shi et al. · 2025 · arXiv 2504.16506

7 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Autoregressive Synthesis of Sparse and Semi-Structured Mixed-Type Data

cs.LG · 2026-03-02 · conditional · novelty 8.0

ORiGAMi synthesizes sparse semi-structured mixed-type JSON data using path-encoded autoregressive tokenization and schema constraints, outperforming flattened tabular baselines on 17 of 18 fidelity, detection, and utility metrics while keeping privacy above 96%.

Mamba-Based Graph Convolutional Networks: Tackling Over-smoothing with Selective State Space

cs.LG · 2025-01-26 · unverdicted · novelty 7.0

MbaGCN combines message aggregation, selective state space transitions, and node state prediction to create a more scalable deep graph convolutional network.

TabKDE: Simple and Scalable Tabular Data Generation with Kernel Density Estimates

cs.LG · 2026-05-17 · conditional · novelty 6.0

TabKDE generates synthetic tabular data using copula transformations followed by kernel density estimation, matching prior accuracy with negligible training time and reduced storage via coresets.

Generative AI-Based Monte Carlo Simulation for Method Evaluation Using Synthetic Multilevel Data

stat.ME · 2026-05-07 · unverdicted · novelty 6.0

A framework using generative AI to produce synthetic multilevel data for Monte Carlo simulations that evaluate the performance and parameter recovery of quantitative methods.

SAGE: Sparse Adaptive Guidance for Dependency-Aware Tabular Data Generation

cs.LG · 2026-04-27 · unverdicted · novelty 6.0

SAGE improves LLM-based synthetic tabular data generation by enforcing sparse, value-adaptive dependency guidance, yielding higher fidelity and 10% better downstream F1 scores than prior methods.

Enhancing Tabular Anomaly Detection via Pseudo-Label-Guided Generation

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

PLAG boosts tabular anomaly detection by using pseudo-label-guided synthetic anomaly generation with a two-stage filter, achieving SOTA results and lifting F1 scores by 0.08-0.21 when added to existing detectors.

Tabular Foundation Model for Generative Modelling

cs.LG · 2026-05-10 · unverdicted · novelty 5.0

TabFORGE generates high-quality synthetic tabular data by leveraging pretrained causality-aware representations in a two-stage diffusion-decoder architecture that mitigates latent distribution shifts.

citing papers explorer


Autoregressive Synthesis of Sparse and Semi-Structured Mixed-Type Data
cs.LG · 2026-03-02 · conditional · none · ref 33
Mamba-Based Graph Convolutional Networks: Tackling Over-smoothing with Selective State Space
cs.LG · 2025-01-26 · unverdicted · none · ref 23
TabKDE: Simple and Scalable Tabular Data Generation with Kernel Density Estimates
cs.LG · 2026-05-17 · conditional · none · ref 153
Generative AI-Based Monte Carlo Simulation for Method Evaluation Using Synthetic Multilevel Data
stat.ME · 2026-05-07 · unverdicted · none · ref 44
SAGE: Sparse Adaptive Guidance for Dependency-Aware Tabular Data Generation
cs.LG · 2026-04-27 · unverdicted · none · ref 2
Enhancing Tabular Anomaly Detection via Pseudo-Label-Guided Generation
cs.AI · 2026-04-20 · unverdicted · none · ref 27
Tabular Foundation Model for Generative Modelling
cs.LG · 2026-05-10 · unverdicted · none · ref 74
