ORiGAMi synthesizes sparse semi-structured mixed-type JSON data using path-encoded autoregressive tokenization and schema constraints, outperforming flattened tabular baselines on 17 of 18 fidelity, detection, and utility metrics while keeping privacy above 96%.
InAdvances in Neural Information Processing Systems, Vol
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.LG 2years
2026 2representative citing papers
CausalSynth combines structural causal models with LLMs and iterative verification to produce synthetic data that respects given causal structures while remaining linguistically natural.
citing papers explorer
-
Autoregressive Synthesis of Sparse and Semi-Structured Mixed-Type Data
ORiGAMi synthesizes sparse semi-structured mixed-type JSON data using path-encoded autoregressive tokenization and schema constraints, outperforming flattened tabular baselines on 17 of 18 fidelity, detection, and utility metrics while keeping privacy above 96%.
-
CasualSynth: Generating Structurally Sound Synthetic Data
CausalSynth combines structural causal models with LLMs and iterative verification to produce synthetic data that respects given causal structures while remaining linguistically natural.