
arxiv: 2605.05652 · v2 · pith:EBRDFLDQnew · submitted 2026-05-07 · 💻 cs.LG

Information-Preserving Domain Transfer with Unlabeled Data in Misspecified Simulation-Based Inference

Pith reviewed 2026-05-19 17:12 UTC · model grok-4.3

classification 💻 cs.LG
keywords simulation-based inference · domain adaptation · model misspecification · unlabeled data · information preservation · posterior inference · machine learning

The pith

SPIN improves posterior inference in misspecified simulation-based inference by using information-preserving domain transfer with unlabeled real-world data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes SPIN to address the problem of model misspecification in simulation-based inference where the simulator does not perfectly match real-world observations. It does this by learning a domain transfer that moves labeled simulator data to the real domain and back, using the labels to keep the mutual information with the parameters intact. At test time, real observations are mapped to the simulator domain using the learned map for inference. A sympathetic reader cares because this allows better use of simulators in real applications without needing labeled real data or perfect simulator match, and the benefits grow with greater misspecification.

Core claim

The central discovery is that by performing bidirectional domain transfer on labeled simulator observations and encouraging preservation of parameter-relevant mutual information through the use of original labels, the resulting real-to-simulator transport map enables accurate SBI posteriors from unlabeled real-world data even under misspecification.

What carries the argument

The SPIN framework's cycle-based domain transfer mechanism that uses simulator labels during training to preserve parameter-relevant mutual information in the learned transport maps.
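The role of the labels can be illustrated with a variational mutual-information bound. The sketch below assumes a Barber-Agakov style bound, I(θ; x) ≥ H(θ) + E[log q(θ | x)], with a linear-Gaussian q on a toy Gaussian problem; the paper's own Eq. (2) is not reproduced here, and the function name, toy simulator, and transport maps are hypothetical.

```python
import numpy as np

def ba_lower_bound(theta, x):
    """Barber-Agakov bound H(theta) + E[log q(theta | x)], linear-Gaussian q."""
    slope, intercept = np.polyfit(x, theta, deg=1)   # least-squares fit of q's mean
    resid_var = np.mean((theta - (slope * x + intercept)) ** 2)
    # Average Gaussian log-density when the variance equals the residual variance.
    avg_log_q = -0.5 * np.log(2 * np.pi * resid_var) - 0.5
    # Entropy of a Gaussian with the empirical variance of theta (toy assumption).
    entropy = 0.5 * np.log(2 * np.pi * np.e * np.var(theta))
    return float(entropy + avg_log_q)

rng = np.random.default_rng(0)
theta = rng.normal(size=20_000)                      # simulator labels
x_s = theta + 0.1 * rng.normal(size=theta.size)      # toy simulator observations

# Round trip that inverts itself: sim -> real (gain 2, offset 1) -> sim.
x_sr = 2.0 * x_s + 1.0
x_srs_kept = (x_sr - 1.0) / 2.0

# Round trip that keeps the marginal of x_s but breaks the pairing with theta.
x_srs_lost = rng.permutation(x_s)

bound_kept = ba_lower_bound(theta, x_srs_kept)
bound_lost = ba_lower_bound(theta, x_srs_lost)
print(f"bound (information kept): {bound_kept:.3f} nats")
print(f"bound (information lost): {bound_lost:.3f} nats")
```

Both round trips return samples with the same marginal distribution as the simulator observations, so a purely marginal criterion cannot tell them apart; only the label-aware bound separates them.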

If this is right

  • Posterior inference can be performed on real-world data by first mapping it to the simulator domain using the trained transport.
  • Accuracy gains are larger when the degree of simulator misspecification is higher.
  • The method requires only unpaired unlabeled real data and labeled simulator data.
  • It applies to both synthetic and physical real-world tasks.
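The test-time step in the first bullet can be illustrated on a toy problem. The sketch below is not SPIN: it swaps the learned real-to-simulator transport for plain affine moment matching, and the simulator, the gain/offset misspecification, and all function names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.1

def simulator(theta):
    # Toy simulator: x | theta ~ N(theta, sigma^2).
    return theta + sigma * rng.normal(size=theta.shape)

def real_world(theta):
    # Hypothetical misspecification: an unmodeled gain and offset.
    return 2.0 * theta + 1.0 + sigma * rng.normal(size=theta.shape)

def simulator_posterior_mean(x):
    # Exact posterior mean for theta ~ N(0, 1) under the toy simulator.
    return x / (1.0 + sigma**2)

x_s = simulator(rng.normal(size=5000))   # simulator pool
theta_r = rng.normal(size=5000)          # hidden real parameters, used only for scoring
x_r = real_world(theta_r)                # unlabeled, unpaired real observations

def real_to_sim(x):
    # Stand-in transport: match the mean and standard deviation of the two pools.
    return (x - x_r.mean()) / x_r.std() * x_s.std() + x_s.mean()

def rmse(estimate):
    return float(np.sqrt(np.mean((estimate - theta_r) ** 2)))

rmse_naive = rmse(simulator_posterior_mean(x_r))                # inference on raw real data
rmse_mapped = rmse(simulator_posterior_mean(real_to_sim(x_r)))  # map first, then infer
print(f"RMSE without transport: {rmse_naive:.3f}")
print(f"RMSE with transport:    {rmse_mapped:.3f}")
```

In this toy the shift is monotone, so matching marginals happens to keep the parameter information. The abstract's point is that this is not guaranteed in general, which is the gap the label-guided objective is meant to close.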

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar information-preserving transfers could improve domain adaptation in other scientific inference problems.
  • Testing SPIN on a wider range of misspecification types might reveal when the preservation holds best.
  • Integrating this with likelihood-free methods could further enhance robustness.

Load-bearing premise

The transport map learned from cycling simulator to real and back must retain sufficient mutual information between the observations and the parameters so that inference on the mapped real data remains reliable.

What would settle it

Run SPIN on benchmarks where misspecification is systematically increased, and check whether its advantage in posterior error over baselines without the information-preservation step shrinks or reverses.

Figures

Figures reproduced from arXiv: 2605.05652 by Eunho Jeong, Hyeonjin Kim, Joon Jang, Kyu Sung Choi.

Figure 1
Overview of SPIN. During training, a labeled simulator observation $(\theta, x_s)$ is translated toward the real-world domain as $x_{sr}$ and returned to the simulator domain as $x_{srs}$. Since $x_{srs}$ retains the simulator-origin label, SPIN uses $\theta$ to maximize the MI lower bound in Eq. (2), encouraging parameter-relevant information to be preserved after transport. At test time, an unlabeled real-world observation $x_r$ is mapp…
Figure 2
Posterior comparison across methods on matched Pendulum samples under increasing misspecification. Each column shows a matched simulator/real-world example sharing the same parameter, with the real-world observation generated under a different damping strength $\alpha$. In the stronger damping settings, SPIN shows posterior mass closer to the reference parameter. These examples provide qualitative posterior comp…
Figure 3
Sensitivity to misspecification strength on Pendulum. Box plots summarize performance over five independent runs at three damping levels $\alpha \in \{0.05, 0.25, 0.5\}$. Under weak shift, methods are relatively close, while SPIN shows clearer gains as damping increases. SPIN (w/o $L_{\mathrm{info}}$) removes simulator-labeled information-preservation supervision, highlighting the contribution of $L_{\mathrm{info}}$ under stronger misspecificat…
Figure 4
Effect of information-preservation supervision during training. We plot the mean over five runs together with run-to-run variability for $L_{\mathrm{info}}$ in Eq. (3), comparing the transport-only variant ($\lambda_{\mathrm{info}} = 0$) with the full method ($\lambda_{\mathrm{info}} = 1$). The difference is smallest on SIR and becomes larger on Pendulum, Wind Tunnel, and Light Tunnel.
Figure 5
Performance across unlabeled real-world data budgets. We vary $N_r$, the number of unlabeled real-world observations used during adaptation, while keeping $N_s$ fixed, and report performance across tasks over five independent runs. The effect of the real-world data budget depends on the task and metric. This pattern is consistent with the role of unlabeled real-world observations in SPIN. The real-world pool pro…
Figure 6
Figure 6 evaluates $x_{srs}$ using RMSE, LPP, and ACAUC. Since $x_{srs}$ is generated from labeled simulator samples, this analysis checks whether $L_{\mathrm{info}}$ increases the posterior density assigned to the original simulator parameter on the transported observations it directly supervises. [Plot: RMSE per task (SIR, Pendulum, Wind Tunnel, Light Tunnel), without info vs. with info ($\lambda_{\mathrm{info}} = 1.0$) …]
Figure 7
Figure 7 reports the simulator-side NPE risk $L_s$ for $\lambda_{\mathrm{info}} = 0$ and $\lambda_{\mathrm{info}} = 1$. [Panels: SIR, Pendulum, Wind Tunnel, Light Tunnel; x-axis: epoch; y-axis: $L_s$; curves: without info, with info ($\lambda_{\mathrm{info}} = 1$).]
Figure 8
Figure 8 reports the calibration curves corresponding to the ACAUC values used in the main results [14, 18, 48, 49]. [Panels per task; x-axis: expected coverage; y-axis: observed coverage; curves: ideal, NPE, NPE-MMD, NPE-DANN, SPIN (w/o info), SPIN. SIR ACAUC: NPE -0.0322, NPE-MMD -0.0327, NPE-DANN -0.0477, SPIN (w/o info) -0.0111, SPIN -0.0252. Pendulum ACAUC: NPE 0.2646, NPE…]
Figure 9
Figure 9 varies the strength of information-preservation supervision while keeping the rest of the training setup fixed. [Panels: RMSE, LPP, ACAUC per task (SIR, Pendulum, Wind Tunnel, Light Tunnel); settings: without info, with info ($\lambda_{\mathrm{info}} = 0.1$, $0.5$, $1$).]
Figure 10
Figure 10 visualizes the learned observation-space transport and is included only as qualitative support. [Panels: SIR, Pendulum, Wind Tunnel, Light Tunnel; x-axis: time step; y-axis: signal value; curve sets: ($x_s$, $x_{sr}$, $x_{srs}$) and ($x_s$, $x_r$, $x_{rs}$) …]

Abstract

Simulation-based inference (SBI) provides amortized Bayesian parameter inference from simulator-generated data without requiring explicit likelihood evaluation. Its reliability can degrade under model misspecification, where real-world observations are not well represented by the simulator used for training. Existing methods using unlabeled real-world data often align simulated and real-world data distributions, but marginal alignment alone does not directly preserve parameter-relevant information needed for posterior inference. We propose SPIN, an SBI framework with parameter-relevant information-preserving domain transfer using unlabeled, unpaired real-world observations. During training, SPIN translates labeled simulator observations toward the real-world domain and back to the simulator domain, using the original simulator labels to encourage domain transfer that preserves parameter-relevant mutual information. At test time, the learned real-to-simulator transport maps real-world observations into the simulator domain for posterior inference, without requiring real-world parameter labels or paired real–simulator observations. Across controlled synthetic and physical real-world benchmarks, SPIN improves real-world posterior inference, with the improvement becoming clearer as misspecification increases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces SPIN, a framework for simulation-based inference (SBI) under misspecification. It learns a real-to-simulator transport map from unpaired unlabeled real observations by training on cycle-consistent translations of labeled simulator observations (sim-to-real-to-sim) that use the original simulator labels to encourage preservation of parameter-relevant mutual information. At test time the learned map sends real observations into the simulator domain for standard amortized posterior inference. Experiments on controlled synthetic and physical real-world benchmarks report improved posterior accuracy, with larger gains as misspecification increases.

Significance. If the transport map reliably preserves the mutual information between observations and parameters on real data, SPIN would offer a practical route to improve SBI posteriors when simulators are misspecified by exploiting readily available unpaired real observations. The cycle-consistent, label-guided construction is a concrete algorithmic contribution that directly targets the information-preservation gap left by marginal-alignment baselines.

major comments (2)
  1. The central claim that the learned real-to-simulator map preserves parameter-relevant mutual information rests on cycle consistency enforced only on simulator data. No direct constraint or diagnostic is supplied for how the map behaves on real observations whose support lies outside the simulator manifold under increasing misspecification. A concrete verification (e.g., conditional mutual-information estimate or posterior calibration on held-out simulator data after transport) is required to substantiate that the downstream SBI posterior remains accurate.
  2. The experimental results attribute gains to the information-preserving mechanism, yet the reported benchmarks do not include an ablation that isolates the contribution of the label-guided cycle loss versus a simpler marginal-alignment baseline. Without this comparison it is unclear whether the observed improvements are specifically due to mutual-information preservation or to generic domain alignment.
minor comments (2)
  1. The abstract refers to 'physical real-world benchmarks' without naming the specific tasks or simulators; a short parenthetical description would improve readability.
  2. Notation for the forward and backward transport maps and the two domains is introduced gradually; an early diagram or consolidated definition table would reduce reader effort.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will incorporate revisions to strengthen the presentation of SPIN's information-preservation properties and experimental validation.

  1. Referee: The central claim that the learned real-to-simulator map preserves parameter-relevant mutual information rests on cycle consistency enforced only on simulator data. No direct constraint or diagnostic is supplied for how the map behaves on real observations whose support lies outside the simulator manifold under increasing misspecification. A concrete verification (e.g., conditional mutual-information estimate or posterior calibration on held-out simulator data after transport) is required to substantiate that the downstream SBI posterior remains accurate.

    Authors: We agree that the primary training signal uses simulator data and that direct verification on real observations is inherently limited by the absence of parameter labels. The cycle-consistent, label-guided objective is intended to encourage preservation of parameter-relevant information through the composition of maps, with the assumption that this generalizes to real data. To address the request for concrete verification, we will add in revision: (i) posterior calibration and accuracy metrics on held-out simulator observations after round-trip transport under controlled misspecification, and (ii) an analysis of transport behavior on synthetic out-of-support observations generated by increasing simulator misspecification. We will also explicitly discuss the practical difficulty of direct conditional mutual-information estimation on unlabeled real data and present the downstream posterior improvements as supporting (if indirect) evidence. revision: yes

  2. Referee: The experimental results attribute gains to the information-preserving mechanism, yet the reported benchmarks do not include an ablation that isolates the contribution of the label-guided cycle loss versus a simpler marginal-alignment baseline. Without this comparison it is unclear whether the observed improvements are specifically due to mutual-information preservation or to generic domain alignment.

    Authors: The manuscript already compares SPIN against marginal-alignment baselines and reports larger gains under increasing misspecification, which we interpret as evidence for the value of label-guided information preservation. Nevertheless, we acknowledge that an explicit ablation isolating the label-guided cycle loss would make this attribution clearer. We will add such an ablation in the revision, including a variant that uses cycle consistency without parameter labels and a direct comparison against a pure marginal-alignment objective, to quantify the incremental benefit of the information-preserving component. revision: yes
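The posterior-calibration check on held-out simulator data that the first major comment asks for can be sketched as an expected-versus-observed coverage curve. This is a minimal illustration on a toy Gaussian model with an analytic posterior, not the paper's evaluation code. The signed-area summary is an assumed stand-in for ACAUC, whose exact definition and sign convention are not given in this page, and all names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def coverage_curve(theta_true, post_mean, post_std, levels):
    """Observed coverage of central credible intervals of Gaussian posteriors."""
    std_normal = NormalDist()
    z = (theta_true - post_mean) / post_std
    u = np.array([std_normal.cdf(v) for v in z])   # posterior CDF at the truth
    central = np.abs(2.0 * u - 1.0)   # smallest central level that contains theta
    return np.array([np.mean(central <= c) for c in levels])

def signed_coverage_area(levels, observed):
    """Trapezoid-rule area between the observed curve and the diagonal."""
    gap = observed - levels
    return float(np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(levels)))

rng = np.random.default_rng(0)
sigma = 0.1
theta = rng.normal(size=5000)                      # held-out simulator labels
x = theta + sigma * rng.normal(size=theta.size)    # toy simulator observations

post_mean = x / (1.0 + sigma**2)                   # exact posterior for this toy
post_std = np.sqrt(sigma**2 / (1.0 + sigma**2))

levels = np.linspace(0.0, 1.0, 51)
area_exact = signed_coverage_area(
    levels, coverage_curve(theta, post_mean, post_std, levels))
area_narrow = signed_coverage_area(
    levels, coverage_curve(theta, post_mean, 0.5 * post_std, levels))
print(f"exact posterior:        {area_exact:+.4f}")
print(f"overconfident (std/2):  {area_narrow:+.4f}")
```

Under this convention a value near zero indicates calibrated intervals and a negative value indicates under-coverage. To apply the check after round-trip transport, the same curve would be computed with the posterior evaluated on the transported observations instead of `x`.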

Circularity Check

0 steps flagged

No significant circularity in SPIN algorithmic framework


The paper introduces SPIN as a new algorithmic procedure for domain transfer in misspecified SBI: it trains real-to-simulator and simulator-to-real maps via cycle consistency on unpaired data while using simulator labels to encourage preservation of parameter-relevant mutual information. This construction is not self-definitional or tautological; the mutual-information objective is an explicit training loss, not a renaming of the downstream posterior target. No equations reduce the claimed improvement to a fitted quantity defined on the same data by construction. The central claims rest on empirical evaluation across synthetic and physical benchmarks rather than a closed-form derivation or load-bearing self-citation chain. The method is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The framework rests on the existence of a transport map that can be learned from unpaired data while preserving parameter-relevant information; no explicit free parameters, axioms, or invented entities are named in the abstract.

pith-pipeline@v0.9.0 · 5718 in / 1084 out tokens · 31839 ms · 2026-05-19T17:12:14.128838+00:00 · methodology



Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages

  1. [1]

    The frontier of simulation-based inference

    Kyle Cranmer, Johann Brehmer, and Gilles Louppe. The frontier of simulation-based inference. Proceedings of the National Academy of Sciences, 117(48):30055–30062, 2020. doi: 10.1073/pnas.1912789117. URL https://doi.org/10.1073/pnas.1912789117

  2. [2]

    Michael Deistler, Jan Boelts, Peter Steinbach, Guy Moss, Thomas Moreau, Manuel Gloeckler, Pedro L. C. Rodrigues, Julia Linhart, Janne K. Lappalainen, Benjamin Kurt Miller, Pedro J. Gonçalves, Jan-Matthis Lueckmann, Cornelius Schröder, and Jakob H. Macke. Simulation-based inference: A practical guide, 2025. URL https://arxiv.org/abs/2508.12939

  3. [3]

    Fast ϵ-free inference of simulation models with Bayesian conditional density estimation

    George Papamakarios and Iain Murray. Fast ϵ-free inference of simulation models with Bayesian conditional density estimation. In Advances in Neural Information Processing Systems, volume 29, pages 1028–1036, 2016. URL https://proceedings.neurips.cc/paper/2016/hash/6aca97005c68f1206823815f66102863-Abstract.html

  4. [4]

    Automatic posterior transformation for likelihood-free inference

    David S. Greenberg, Marcel Nonnenmacher, and Jakob H. Macke. Automatic posterior transformation for likelihood-free inference. In Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 2404–2414, 2019. URL https://proceedings.mlr.press/v97/greenberg19a.html

  6. [6]

    George E. P. Box. Science and statistics. Journal of the American Statistical Association, 71(356):791–799, 1976. doi: 10.1080/01621459.1976.10480949. URL https://doi.org/10.1080/01621459.1976.10480949

  7. [7]

    Model misspecification in approximate bayesian computation: Consequences and diagnostics

    David T. Frazier, Christian P. Robert, and Judith Rousseau. Model misspecification in approximate bayesian computation: Consequences and diagnostics. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 82(2):421–444, 2020. doi: 10.1111/rssb.12356. URL https://doi.org/10.1111/rssb.12356

  8. [8]

    Patrick Cannon, Daniel Ward, and Sebastian M. Schmon. Investigating the impact of model misspecification in neural simulation-based inference, 2022. URL https://arxiv.org/abs/2209.01845

  9. [9]

    B. J. K. Kleijn and A. W. van der Vaart. The bernstein-von mises theorem under misspecification. Electronic Journal of Statistics, 6:354–381, 2012. doi: 10.1214/12-EJS675. URL https://doi.org/10.1214/12-EJS675

  10. [10]

    Posterior predictive assessment of model fitness via realized discrepancies

    Andrew Gelman, Xiao-Li Meng, and Hal Stern. Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6(4):733–807, 1996. URL https://www3.stat.sinica.edu.tw/statistica/j6n4/j6n41/j6n41.htm

  11. [11]

    Daniel Ward, Patrick Cannon, Mark Beaumont, Matteo Fasiolo, and Sebastian M. Schmon. Robust neural posterior estimation and statistical model criticism. In Advances in Neural Information Processing Systems, volume 35, pages 33845–33859, 2022. URL https://neurips.cc/virtual/2022/poster/52936

  12. [12]

    Detecting model misspecification in amortized bayesian inference with neural networks

    Marvin Schmitt, Paul-Christian Bürkner, Ullrich Köthe, and Stefan T Radev. Detecting model misspecification in amortized bayesian inference with neural networks. In DAGM German Conference on Pattern Recognition, pages 541–557. Springer, 2023. URL https://arxiv.org/pdf/2112.08866

  13. [13]

    Learning robust statistics for simulation-based inference under model misspecification

    Daolang Huang, Ayush Bharti, Amauri H. Souza, Luigi Acerbi, and Samuel Kaski. Learning robust statistics for simulation-based inference under model misspecification. In Advances in Neural Information Processing Systems, volume 36, 2023. URL https://proceedings.neurips.cc/paper_files/paper/2023/hash/16c5b4102a6b6eb061e502ce6736ad8a-Abstract-Conference.html

  15. [15]

    Robust amortized bayesian inference with self-consistency losses on unlabeled data

    Aayush Mishra, Daniel Habermann, Marvin Schmitt, Stefan T. Radev, and Paul-Christian Bürkner. Robust amortized bayesian inference with self-consistency losses on unlabeled data. In International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=E1dANKwo4I

  16. [16]

    Addressing misspecification in simulation-based inference through data-driven calibration

    Antoine Wehenkel, Juan L. Gamella, Ozan Sener, Jens Behrmann, Guillermo Sapiro, Jörn-Henrik Jacobsen, and Marco Cuturi. Addressing misspecification in simulation-based inference through data-driven calibration. In Proceedings of the 42nd International Conference on Machine Learning, volume 267 of Proceedings of Machine Learning Research, 2025. URL https://...

  17. [17]

    Optimal transport for domain adaptation

    Nicolas Courty, Rémi Flamary, Devis Tuia, and Alain Rakotomamonjy. Optimal transport for domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(9):1853–1865, 2017. doi: 10.1109/TPAMI.2016.2615921. URL https://doi.org/10.1109/TPAMI.2016.2615921

  18. [18]

    Optimal Transport: Old and New

    Cédric Villani. Optimal Transport: Old and New, volume 338 of Grundlehren der mathematischen Wissenschaften. Springer, Berlin, Heidelberg, 2009. doi: 10.1007/978-3-540-71050-9. URL https://doi.org/10.1007/978-3-540-71050-9

  19. [19]

    Computational optimal transport: With applications to data science

    Gabriel Peyré and Marco Cuturi. Computational optimal transport: With applications to data science. Now Foundations and Trends, 2019. URL https://arxiv.org/abs/1803.00567

  20. [20]

    Inductive domain transfer in misspecified simulation-based inference

    Ortal Senouf, Antoine Wehenkel, Cédric Vincent-Cuaz, Emmanuel Abbé, and Pascal Frossard. Inductive domain transfer in misspecified simulation-based inference. In Advances in Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=PhnquAa8eV

  21. [21]

    Marvin Schmitt, Desi Ivanova, Daniel Habermann, Paul-Christian Bürkner, Ullrich Köthe, and Stefan T. Radev. Leveraging self-consistency for data-efficient amortized bayesian inference. In Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine Learning Research, pages 43723–43741. PMLR, 2024. URL https://...

  22. [22]

    Data-efficient variational mutual information estimation via bayesian self-consistency

    Desi R. Ivanova, Marvin Schmitt, and Stefan T. Radev. Data-efficient variational mutual information estimation via bayesian self-consistency. In NeurIPS 2024 Workshop on Bayesian Decision-making and Uncertainty, 2024. URL https://openreview.net/forum?id=QfiyElaO1f

  23. [23]

    Lasse Elsemüller, Valentin Pratz, Mischa von Krause, Andreas Voss, Paul-Christian Bürkner, and Stefan T. Radev. Does unsupervised domain adaptation improve the robustness of amortized bayesian inference? a systematic evaluation. Transactions on Machine Learning Research. URL https://openreview.net/forum?id=ewgLuvnEw6

  25. [25]

    A kernel two-sample test

    Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, and Alexander Smola. A kernel two-sample test. Journal of Machine Learning Research, 13(25):723–773, 2012. URL https://jmlr.org/papers/v13/gretton12a.html

  27. [27]

    Domain-adversarial training of neural networks

    Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(59):1–35, 2016. URL https://jmlr.org/papers/v17/15-239.html

  28. [28]

    Support and invertibility in domain-invariant representations

    Fredrik D. Johansson, David Sontag, and Rajesh Ranganath. Support and invertibility in domain-invariant representations. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, volume 89 of Proceedings of Machine Learning Research, pages 527–536, 2019. URL https://proceedings.mlr.press/v89/johansson19a.html

  29. [29]

    Han Zhao, Remi Tachet des Combes, Kun Zhang, and Geoffrey J. Gordon. On learning invariant representations for domain adaptation. In Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 7523–7532, 2019. URL https://proceedings.mlr.press/v97/zhao19a.html

  31. [31]

    Domain adaptation with invariant representation learning: What transformations to learn?

    Petar Stojanov, Zijian Li, Mingming Gong, Ruichu Cai, Jaime Carbonell, and Kun Zhang. Domain adaptation with invariant representation learning: What transformations to learn? In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, pages 24791–24803. Curran Assoc...

  32. [32]

    CyCADA: Cycle-consistent adversarial domain adaptation

    Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alexei Efros, and Trevor Darrell. CyCADA: Cycle-consistent adversarial domain adaptation. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 1989–1998. PM...

  33. [33]

    BayesFlow: Learning complex stochastic models with invertible neural networks

    Stefan T. Radev, Ulf K. Mertens, Andreas Voss, Lynton Ardizzone, and Ullrich Köthe. BayesFlow: Learning complex stochastic models with invertible neural networks. IEEE Transactions on Neural Networks and Learning Systems, 33(4):1452–1466, 2022. doi: 10.1109/TNNLS.2020.3042395. URL https://doi.org/10.1109/TNNLS.2020.3042395

  34. [34]

    Neural approximate sufficient statistics for implicit models

    Yanzhi Chen, Dinghuai Zhang, Michael U. Gutmann, Aaron Courville, and Zhanxing Zhu. Neural approximate sufficient statistics for implicit models. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=SRDuJssQud

  35. [35]

    Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017. doi: 10.1109/ICCV.2017.244. URL https://doi.org/10.1109/ICCV.2017.244

  36. [36]

    Claude E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27(3–4):379–423, 623–656, 1948. doi: 10.1002/j.1538-7305.1948.tb01338.x. URL https://doi.org/10.1002/j.1538-7305.1948.tb01338.x

  37. [37]

    Elements of Information Theory

    Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley-Interscience, Hoboken, NJ, 2nd edition, 2006. doi: 10.1002/047174882X. URL https://doi.org/10.1002/047174882X

  38. [38]

    David Barber and Felix V. Agakov. The IM algorithm: A variational approach to information maximization. In Advances in Neural Information Processing Systems, volume 16, 2003. URL https://aivalley.com/Papers/MI_NIPS_final.pdf

  39. [39]

    On variational bounds of mutual information

    Ben Poole, Sherjil Ozair, Aaron van den Oord, Alexander A. Alemi, and George Tucker. On variational bounds of mutual information. In Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 5171–5180, 2019. URL https://proceedings.mlr.press/v97/poole19a.html

  40. [40]

    Spectral normalization for generative adversarial networks

    Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. Spectral normalization for generative adversarial networks. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=B1QRgziT-

  42. [42]

    A contribution to the mathematical theory of epidemics

    William Ogilvy Kermack and Anderson G. McKendrick. A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 115(772):700–721, 1927. doi: 10.1098/rspa.1927.0118. URL https://doi.org/10.1098/rspa.1927.0118

  43. [43]

    Causal chambers as a real-world physical testbed for AI methodology

    Juan L. Gamella, Jonas Peters, and Peter Bühlmann. Causal chambers as a real-world physical testbed for AI methodology. Nature Machine Intelligence, 7:107–118, 2025. doi: 10.1038/s42256-024-00964-x. URL https://doi.org/10.1038/s42256-024-00964-x

  44. [44]

    Benchmarking simulation-based inference

    Jan-Matthis Lueckmann, Jan Boelts, David S. Greenberg, Pedro J. Gonçalves, and Jakob H. Macke. Benchmarking simulation-based inference. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, volume 130 of Proceedings of Machine Learning Research, pages 343–351, 2021. URL https://proceedings.mlr.press/v130/lueckmann21a.html

  45. [45]

    A crisis in simulation-based inference? beware, your posterior approximations can be unfaithful

    Joeri Hermans, Arnaud Delaunoy, François Rozet, Antoine Wehenkel, Volodimir Begy, and Gilles Louppe. A crisis in simulation-based inference? beware, your posterior approximations can be unfaithful. Transactions on Machine Learning Research, 2022. URL https://openreview.net/forum?id=LHAbHkt6Aq

  46. [46]

    Solomon Kullback and Richard A. Leibler. On information and sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, 1951. doi: 10.1214/aoms/1177729694. URL https://doi.org/10.1214/aoms/1177729694

  47. [47]

    On Wasserstein two-sample testing and related families of nonparametric tests

    Aaditya Ramdas, Nicolás García Trillos, and Marco Cuturi. On wasserstein two-sample testing and related families of nonparametric tests. Entropy, 19(2):47, 2017. doi: 10.3390/e19020047. URL https://doi.org/10.3390/e19020047

  48. [48]

    Field Guide to Polarization

    Edward Collett. Field Guide to Polarization. SPIE Press, Bellingham, WA, 2005. doi: 10.1117/3.626141. URL https://doi.org/10.1117/3.626141

  49. [49]

    Wasserstein auto-encoders

    Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, and Bernhard Schölkopf. Wasserstein auto-encoders. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=HkL7n1-0b

  50. [50]

    Masked autoregressive flow for density estimation

    George Papamakarios, Theo Pavlakou, and Iain Murray. Masked autoregressive flow for density estimation. In Advances in Neural Information Processing Systems, volume 30, pages 2335–2344, 2017. URL https://proceedings.neurips.cc/paper/2017/hash/6c1da886822c67822bcf3679d04369fa-Abstract.html

  51. [51]

    probabilists/zuko: Zuko 1.1.0, January 2024

    François Rozet, Félix Divo, and Simon Schnake. probabilists/zuko: Zuko 1.1.0, January 2024. URL https://doi.org/10.5281/zenodo.10571785. Software package

  52. [52]

    Decoupled weight decay regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=Bkg6RiCqY7

  53. [53]

    Adam: A method for stochastic optimization

    Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In International Conference on Learning Representations, 2015. URL https://arxiv.org/abs/1412.6980

  54. [54]

    Validation of software for bayesian models using posterior quantiles

    Samantha R. Cook, Andrew Gelman, and Donald B. Rubin. Validation of software for bayesian models using posterior quantiles. Journal of Computational and Graphical Statistics, 15(3):675–692, 2006. doi: 10.1198/106186006X136976. URL https://doi.org/10.1198/106186006X136976

  55. [55]

    Validating bayesian inference algorithms with simulation-based calibration

    Sean Talts, Michael Betancourt, Daniel Simpson, Aki Vehtari, and Andrew Gelman. Validating bayesian inference algorithms with simulation-based calibration, 2018. URL https://arxiv.org/abs/1804.06788.

    A Mathematical details for parameter-relevant information preservation

    This appendix collects the population identities used to interpret the information...

  56. [56]

    with scales {0.1, 0.2, 0.5, 1, 2, 5, 10}. NPE-DANN [21] also uses the same summary network and posterior flow, and adds a domain classifier with 3 hidden layers of width 256. Domain confusion is applied through a gradient reversal layer with the standard schedule [23] $\lambda_{\mathrm{grl}}(p) = \frac{2}{1 + \exp(-10p)} - 1$, where $p$ is the normalized training progress. The domain classifie...
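The gradient reversal schedule quoted in the appendix excerpt above, 2 / (1 + exp(-10p)) - 1, can be written as a short function. This is a minimal sketch; the function name `grl_weight` is hypothetical, and the only input is the normalized training progress p in [0, 1].

```python
import math

def grl_weight(p: float) -> float:
    """Gradient reversal weight 2 / (1 + exp(-10 p)) - 1 for progress p in [0, 1]."""
    return 2.0 / (1.0 + math.exp(-10.0 * p)) - 1.0

# The weight starts at zero and rises smoothly toward one as training progresses.
for step, total in [(0, 100), (10, 100), (50, 100), (100, 100)]:
    p = step / total
    print(f"p = {p:.2f} -> lambda_grl = {grl_weight(p):.4f}")
```

Starting at zero keeps the domain classifier's reversed gradient from disturbing the feature extractor early in training, when the classifier is still uninformative.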