Validating Bayesian inference algorithms with simulation-based calibration
27 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
Simulation-based inference on Big Sobol Sequence halos at z=0.5 shows CMD+MFs improves σ8 and Ωm precision by ~27% over MFs alone and outperforms PS by ~45% in mass-selected samples at matched scales.
Score-based diffusion models learn the empirical distribution of real LIGO noise to enable unbiased gravitational-wave parameter estimation under only an additivity assumption.
New smooth invertible parameterization of anisotropic GF correlation length and diffusion matrix, with PC priors that penalize finite range and nonzero anisotropy for constant-parameter models.
No evidence for deviations from general relativity is found in LIGO-Virgo binary black hole events, with improved constraints on waveform parameters, graviton mass, and ringdown properties.
BILBY is validated on simulated compact binary signals and reproduces the eleven GWTC-1 results with configuration and output files provided for reproduction.
citing papers explorer
-
Calibration, Not Compilation: Detecting and Repairing Misspecified Probabilistic Programs Written by Language Models
Bayesian workflow diagnostics outperform unit tests for detecting and repairing statistically misspecified LLM-generated probabilistic programs across benchmarks and real generation tasks.
-
Dark Matter in Draco and Boötes I: Hints of a Core in an Ultra-Faint Dwarf from Simulation-Based Inference
GraphNPE recovers a significantly lower central density for Boötes I consistent with a core while Draco remains marginally cuspy, and demonstrates that higher-order velocity moments reduce bias in dynamical modeling.
-
Mixed neural posterior estimation for simulators with discrete and continuous parameters
Extends NPE to mixed discrete-continuous parameter spaces via a factorized inference network combining an autoregressive classifier and generative model, trained jointly to yield accurate calibrated posteriors.
-
Simulation-Based Inference for Cluster Cosmology with Set-Based Neural Network Architectures
SBI framework with GNN-on-sets and masked autoregressive flow recovers input cosmologies from eRASS1 mocks at 11.5% precision on Ω_m and 4.4% on σ_8 using 3259 clusters.
-
One Generator, Any Process: LLM-Conditioning for the LHC
LLM embeddings condition a generative transformer to enable faster convergence, better performance, and generalization to unseen LHC processes using a single model.
-
pop-cosmos: Disentangling galaxy properties from observables using data-driven approaches
A beta-VAE analysis of pop-cosmos models finds that five latent dimensions capture the rest-frame optical SED, corresponding to stellar mass, recent star formation, dust, and two gas ionization states.
-
Learning the Universe: Posterior Reliability of Neural Generative Models in High-Dimensional Field-Level Inference of Cosmic Initial Conditions
Generative models for cosmological field-level inference can reproduce posterior means and cross-correlations yet fail to capture correct uncertainty geometry when validated against HMC reference samples.
-
GenSBI: Generative Methods for Simulation-Based Inference in JAX
GenSBI delivers JAX-native implementations of generative SBI methods with transformer backbones and reports near-ideal calibration scores on standard benchmarks.
-
Pointwise Metrics Mislead: An Evaluation Protocol for Multimodal Inverse Problems
Pointwise metrics compress marginal spectra in multimodal inverse problems, and a three-part protocol using CRPS, spectrum fidelity, and calibration reverses model rankings on synthetic and particle-physics benchmarks.
-
Inferring infectiousness: a joint model of the within-host viral kinetics of SARS-CoV-2
Bayesian joint model infers infectious virus shedding trajectories and derived infectiousness metrics from PCR and other proxies in SARS-CoV-2 using data from five cohorts of roughly 2000 infections.
-
Information-Preserving Domain Transfer with Unlabeled Data in Misspecified Simulation-Based Inference
SPIN performs bidirectional domain transfer in SBI to retain parameter mutual information from unlabeled real observations, improving real-world posterior inference under increasing misspecification.
-
Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference
Embedding selection mechanisms into generative simulators enables amortized Bayesian inference to produce debiased, well-calibrated posteriors without tractable likelihoods.
-
pop-cosmos: Galaxy size evolution across structural and star-formation classifications in COSMOS-Web
Galaxy size-mass relations exhibit double power-law breaks at different pivot masses for quiescent versus bulge-dominated samples, coinciding with AGN activity scales.
-
Population synthesis of active galactic nuclei based on the radiation-regulated unification model
Using ray-tracing simulations and simulation-based inference, the authors construct an AGN population that reproduces the cosmic X-ray background, number counts, and absorption properties, deriving an intrinsic Compton-thick fraction of 40±3%.
-
Separating Intrinsic Ambiguity from Estimation Uncertainty in Deep Generative Models for Linear Inverse Problems
Introduces a cascade-based structural decomposition of posterior uncertainty to isolate intrinsic ambiguity from estimation uncertainty in deep generative models for linear inverse problems.
-
Neural Posterior Estimation of Terrain Parameters from Radar Sounder Data
Neural posterior estimation trained on GPU-simulated radar data enables calibrated probabilistic inversion of terrain parameters and transfers to real Mars radar profiles.
-
MIRA: A Score for Conditional Distribution Accuracy and Model Comparison
MIRA is a new analytic score for conditional distribution accuracy derived from equal probability mass assignment, enabling Bayesian model comparison via direct posterior validation.
-
Neutron Star Mass across Binary Pulsar Subpopulations: Mass-Spin Correlation, Mass Distributions, and Moment of Inertia Effects
Hierarchical Bayesian analysis of ~50 binary pulsars finds moderate evidence for mass-spin anti-correlation (ρ = -0.26) in the recycled population, plus a small mass offset by white-dwarf companion type and confirmation of companion-mass–eccentricity correlation in DNS systems.
-
Bayesian Extreme Value Theory with Hawkes-AR-Gumbel Dependence for Extreme CVaR Estimation in Operational Risk
Bayesian EVT with Hawkes-AR-Gumbel dependence estimates CVaR up to 99.995% on simulated operational risk data and outperforms independent and shared-factor baselines.
-
Machine Learning Techniques for Astrophysics and Cosmology: Simulation-Based Inference
Simulation-based inference uses neural networks trained on simulations to enable parameter inference in cosmology and astrophysics where traditional likelihood calculations are intractable.
-
Application of Machine Learning to 21 cm Cosmology
Review chapter organizing machine learning methods for 21 cm cosmology into observation, theory, and inference domains.