Composable Effects for Flexible and Accelerated Probabilistic Programming in NumPyro

Du Phan; Martin Jankowiak; Neeraj Pradhan

Reviewed by Pith at T0; open to challenge.

T0 means a machine referee read the full paper against a public rubric. The mark states how deep the mechanical check went, never who wrote it.

T0 review · grok-4.3

NumPyro composes Pyro effect handlers with JAX to deliver a fully JIT-compiled iterative NUTS sampler.

2026-05-15 15:58 UTC pith:52FPSCX5

Load-bearing objection: NumPyro shows a workable JAX backend for Pyro with JIT NUTS that improves speed across data sizes.

arXiv 1912.11554 v1 · pith:52FPSCX5 · submitted 2019-12-24 · stat.ML cs.AI cs.LG cs.PL

keywords NumPyro, Pyro, NUTS, JAX, effect handlers, JIT compilation, probabilistic programming, MCMC

verification ladder: T0 review · T1 audit · T2 compute · T3 formal · T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents NumPyro as a lightweight NumPy backend for the Pyro probabilistic programming language that reuses the same modeling interface, language primitives, and effect handling abstractions. It establishes that these effect handlers can be composed directly with JAX program transformations to enable hardware acceleration, automatic differentiation, and vectorization. The central demonstration is an iterative formulation of the No-U-Turn Sampler that supports end-to-end JIT compilation. This yields substantially faster inference than prior alternatives across both small and large dataset regimes. A reader would care because the work shows how to retain flexible probabilistic modeling while gaining the performance benefits of a functional numeric backend.

Core claim

NumPyro shows that Pyro's effect handlers compose with JAX's functional transformations to preserve the original modeling API while adding hardware acceleration and automatic differentiation. In particular it supplies an iterative formulation of the No-U-Turn Sampler that can be compiled end-to-end with JAX's JIT, producing faster runtimes than existing implementations in both the small-data and large-data regimes.

What carries the argument

Effect handlers that extend Pyro's modeling abstractions to JAX's functional transformations for acceleration and compilation.

Load-bearing premise

Pyro effect handlers compose cleanly with JAX transformations without introducing correctness problems or reducing modeling expressiveness.

What would settle it

A direct runtime comparison on standard benchmark models showing that the NumPyro NUTS implementation is not faster than existing Pyro or Stan alternatives in either the small-dataset or large-dataset regime.

If this is right

Probabilistic models written in the Pyro interface can run with full JIT compilation and hardware acceleration.
The same modeling code benefits from vectorization and automatic differentiation supplied by JAX.
Inference scales to both small and large datasets without separate code paths.
Effect-handler composition becomes a reusable pattern for adding new backends to probabilistic programming languages.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same handler-composition technique could be applied to accelerate other MCMC or variational methods inside JAX.
Models could be automatically ported between CPU, GPU, and TPU execution without rewriting inference logic.
New modeling primitives that exploit JAX's functional purity might become feasible once the handler layer is stable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

NumPyro shows a workable JAX backend for Pyro with JIT NUTS that improves speed across data sizes.

NumPyro provides a JAX backend for Pyro that allows effect handlers to be composed with JAX transformations. The result is an iterative NUTS sampler that can be JIT-compiled end to end and, according to the paper, runs faster than existing alternatives in both small and large dataset regimes. The authors handle the core challenge well by keeping the same modeling interface and language primitives. The effect-handling abstractions are extended to the new backend, so users do not have to learn a new API. The iterative formulation of NUTS is a practical choice that fits JAX's functional style and enables compilation alongside automatic differentiation and vectorization. This is useful work for anyone doing probabilistic inference.

The presentation of the results could be stronger. The performance claims are stated clearly in the abstract, but the paper should provide detailed benchmarks: specific models, dataset sizes, hardware, and direct numerical comparisons. That would make it easier to assess how general the speedups are and whether they come with trade-offs in accuracy or supported features. More discussion of the limits of the composition, for example with very complex models, would also help set expectations.

The paper is aimed at developers and users of probabilistic programming tools who want better MCMC performance, and it should interest the Pyro and JAX communities. The thinking is clear and the approach is grounded in real implementation challenges, so it deserves serious peer review. I recommend accepting it for review: the engineering contribution is real and the claims are verifiable with the right experiments.

Referee Report

2 major / 2 minor

Summary. The paper introduces NumPyro as a lightweight NumPy-based backend for the Pyro probabilistic programming language that preserves the same modeling interface and effect-handling abstractions. It demonstrates that Pyro effect handlers compose with JAX's functional transformations (JIT, autodiff, vectorization) to support an iterative formulation of the No-U-Turn Sampler (NUTS) that is end-to-end JIT-compilable, yielding substantially faster inference than existing alternatives across both small- and large-dataset regimes.

Significance. If the performance and correctness claims hold, the work is significant for showing how effect-handler composition can bridge imperative PPL APIs with functional autodiff frameworks, enabling scalable, hardware-accelerated sampling without loss of modeling expressiveness. The engineering result directly addresses practical bottlenecks in Bayesian inference for machine-learning models.

major comments (2)

[Abstract and §4 (results)] The central claim that the JIT-compiled iterative NUTS is 'much faster than existing alternatives in both the small and large dataset regimes' is load-bearing yet supported only by high-level statements; specific wall-clock timings, hardware specifications, baseline implementations (e.g., Pyro, Stan, TensorFlow Probability), and dataset sizes must be reported with error bars or multiple runs to allow verification.
[§3 (NUTS formulation)] The iterative NUTS algorithm is presented as end-to-end JIT-compatible, but the manuscript does not explicitly address potential non-differentiable control flow or side-effect leakage when the effect handlers are transformed; a short proof sketch or counter-example check would strengthen the correctness argument.
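
To make the control-flow concern in the second comment concrete, the sketch below writes a trajectory loop in the state-in, state-out shape that a functional while loop requires. It is a deliberately simplified stand-in: a single forward leapfrog trajectory on a one-dimensional standard normal that stops at the first U-turn, not the paper's NUTS tree-building procedure. The helper `while_loop` is a pure-Python assumption that mirrors the `(cond_fun, body_fun, init_val)` calling convention of `jax.lax.while_loop`, and the names `leapfrog` and `trajectory_until_u_turn` are hypothetical.

```python
def leapfrog(q, p, step, grad_u):
    """One leapfrog step for position q and momentum p."""
    p = p - 0.5 * step * grad_u(q)
    q = q + step * p
    p = p - 0.5 * step * grad_u(q)
    return q, p


def while_loop(cond, body, state):
    """Pure-Python stand-in for a functional while loop (assumption)."""
    while cond(state):
        state = body(state)
    return state


def trajectory_until_u_turn(q0, p0, step=0.1, max_steps=1000):
    """Integrate forward until the trajectory starts moving back toward q0."""

    def grad_u(q):
        return q  # potential U(q) = q**2 / 2, i.e. a standard normal target

    def cond(state):
        q, p, n = state
        still_moving_away = (q - q0) * p >= 0.0
        return still_moving_away and n < max_steps

    def body(state):
        # No mutation and no side effects: the next state depends
        # only on the current state.
        q, p, n = state
        q, p = leapfrog(q, p, step, grad_u)
        return (q, p, n + 1)

    return while_loop(cond, body, (q0, p0, 0))


q_end, p_end, n_steps = trajectory_until_u_turn(q0=0.0, p0=1.0)
energy = 0.5 * q_end**2 + 0.5 * p_end**2
print(n_steps, energy)
```

Because `cond` and `body` are pure functions of a fixed-shape state tuple, a loop of this form has no Python-level recursion or hidden side effects for a tracer to miss. Whether the full iterative NUTS in the paper satisfies the same property is exactly what the requested proof sketch would need to show.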

minor comments (2)

[§4] Add a table or figure in §4 that directly tabulates speedup factors versus the closest competing samplers for the reported models.
[Introduction] Clarify in the introduction whether the modeling API is byte-for-byte identical to Pyro or admits any syntactic differences.
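
The reporting requested in these comments (multiple runs, mean with standard deviation, tabulated speedup factors) can be sketched with the Python standard library alone. The workloads `baseline_sampler` and `candidate_sampler` below are hypothetical placeholders, not real samplers, and the helper names are assumptions made for this note.

```python
import statistics
import time


def time_runs(fn, repeats=5):
    """Return wall-clock durations in seconds for `repeats` independent calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Mean and sample standard deviation of a list of durations."""
    return statistics.mean(durations), statistics.stdev(durations)


def speedup(baseline, candidate):
    """Ratio of mean baseline time to mean candidate time."""
    return statistics.mean(baseline) / statistics.mean(candidate)


# Hypothetical placeholder workloads standing in for two samplers.
def baseline_sampler():
    return sum(i * i for i in range(200_000))


def candidate_sampler():
    return sum(i * i for i in range(20_000))


baseline_times = time_runs(baseline_sampler)
candidate_times = time_runs(candidate_sampler)

for label, times in [("baseline", baseline_times), ("candidate", candidate_times)]:
    mean, std = summarize(times)
    print(f"{label}: {mean:.4f} s ± {std:.4f} s over {len(times)} runs")
print(f"speedup: {speedup(baseline_times, candidate_times):.1f}x")
```

For a JIT-compiled sampler, the first call would typically include compilation time, so a fair table would report compile and run time separately or exclude a warm-up call.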

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and positive recommendation for minor revision. We address the major comments point-by-point below, agreeing to incorporate additional details and clarifications in the revised manuscript.

Referee: [Abstract and §4 (results)] The central claim that the JIT-compiled iterative NUTS is 'much faster than existing alternatives in both the small and large dataset regimes' is load-bearing yet supported only by high-level statements; specific wall-clock timings, hardware specifications, baseline implementations (e.g., Pyro, Stan, TensorFlow Probability), and dataset sizes must be reported with error bars or multiple runs to allow verification.

Authors: We agree that the performance claims would benefit from more detailed empirical support. In the revised manuscript, we will expand §4 to include specific wall-clock timings, hardware specifications (such as the CPU and GPU models used), the exact baseline implementations (Pyro, Stan, TensorFlow Probability), dataset sizes, and results reported as means with standard deviations over multiple independent runs. This will provide the necessary quantitative evidence for the 'much faster' claim in both small and large dataset regimes.
Revision: yes

Referee: [§3 (NUTS formulation)] The iterative NUTS algorithm is presented as end-to-end JIT-compatible, but the manuscript does not explicitly address potential non-differentiable control flow or side-effect leakage when the effect handlers are transformed; a short proof sketch or counter-example check would strengthen the correctness argument.

Authors: We thank the referee for highlighting this point on correctness. The effect handlers in NumPyro are implemented to be fully compatible with JAX's functional transformations, ensuring no side-effect leakage and that control flow remains traceable. In the revision, we will add a short paragraph in §3 providing a sketch of why the iterative NUTS formulation avoids non-differentiable operations and side effects, referencing the pure functional nature of the handlers and JAX's tracing mechanism. If space permits, we can include a brief counter-example check or note on the absence of such issues in our implementation.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity in implementation description

This is an implementation paper presenting NumPyro as a JAX-based backend for Pyro's modeling interface. The central claim concerns the engineering outcome of composing effect handlers with JAX transformations to enable an end-to-end JIT-compilable iterative NUTS sampler, with reported performance gains. No mathematical derivations, parameter fits, or self-referential equations appear in the provided text that reduce to their own inputs by construction. The work is self-contained as a software design and benchmarking description, with no load-bearing steps that match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the domain assumption that effect handlers from Pyro can be lifted to a JAX functional backend without semantic change.

axioms (1)

domain assumption Pyro effect handlers can be composed with JAX program transformations while preserving modeling semantics
This is the central premise enabling the entire NumPyro design as stated in the abstract.

pith-pipeline@v0.9.0 · 5425 in / 1115 out tokens · 29639 ms · 2026-05-15T15:58:48.146930+00:00 · methodology

Original abstract

NumPyro is a lightweight library that provides an alternate NumPy backend to the Pyro probabilistic programming language with the same modeling interface, language primitives and effect handling abstractions. Effect handlers allow Pyro's modeling API to be extended to NumPyro despite its being built atop a fundamentally different JAX-based functional backend. In this work, we demonstrate the power of composing Pyro's effect handlers with the program transformations that enable hardware acceleration, automatic differentiation, and vectorization in JAX. In particular, NumPyro provides an iterative formulation of the No-U-Turn Sampler (NUTS) that can be end-to-end JIT compiled, yielding an implementation that is much faster than existing alternatives in both the small and large dataset regimes.

Forward citations

Cited by 60 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Atmospheric asymmetries in WASP-121 b revealed by rotational transits detected with JWST
astro-ph.EP 2026-06 unverdicted novelty 8.0

Rotational transits of WASP-121 b observed with JWST/NIRSpec and NIRISS show asymmetric light curves and phase-dependent molecular absorption revealing stronger evening than morning terminator temperature gradients.
Multi-Agent AI Control: Distributed Attacks Hamper Per-Instance Monitors
cs.LG 2026-07 conditional novelty 7.0

Splitting an attack across more coordinating agents lowers per-commit monitor suspicion, making distributed attacks harder to detect than single-agent attacks.
Decoding the Early-Time Light Curves of Type Ia Supernovae. II. Population Parameters of One Thousand ZTF Supernovae
astro-ph.HE 2026-06 unverdicted novelty 7.0

Hierarchical Bayesian power-law fits to early ZTF light curves of 972 SNe Ia yield population parameters for rise time, index, and color evolution, revealing a bifurcation with SALT2 stretch.
Decoding the Early-Time Light Curves of Type Ia Supernovae. I. A Hierarchical Bayesian Framework for Demographic Inference
astro-ph.HE 2026-06 unverdicted novelty 7.0

A hierarchical Bayesian approach with multivariate Gaussian population prior reduces bias in demographic inference of SN Ia power-law rise parameters compared to individual fitting.
Calibration, Not Compilation: Detecting and Repairing Misspecified Probabilistic Programs Written by Language Models
cs.LG 2026-06 unverdicted novelty 7.0

Bayesian workflow diagnostics outperform unit tests for detecting and repairing statistically misspecified LLM-generated probabilistic programs across benchmarks and real generation tasks.
Gaussian processes on ray-guided transformed uniform grids for fast, flexible, and auto-differentiable adaptive source reconstruction in lens modelling
astro-ph.IM 2026-06 unverdicted novelty 7.0

A new RTU grid method models the lensing source as a Gaussian process on a ray-transformed uniform grid, achieving comparable fits with roughly half the pixels per dimension and higher ELBOs on mock data.
A hierarchical Bayesian framework for cosmology using Type 1 AGN variability
astro-ph.CO 2026-06 unverdicted novelty 7.0

A hierarchical Bayesian framework that uses the empirical anti-correlation between AGN variability amplitude and luminosity to infer cosmological parameters from moderate-baseline light curves via importance reweighting.
Same Evidence, Different Answer: Auditing Order Sensitivity in Multimodal Large Language Models
cs.CL 2026-06 unverdicted novelty 7.0

All 18 audited MLLMs exhibit order sensitivity with per-facet flip rates of 24-50%, exceeding same-order decoder noise.
Discovering Misconceptions and Misunderstandings From Administrations of Research-Designed Multiple Choice Instruments
physics.ed-ph 2026-06 unverdicted novelty 7.0

Multidimensional IRT analysis of 34k FCI administrations identifies 22 robust misconception dimensions and computes student/class scores revealing varied post-instruction remediation patterns.
High-Dimensional Bayesian Calibration of Expensive Nuclear Models with Differentiable Emulation
nucl-th 2026-05 unverdicted novelty 7.0

DREAM enables exact-gradient Bayesian calibration of nuclear models via offline SVD emulation of parameter-dependent operators, demonstrated by rapid HMC convergence on an 18-parameter CDCC analysis of d+58Ni scattering.
Kernel Renormalization in Bayesian Deep Neural Networks: the Equivalent Wishart Ansatz in the Proportional Regime
cs.LG 2026-05 unverdicted novelty 7.0

Equivalent Wishart Ansatz for kernel renormalization in Bayesian MLPs and CNNs in the proportional regime, with tests showing good agreement on benchmarks.
Archimedean Copula Inference via Taylor-Mode AD
cs.LG 2026-05 unverdicted novelty 7.0

acopula enables polynomial-time exact inference for arbitrary nested Archimedean copulas with censoring via Taylor-mode AD on user-defined generators.
ASSEMBLAGE-DEEPHISTORY: A Cross-Build Binary Dataset with Temporal Coverage
cs.CR 2026-05 unverdicted novelty 7.0

A new queryable binary dataset combining cross-build diversity, temporal history, and CVE labels with linked metadata for vulnerability research.
Theoretical guidelines for annealed Langevin dynamics in compositional simulation-based inference
stat.ML 2026-05 unverdicted novelty 7.0

Derives Wasserstein bounds and explicit hyperparameter tuning rules for annealed Langevin dynamics in compositional score-based SBI, proving Linhart et al. (2026) allows larger steps and fewer total steps than Geffner...
Mixed neural posterior estimation for simulators with discrete and continuous parameters
cs.LG 2026-05 unverdicted novelty 7.0

Extends NPE to mixed discrete-continuous parameter spaces via a factorized inference network combining an autoregressive classifier and generative model, trained jointly to yield accurate calibrated posteriors.
Variational predictive resampling
stat.ME 2026-05 unverdicted novelty 7.0

Variational predictive resampling uses sequential imputation from variational predictives to generate samples whose distribution converges to the exact Bayesian posterior in Gaussian models and improves dependence cap...
Variational predictive resampling
stat.ME 2026-05 conditional novelty 7.0

Variational predictive resampling iteratively imputes data from a variational predictive to produce posterior samples that converge to the exact Bayesian posterior in Gaussian models where mean-field VI retains a gap.
Bayesian Doppler Imaging: Simultaneous Inference of Surface Maps and Geometric Parameters
astro-ph.EP 2026-05 conditional novelty 7.0

A fully Bayesian pixel-based Doppler imaging framework uses Gaussian Process priors and Hamiltonian Monte Carlo to simultaneously infer surface maps and geometric parameters from spectral data.
ADELIA: Automatic Differentiation for Efficient Laplace Inference Approximations
cs.DC 2026-05 conditional novelty 7.0

ADELIA is the first AD-enabled INLA system that computes exact hyperparameter gradients via a structure-exploiting multi-GPU backward pass, delivering 4.2-7.9x per-gradient speedups and 5-8x better energy efficiency t...
Archival Multiband Gravitational-Wave Signals from Massive Black Hole Binary Mergers
astro-ph.HE 2026-04 unverdicted novelty 7.0

Massive black hole binary mergers produce orphaned low-frequency signals in PTA pulsar terms that can be stacked for archival multiband gravitational-wave detection.
High-dimensional inference for the $\gamma$-ray sky with differentiable programming
astro-ph.HE 2026-04 unverdicted novelty 7.0

A differentiable forward model and likelihood enable probabilistic inference over many spatial morphologies for the Galactic Center gamma-ray Excess using variational methods on GPUs.
Dynamic sparse graphs with overlapping communities
stat.ME 2025-12 unverdicted novelty 7.0

Bayesian nonparametric model for dynamic sparse networks with overlapping communities via completely random measures and latent Markov processes.
People readily follow personal advice from AI but it does not improve their well-being
cs.HC 2025-11 conditional novelty 7.0

Large longitudinal RCT finds high rates of following AI personal advice but no sustained well-being gains versus a hobbies control condition.
AMIGO: a Data-Driven Calibration of the JWST Interferometer
astro-ph.IM 2025-10 unverdicted novelty 7.0

AMIGO is an end-to-end differentiable forward model of JWST AMI that corrects detector systematics to recover high-precision astrometry and detect close high-contrast companions.
Using Symbolic Regression to Emulate the Radial Fourier Transform of the S\'ersic profile for Fast, Accurate and Differentiable Galaxy Profile Fitting
astro-ph.IM 2025-08 conditional novelty 7.0

Symbolic regression yields an emulator for the radial Fourier transform of the Sérsic profile that enables 2.5 times faster galaxy profile fitting with minimal accuracy loss.
Neural Stochastic Differential Equations on Compact State Spaces: Theory, Methods, and Application to Suicide Risk Modeling
stat.ML 2025-08 unverdicted novelty 7.0

The authors derive drift and diffusion constraints plus a parameterization that forces neural SDE solutions to remain inside compact polyhedral domains, yielding better forecasts on real EMA suicide-risk datasets than...
Differentiable Fuzzy Cosmic-Web for Field Level Inference
astro-ph.CO 2025-06 unverdicted novelty 7.0

Introduces HICOBIAN, a differentiable fuzzy hierarchical cosmic-web bias model using sigmoid gradients for smooth region transitions, enabling accurate Bayesian field-level reconstruction of primordial density fields ...
Distinct spin properties and astrophysical origin of low mass binary black holes in gravitational wave data
astro-ph.HE 2026-07 unverdicted novelty 6.0

Hierarchical Bayesian analysis of GWTC-5.0 data identifies a mass transition at 15.2 solar masses separating distinct effective-spin distributions, pointing to different formation channels for low-mass binary black holes.
Kinematic detection of dusty outflows from active galactic nuclei: Polycyclic aromatic hydrocarbon kinematics of type 2 quasars with JWST/MIRI spectroscopy
astro-ph.GA 2026-06 unverdicted novelty 6.0

PAH velocity maps reveal outflows in three type 2 quasars, suggesting dusty outflows are common at λ_Edd ≳ 0.1.
HIcosmo: a differentiable JAX-based framework for cosmology inference
astro-ph.CO 2026-06 unverdicted novelty 6.0

HIcosmo is a new JAX-based differentiable framework for background cosmology inference that matches Cobaya results while delivering 8.7x CPU and up to 20x GPU speedups.
Shape-Constrained Bayesian Active Learning of Self-Limiting Saturation Curves
cond-mat.mtrl-sci 2026-06 unverdicted novelty 6.0

Bayesian monotonic I-spline regression with uncertainty sampling learns self-limiting saturation curves to within noise using as few as seven measurements across five kinetic families.
Learning Dynamical Systems from Multiple Sparse Datasets: A Hierarchical Bayesian Modeling Approach
cs.LG 2026-06 unverdicted novelty 6.0

A hierarchical Bayesian framework pools information across sparse dynamical system datasets via a shared population distribution to improve parameter inference and prediction over unpooled approaches.
A thorough investigation of cross-correlation estimators for stochastic gravitational-wave background searches in ground-based detector data
gr-qc 2026-06 unverdicted novelty 6.0

Reformulation of frequency-domain narrowband cross-correlation estimators for SGWB searches provides new expressions for estimators and covariances, while showing that widely used prior expressions still yield correct...
Scalable Bayesian Additive Models for Stellar Flare Detection via Amortized Gaussian Process Inference and Hidden Markov Models
stat.ML 2026-06 unverdicted novelty 6.0

A VAE-based surrogate for Celerite GPs is embedded in an additive GP+HMM model to achieve scalable Bayesian inference for stellar flare detection.
Calibration of an Analog-to-Digital Conversion Nonlinearity in JWST/NIRISS
astro-ph.IM 2026-06 unverdicted novelty 6.0

A data-driven model for periodic ADC integral nonlinearity in JWST/NIRISS is fitted to ramp residuals and applied to correct the ERS1366 WASP-39b transmission spectrum, reducing systematics at the 30ppm level.
GenSBI: Generative Methods for Simulation-Based Inference in JAX
cs.LG 2026-05 unverdicted novelty 6.0

GenSBI delivers JAX-native implementations of generative SBI methods with transformer backbones and reports near-ideal calibration scores on standard benchmarks.
A Strongly Parametrized Mass Ratio Model for the Stable Mass Transfer Channel: a Case Study of the $10 \, \rm{M}_{\odot}$ Peak
astro-ph.HE 2026-05 unverdicted novelty 6.0

A parametrized analytical model for BBH mass ratios from the stable mass transfer channel is derived and applied to the 10 solar-mass peak in GWTC-4, favoring little mass-ratio reversal.
AI4BayesCode: From Natural Language Descriptions to Validated Modular Stateful Bayesian Samplers
stat.CO 2026-05 unverdicted novelty 6.0

AI4BayesCode generates validated modular stateful MCMC samplers from natural language Bayesian model descriptions via LLM translation, modular blocks, and recursive stateful composition.
On the Reparameterization Between Cartesian Position-Velocity Vectors and Orbital Elements in the Kepler Problem
astro-ph.EP 2026-05 unverdicted novelty 6.0

Compact analytic Jacobians are derived for reparameterizing Keplerian orbits between orbital elements and Cartesian states, correcting a singularity in the Skowron et al. (2011) microlensing model and improving MCMC e...
Forbidden Formation Histories: The Binary Black Hole Merger Rate Disfavors Long Delay Times
astro-ph.HE 2026-05 unverdicted novelty 6.0

Deconvolution of the GWTC-4.0 BBH merger rate reveals that long-delay tails in the delay time distribution are forbidden, constraining progenitor formation histories to decline more steeply than the star formation rat...
Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space
cs.CL 2026-05 unverdicted novelty 6.0

LLMs perform in-context learning as trajectories through a structured low-dimensional conceptual belief space, with the structure visible in both behavior and internal representations and causally manipulable via inte...
A hierarchical Bayesian pipeline for soliton-plus-NFW inference on SPARC rotation curves: diagnostics and prior-boundary behaviour
astro-ph.CO 2026-05 conditional novelty 6.0

A hierarchical Bayesian pipeline applied to 106 SPARC galaxies yields posteriors that reach prior boundaries for soliton parameters, indicating no detectable interior population-level soliton within the Schive-normali...
What You Don't Know Won't Hurt You: Self-Consistent Hierarchical Inference with Unknown Follow-up Selection Strategies
astro-ph.IM 2026-05 unverdicted novelty 6.0

Hierarchical Bayesian inference allows accurate recovery of intrinsic astrophysical source populations even when follow-up selection is unknown and correlated with parameters of interest.
Bayesian Modeling and Prediction of Generalized Contact Matrices
stat.ME 2026-05 unverdicted novelty 6.0

A Bayesian model for multi-feature contact matrices that uses tensor structures and contingency table theory to satisfy structural constraints and impute missing contact features, validated on simulations and US/Germa...
Towards E-Value Based Stopping Rules for Bayesian Deep Ensembles
cs.LG 2026-04 unverdicted novelty 6.0

E-value sequential tests enable early stopping of MCMC sampling in Bayesian deep ensembles, often needing only a fraction of the full budget while improving over standard deep ensembles.
A unified harmonic framework for dark siren cosmology
astro-ph.CO 2026-03 unverdicted novelty 6.0

The GW-galaxy cross-correlation method, unified with spectral sirens in a harmonic framework, can measure H0 to 1% and Omega_m to 5% precision with 2 years of data from next-generation detectors like Einstein Telescop...
Stochastic gravitational-wave background search using data from five pulsar timing arrays
astro-ph.CO 2025-12 conditional novelty 6.0

Combined five-PTA dataset yields posterior on SGWB power-law amplitude and index consistent with nonzero signal but below 5-sigma significance, with reconstructed angular correlations matching the Hellings-Downs prediction.
OzDES Reverberation Mapping of Active Galactic Nuclei: Final Data Release, Black-Hole Mass Results, & Scaling Relations
astro-ph.GA 2025-12 unverdicted novelty 6.0

OzDES final release delivers 62 new reverberation-mapped black hole masses and tighter lag-luminosity relations for Hβ, MgII, and CIV in high-redshift AGN after correcting for survey-length selection effects.
Photon counting readout for detection and inference of gravitational waves from neutron star merger remnants
gr-qc 2025-11 conditional novelty 6.0

Photon counting readout detects weak postmerger gravitational wave signals at a rate of about 1 in 100 for SNR 0.2 and yields a twofold improvement in neutron star radius measurement after 20,000 events.
Image reconstruction with the JWST Interferometer
astro-ph.IM 2025-10 unverdicted novelty 6.0

Dorito enables diffraction-limited image reconstruction from JWST AMI observations by deconvolving images or Fourier observables using maximum entropy and total variation regularization.
Conversational AI increases political knowledge as effectively as self-directed internet search
cs.HC 2025-09 conditional novelty 6.0

Conversational AI matches self-directed internet search in increasing belief in true political information and decreasing belief in misinformation.
RefineStat: Efficient Exploration for Probabilistic Program Synthesis
cs.LG 2025-09 unverdicted novelty 6.0

RefineStat improves small language model performance on probabilistic program synthesis by adding semantic constraint enforcement and diagnostic-aware refinement, producing syntactically and statistically reliable cod...
Scalable Spatiotemporal Inference with Biased Scan Attention Transformer Neural Processes
cs.LG 2025-06 unverdicted novelty 6.0

BSA-TNP is a new neural process model with KRBlocks and biased scan attention that claims to match top accuracy while scaling inference to over 1M points in under a minute on a single GPU and supporting translation in...
A search for periodic AGN variability in $\textit{Gaia}$ Data Release 3
astro-ph.HE 2025-05 accept novelty 6.0

Systematic search of 377k Gaia DR3 AGN light curves finds no reliable periodic SMBHB candidates after red-noise modeling and empirical false-alarm testing; all survivors lie in the few-cycle regime.
A "Black Hole Star" Reveals the Remarkable Gas-Enshrouded Hearts of the Little Red Dots
astro-ph.GA 2025-03 unverdicted novelty 6.0

A source 660 million years after the Big Bang is interpreted as a black hole star with a dust-free dense gas atmosphere, implying Little Red Dots have black hole masses overestimated by orders of magnitude.
Towards Understanding Sycophancy in Language Models
cs.CL 2023-10 conditional novelty 6.0

Sycophancy is prevalent in state-of-the-art AI assistants and is likely driven in part by human preferences that favor agreement over truthfulness.
The HST/WFC3 Transmission Spectrum of AU Mic b Part I: An Atmosphere Obscured by Contamination and Systematics
astro-ph.EP 2026-06 unverdicted novelty 5.0

The transmission spectrum of AU Mic b is dominated by the transit light source effect from stellar spots, yielding only weak atmospheric constraints with a preferred scale height below 185 km.
ASTEP confirmation of a pair of long-period Jupiter-sized planets with extremely low densities transiting TOI-791
astro-ph.EP 2026-06 unverdicted novelty 5.0

Two extremely low-density Jupiter-sized planets on long-period orbits around TOI-791 were confirmed via ground-based photometry and TTV-derived masses.
Empirical-Bayes Unfolding of $\gamma$-ray Spectra
astro-ph.IM 2026-06 unverdicted novelty 5.0

Empirical-Bayes hierarchical unfolding for gamma-ray spectra with Poisson ON/OFF likelihood, adaptive Richardson-Lucy prior, and NUTS posterior sampling, yielding spectra consistent with frequentist regularized ML.
Signatures of $^{56}$Ni Mixing and Neutron-rich Ejecta in Supernovae
astro-ph.HE 2026-06 unverdicted novelty 5.0

Multi-shell modeling shows outward 56Ni mixing produces faster brighter rises and biases one-zone fits to lower ejecta mass and higher nickel fraction, while r-process signatures in collapsars depend on placement, dis...

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · cited by 81 Pith papers · 3 internal anchors

[1]

Composable Effects for Flexible and Accelerated Probabilistic Programming in NumPyro

and Edward2 [6] based on TensorFlow, and PyMC3 [7] based on Theano. NumPyro is a package for probabilistic programming built atop JAX [8, 9], which is a high-level tracing library for program transformations (e.g. automatic differentiation, vectorization and JIT compilation) of Python and NumPy functions. Thus NumPyro enables users to write probabilistic ...

work page internal anchor Pith review Pith/arXiv arXiv 1912
[2]

An introduction to probabilistic programming, 2021

Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. An introduction to probabilistic programming. arXiv preprint arXiv:1809.10756, 2018

work page arXiv 2018
[3]

Pyro: Deep universal probabilistic programming

Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, and Noah D. Goodman. Pyro: Deep universal probabilistic programming. Journal of Machine Learning Research, 20(28):1–6, 2019. URL http://jmlr.org/papers/v20/18-403.html

work page 2019
[4]

Learning disentangled representations with semi-supervised deep generative models

Siddharth Narayanaswamy, T Brooks Paige, Jan-Willem Van de Meent, Alban Desmaison, Noah Goodman, Pushmeet Kohli, Frank Wood, and Philip Torr. Learning disentangled representations with semi-supervised deep generative models. In Advances in Neural Information Processing Systems, pages 5925–5935, 2017

work page 2017
[5]

Automatic differentiation in pytorch

Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in pytorch. 2017

work page 2017
[6]

TensorFlow Distributions

Joshua V Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, and Rif A Saurous. Tensorflow distributions. arXiv preprint arXiv:1711.10604, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[7]

Simple, distributed, and accelerated probabilistic programming

Dustin Tran, Matthew W Hoffman, Dave Moore, Christopher Suter, Srinivas Vasudevan, and Alexey Radul. Simple, distributed, and accelerated probabilistic programming. In Advances in Neural Information Processing Systems, pages 7598–7609, 2018

work page 2018
[8]

Probabilistic programming in python using PyMC3

John Salvatier, Thomas V. Wiecki, and Christopher Fonnesbeck. Probabilistic programming in python using PyMC3. PeerJ Computer Science, 2:e55, April 2016. doi: 10.7717/peerj-cs.55. URL https://doi.org/10.7717/peerj-cs.55

work page doi:10.7717/peerj-cs.55 2016
[9]

JAX: composable transformations of Python+NumPy programs, 2018

James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, and Skye Wanderman-Milne. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/google/jax

work page 2018
[10]

Compiling machine learning programs via high-level tracing

Roy Frostig, Matthew Johnson, and Chris Leary. Compiling machine learning programs via high-level tracing. 2018. URL http://www.sysml.cc/doc/2018/146.pdf

work page 2018
[11]

XLA: Optimizing Compiler for Machine Learning

XLA: Optimizing Compiler for Machine Learning. https://www.tensorflow.org/xla/

work page
[12]

Effect Handling for Composable Program Transformations in Edward2

Dave Moore and Maria I. Gorinova. Effect handling for composable program transformations in edward2. CoRR, abs/1811.06150, 2018. URL http://arxiv.org/abs/1811.06150

work page internal anchor Pith review Pith/arXiv arXiv 2018
[13]

Handlers of algebraic effects

Gordon Plotkin and Matija Pretnar. Handlers of algebraic effects. In Giuseppe Castagna, editor, Programming Languages and Systems, pages 80–94, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg. ISBN 978-3-642-00590-9

work page 2009
[14]

JAX PRNG Design

The JAX Team. JAX PRNG Design. https://github.com/google/jax/blob/master/design_notes/prng.md, 2019

work page 2019
[15]

The no-u-turn sampler: Adaptively setting path lengths in hamiltonian monte carlo

Matthew D. Hoffman and Andrew Gelman. The no-u-turn sampler: Adaptively setting path lengths in hamiltonian monte carlo. Journal of Machine Learning Research, 15:1593–1623, 2014. URL http://jmlr.org/papers/v15/hoffman14a.html

work page 2014
[16]

Hybrid monte carlo

Simon Duane, Anthony D Kennedy, Brian J Pendleton, and Duncan Roweth. Hybrid monte carlo. Physics letters B, 195(2):216–222, 1987

work page 1987
[17]

MCMC Using Hamiltonian Dynamics

Radford Neal. MCMC Using Hamiltonian Dynamics. CRC Press, May 2011. doi: 10.1201/b10905-6. URL http://dx.doi.org/10.1201/b10905-6

work page doi:10.1201/b10905-6 2011
[18]

Stochastic variational inference

Matthew D Hoffman, David M Blei, Chong Wang, and John Paisley. Stochastic variational inference. The Journal of Machine Learning Research, 14(1):1303–1347, 2013

work page 2013
[19]

Stan: A probabilistic programming language

Bob Carpenter, Andrew Gelman, Matthew D Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. Stan: A probabilistic programming language. Journal of statistical software, 76(1), 2017

work page 2017
[20]

Arnold, Dougal J

Allen Riddell, Ari Hartikainen, Daniel Lee, riddell stan, Marco Inacio, Daniel Chen, Kenneth C. Arnold, Dougal J. Sutherland, Aki Vehtari, Shinya SUZUKI, Takahiro Kubo, Todd Small, Tobias Erhardt, Stephen Hoover, Stephan Hoyer, Richard C Gerkin, Joerg Rings, Jackie, J. J. Ramsey, Aaron Darling, seantalts, Skipper Seabold, Max Shron, Liam Brannigan, Kyle F...

work page 2018
[21]

UCI machine learning repository, 2017

Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml

work page 2017
[22]

The kernel interaction trick: Fast Bayesian discovery of pairwise interactions in high dimensions

Raj Agrawal, Brian Trippe, Jonathan Huggins, and Tamara Broderick. The kernel interaction trick: Fast Bayesian discovery of pairwise interactions in high dimensions. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 14...

work page
[23]

URL http://proceedings.mlr.press/v97/agrawal19a.html

PMLR. URL http://proceedings.mlr.press/v97/agrawal19a.html

work page
[24]

Stan Modeling Language User’s Guide and Reference Manual, Version 2.18.0

Stan Development Team. Stan Modeling Language User’s Guide and Reference Manual, Version 2.18.0

work page
[25]

URL http://mc-stan.org

URL http://mc-stan.org. [Figure 3: A graphical representation of how binary trees are constructed in ITERATIVE BUILD TREE. The orange node is the leaf generated at the current step. Blue nodes are the leaves stored in memory for the purpose of checking the U-Turn condition. White nodes are past ...]

work page
