AI4BayesCode: From Natural Language Descriptions to Validated Modular Stateful Bayesian Samplers

Alex Ziyu Jiang; Jungang Zou; Qixuan Chen

arXiv: 2605.18476 · v1 · pith:FNR3SASSnew · submitted 2026-05-18 · stat.CO · cs.AI · cs.LG

Pith reviewed 2026-05-20 01:49 UTC · model grok-4.3

classification stat.CO · cs.AI · cs.LG

keywords MCMC · Bayesian modeling · LLM code generation · probabilistic programming · modular sampling · code validation · stateful coding

The pith

AI4BayesCode turns natural-language Bayesian model descriptions into validated modular MCMC samplers

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AI4BayesCode as an LLM-driven system that converts plain-English descriptions of Bayesian models into runnable MCMC sampling code. It works by splitting each model into separate sampling blocks that connect to ready-made components, then checking the input description before generation and the output code afterward for correctness. A new recursively stateful coding method lets these blocks fit together reliably even when they come from separate contributors. A reader would care because this setup could remove the need for users to write or debug complex sampling routines themselves, making advanced Bayesian inference more accessible.

Core claim

AI4BayesCode is an extensible LLM-driven system that translates natural-language Bayesian model descriptions into runnable, validated MCMC samplers. It adopts a modular design that decomposes models into modular sampling blocks and maps each block to a built-in sampling component. Reliability is improved through pre-generation validation of model specifications and post-generation validation of generated sampler code. A novel recursively stateful coding paradigm allows modular sampling components to be composed coherently within larger MCMC procedures.

What carries the argument

Modular decomposition of models into sampling blocks mapped to built-in components, reinforced by pre- and post-generation validation and a recursively stateful coding paradigm that enables coherent composition across modules
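One way to picture blocks that keep their own state and still compose is sketched below. This is an illustration only, not the paper's interface: the class names `RandomWalkBlock` and `CompositeBlock`, the `update(state, rng)` method, and the toy Normal model are all assumptions made for the example. The point it tries to show is that a composite exposes the same method as a leaf block, so composites can nest, while each leaf carries private tuning state across iterations.

```python
import math
import random


class RandomWalkBlock:
    """Metropolis update for one scalar parameter (hypothetical interface).

    The block owns private state (proposal scale, acceptance counters) that
    persists across calls, separate from the shared parameter state.
    """

    def __init__(self, name, log_target, step):
        self.name = name
        self.log_target = log_target  # maps the shared state dict to a log density
        self.step = step
        self.accepted = 0
        self.calls = 0

    def update(self, state, rng):
        current = state[self.name]
        logp_old = self.log_target(state)
        state[self.name] = current + rng.gauss(0.0, self.step)
        logp_new = self.log_target(state)
        self.calls += 1
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < logp_new - logp_old:
            self.accepted += 1
        else:
            state[self.name] = current  # reject: restore the previous value


class CompositeBlock:
    """A block built from blocks; it exposes the same update() method,
    so composites can nest inside other composites."""

    def __init__(self, children):
        self.children = children

    def update(self, state, rng):
        for child in self.children:
            child.update(state, rng)


def make_log_posterior(y):
    """Toy target: y_i ~ Normal(mu, sigma), mu ~ Normal(0, 10), sigma ~ HalfNormal(5)."""
    def log_posterior(state):
        mu, sigma = state["mu"], state["sigma"]
        if sigma <= 0.0:
            return -math.inf
        loglik = sum(-math.log(sigma) - 0.5 * ((yi - mu) / sigma) ** 2 for yi in y)
        logprior = -0.5 * (mu / 10.0) ** 2 - 0.5 * (sigma / 5.0) ** 2
        return loglik + logprior
    return log_posterior


rng = random.Random(0)
y = [rng.gauss(2.0, 1.0) for _ in range(50)]
target = make_log_posterior(y)

mu_block = RandomWalkBlock("mu", target, step=0.3)
sigma_block = RandomWalkBlock("sigma", target, step=0.2)
# The sigma block is wrapped in its own composite to show nesting.
sampler = CompositeBlock([mu_block, CompositeBlock([sigma_block])])

state = {"mu": 0.0, "sigma": 1.0}
draws = []
for iteration in range(3000):
    sampler.update(state, rng)
    if iteration >= 1000:  # discard warm-up draws
        draws.append(dict(state))

posterior_mean_mu = sum(d["mu"] for d in draws) / len(draws)
```

The random-walk update stands in for the built-in components the paper describes (such as NUTS); it was chosen only because it fits in a few lines.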

If this is right

Users can implement a wide range of Bayesian models without coding sampling algorithms from scratch
New built-in sampling blocks can be added to expand the system's coverage over time
Modules developed independently by different contributors compose reliably thanks to the stateful paradigm
A dedicated benchmark suite supports systematic evaluation of sampler generation from descriptions
Overall performance improves as the underlying LLM advances and more components become available

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Non-experts could apply advanced MCMC methods to their data without first learning low-level implementation details
The same modular-plus-validation pattern might transfer to code generation for other inference algorithms
Models with unusual dependence structures could expose limits in how well the current blocks handle edge cases

Load-bearing premise

That modular breakdown into built-in sampling components plus validation steps is enough to produce correct and composable samplers for complex models without users writing algorithms themselves

What would settle it

A natural-language description of a hierarchical model whose generated sampler produces posterior samples that diverge from those of a manually verified reference implementation on the same dataset
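A comparison of this kind could be run per parameter along the following lines. The sketch assumes both samplers have already produced draws of the same scalar quantity; the function name `standardized_gap` and the tolerance of 0.1 posterior standard deviations are illustrative choices, not values from the paper. The draws here are simulated stand-ins rather than real sampler output.

```python
import math
import random
import statistics


def standardized_gap(draws_a, draws_b):
    """Absolute difference in means, in units of the pooled posterior sd."""
    pooled_var = 0.5 * (statistics.variance(draws_a) + statistics.variance(draws_b))
    return abs(statistics.fmean(draws_a) - statistics.fmean(draws_b)) / math.sqrt(pooled_var)


rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(4000)]  # stand-in for reference draws
agreeing = [rng.gauss(0.0, 1.0) for _ in range(4000)]   # sampler hitting the same target
shifted = [rng.gauss(0.5, 1.0) for _ in range(4000)]    # sampler hitting a wrong target

gap_agreeing = standardized_gap(reference, agreeing)
gap_shifted = standardized_gap(reference, shifted)
TOLERANCE = 0.1  # illustrative threshold, not taken from the paper
```

A gap above the tolerance would flag divergence in location only. Differences in spread or tail shape would need a further check, and autocorrelated draws call for a tolerance that accounts for Monte Carlo error.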

Figures

Figures reproduced from arXiv: 2605.18476 by Alex Ziyu Jiang, Jungang Zou, Qixuan Chen.

**Figure 1.** Overview of AI4BayesCode using a Spike-and-Slab regression model as an example. The system translates a natural-language Bayesian model description into a validated MCMC sampler through pre-generation validation, modular block decomposition and lookup, stateful sampler generation, and post-generation validation. DAG: directed acyclic graph. NUTS: no-U-turn sampler. RJMCMC: reversible-jump MCMC.

**Figure 2.** Three-tier architecture of AI4BayesCode.

Example 2. Three-tier update in Gaussian linear regression. Continuing Example 1, the wrapper decomposes the model and assigns σ and β to two separate modular blocks, each associated with its own conditional target: π(σ | y, Xβ, τ_σ) and π(β | y, X, σ, µ, η²). By default, AI4BayesCode uses the NUTS block for continuous parameters and thus assigns both modular blocks …

**Figure 3.** [PITH_FULL_IMAGE:figures/full_fig_p016_3.png]

**Figure 4.** Results of Experiment 1. Each row corresponds to a model category, and each point represents a model. Bracketed ratios indicate the validated-sampler rate within each category after at most M = 5 attempts. Models with high R̂ (≥ 1.05) are highlighted in orange; red crosses represent reference samplers, where available. ESS and running time are plotted on log10 x-scales.

**Figure 5.** Results of Experiment 2. The colored dots represent successful, independently generated AI4BayesCode samplers, and the red crosses represent reference samplers. ESS and running time are plotted on log10 x-scales.

5.4 Experiment 3: Validation beyond code generation alone. Finally, we evaluated the extent to which post-generation validation improves results beyond code generation alone. Using the same setting…
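The Figure 4 caption reports a validated-sampler rate after at most M = 5 attempts. A generate-then-validate loop with that cap might look like the sketch below. The callables `generate` and `validate`, and the idea of passing the validator's message back to the generator, are assumptions for illustration; the source does not say how retries are driven.

```python
def generate_until_valid(generate, validate, max_attempts=5):
    """Return (sampler_code, attempts_used), or (None, max_attempts) on failure.

    `generate(feedback)` and `validate(code)` are hypothetical callables;
    validate returns a pair (ok, message).
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        code = generate(feedback)
        ok, feedback = validate(code)
        if ok:
            return code, attempt
    return None, max_attempts


# Stand-ins for the LLM and the validator, for illustration only.
def fake_generate(feedback):
    return "sampler-v2" if feedback else "sampler-v1"


def fake_validate(code):
    if code == "sampler-v2":
        return True, None
    return False, "chain did not run"


code, attempts = generate_until_valid(fake_generate, fake_validate)

# Validated-sampler rate for a category: validated models / all models.
outcomes = [code is not None, True, False]  # made-up outcomes for three models
validated_rate = sum(outcomes) / len(outcomes)
```

Capping the attempts keeps cost bounded and makes the reported rate comparable across model categories.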

Original abstract

Coding and computation remain major bottlenecks in Markov chain Monte Carlo (MCMC) workflows, especially as modern sampling algorithms have become increasingly complex and existing probabilistic programming systems remain limited in model support, extensibility, and composability. We introduce AI4BayesCode, an extensible LLM-driven system that translates natural-language Bayesian model descriptions into runnable, validated MCMC samplers. To improve reliability, AI4BayesCode adopts a modular design that decomposes models into modular sampling blocks and maps each block to a built-in sampling component, reducing the need to implement complex sampling algorithms from scratch. Reliability is further improved through pre-generation validation of model specifications and post-generation validation of generated sampler code. AI4BayesCode also introduces a novel recursively stateful coding paradigm for MCMC, allowing modular sampling components, potentially developed by different contributors, to be composed coherently within larger MCMC procedures. We develop a benchmark suite to evaluate AI4BayesCode for sampler generation. Experiments show that AI4BayesCode can implement a wide range of Bayesian models from natural-language descriptions alone. As an open-ended system, its capability can continue to expand with improvements in the underlying AI agent and the addition of new built-in blocks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

AI4BayesCode introduces a modular LLM system with a recursively stateful coding idea for composing MCMC blocks, but the experiments are described too vaguely to show whether the generated samplers are actually statistically valid.

The main point is that this system aims to turn natural-language model descriptions into runnable MCMC samplers by breaking them into modular blocks mapped to built-in components, plus pre- and post-generation checks, and a new recursively stateful way to compose those blocks. That composition approach is the clearest bit of novelty here, since it tries to let pieces from different sources fit together without breaking the overall procedure.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces AI4BayesCode, an extensible LLM-driven system that translates natural-language Bayesian model descriptions into runnable, validated MCMC samplers. It employs a modular design that decomposes models into sampling blocks mapped to built-in components, incorporates pre- and post-generation validation, and introduces a recursively stateful coding paradigm to enable coherent composition of modular components potentially contributed by different developers. A benchmark suite is developed to evaluate sampler generation, with the claim that experiments demonstrate successful implementation of a wide range of Bayesian models from natural-language descriptions alone.

Significance. If the central claims hold, the work could meaningfully reduce the coding and implementation barriers in MCMC workflows by leveraging LLMs for model-to-sampler translation while emphasizing extensibility through new blocks and improved underlying agents. The modular decomposition and recursively stateful paradigm are presented as addressing composability limitations in existing probabilistic programming systems; these design choices merit credit as they aim to support community-driven expansion without requiring users to implement algorithms from scratch.

major comments (2)

[Abstract] The claim that 'experiments show that AI4BayesCode can implement a wide range of Bayesian models from natural-language descriptions alone' is load-bearing for the central reliability claim, yet the abstract supports it with no quantitative success rates, failure modes, benchmark details, or error analysis.
[Abstract / validation description] Post-generation validation (described in the abstract and implied in the method) is stated to improve the reliability of generated sampler code, yet nothing indicates that it incorporates MCMC statistical diagnostics such as Gelman-Rubin convergence checks, effective sample size, or comparison against known posteriors. This leaves open whether generated samplers are merely runnable or actually sample from the target posterior, especially for non-conjugate or complex models.

minor comments (1)

[Abstract] The recursively stateful coding paradigm is introduced as novel but would benefit from an explicit statement of its composition invariants in the main text to clarify how state is preserved across arbitrary module combinations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments. We address each major comment below and indicate planned revisions to improve clarity and support for the central claims.

Referee: [Abstract] The claim that 'experiments show that AI4BayesCode can implement a wide range of Bayesian models from natural-language descriptions alone' is load-bearing for the central reliability claim, yet the abstract supports it with no quantitative success rates, failure modes, benchmark details, or error analysis.

Authors: We agree that the abstract would be strengthened by including quantitative details. The manuscript describes a benchmark suite and reports experimental results across multiple models; in the revised version we will update the abstract to report key success rates (e.g., fraction of natural-language descriptions that produced executable and validated samplers), note the main failure modes observed, and briefly reference the benchmark design.

revision: yes

Referee: [Abstract / validation description] Post-generation validation (described in the abstract and implied in the method) is stated to improve the reliability of generated sampler code, yet nothing indicates that it incorporates MCMC statistical diagnostics such as Gelman-Rubin convergence checks, effective sample size, or comparison against known posteriors. This leaves open whether generated samplers are merely runnable or actually sample from the target posterior, especially for non-conjugate or complex models.

Authors: We appreciate this clarification request. The post-generation validation currently checks syntactic correctness, successful execution, and consistency with the modular stateful composition rules. It does not yet perform statistical MCMC diagnostics such as Gelman-Rubin or effective sample size. We will revise the manuscript to explicitly state the current scope of validation and to note that rigorous posterior correctness verification via such diagnostics is an important direction for future extensions, particularly for non-conjugate models.

revision: partial
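For readers unfamiliar with the Gelman-Rubin diagnostic raised in this exchange, a basic split-R̂ computation is sketched below. It is the plain split version for one scalar quantity with equal-length chains, not the rank-normalized variant and not a description of what AI4BayesCode runs. The 1.05 cutoff mirrors the threshold quoted in the Figure 4 caption; the simulated chains are stand-ins.

```python
import math
import random
import statistics


def split_rhat(chains):
    """Basic split R-hat for one scalar; chains are equal-length lists of draws."""
    halves = []
    for chain in chains:
        half = len(chain) // 2
        halves.append(chain[:half])
        halves.append(chain[half:2 * half])
    n = len(halves[0])
    # Mean within-chain variance across the split chains.
    within = statistics.fmean([statistics.variance(h) for h in halves])
    # Between-chain variance: n times the sample variance of the chain means.
    between = n * statistics.variance([statistics.fmean(h) for h in halves])
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


rng = random.Random(2)
# Four chains exploring the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four chains stuck around two different locations.
stuck = [[rng.gauss(center, 1.0) for _ in range(1000)] for center in (0.0, 0.0, 3.0, 3.0)]

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
```

Values near 1 indicate that the chains agree with one another; they do not by themselves show that the sampler targets the right posterior, which is the referee's separate concern about comparison against known posteriors.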

Circularity Check

0 steps flagged

No circularity; system claims rest on architecture and external benchmarks

The paper presents an LLM-based system for translating natural-language Bayesian model descriptions into modular MCMC samplers, with pre/post-generation validation and a recursively stateful paradigm. No equations, fitted parameters, or derivations are described that reduce outputs to inputs by construction. The central claims rely on a benchmark suite for evaluation and the extensibility of adding new blocks, which are independent of any self-referential loop. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The skeptic's concern about validation depth addresses empirical correctness rather than circularity in the derivation chain. This is a standard systems paper whose results are falsifiable via the benchmark and do not collapse to tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Based on the abstract alone, the central claim rests on the assumption that LLMs guided by modular structure and validation can reliably produce correct MCMC code, plus the introduction of a new coding paradigm without external falsifiable evidence beyond the system description itself.

axioms (1)

domain assumption: LLMs can reliably translate natural-language Bayesian model descriptions into correct and composable MCMC sampler code when the model is decomposed into modular blocks with pre- and post-generation validation.
This assumption underpins the reliability and extensibility claims of the entire system.

invented entities (1)

recursively stateful coding paradigm (no independent evidence)
purpose: Enables coherent composition of modular sampling components, potentially developed by different contributors, within larger MCMC procedures.
Presented as a novel technical contribution for stateful MCMC composition.

pith-pipeline@v0.9.0 · 5750 in / 1392 out tokens · 73181 ms · 2026-05-20T01:49:31.635866+00:00 · methodology

[1] [1]

Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package , year =

Sparapani, Rodney and Spanbauer, Charles and McCulloch, Robert , journal =. Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package , year =

work page

[2] [2]

Chipman and Edward I

Hugh A. Chipman and Edward I. George and Robert E. McCulloch , journal =. 2010 , number =

work page 2010

[3] [3]

Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , year =

Geman, Stuart and Geman, Donald , journal =. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , year =. doi:10.1109/TPAMI.1984.4767596 , keywords =

work page doi:10.1109/tpami.1984.4767596 1984

[4] [4]

and Rosenbluth, Marshall N

Metropolis, Nicholas and Rosenbluth, Arianna W. and Rosenbluth, Marshall N. and Teller, Augusta H. and Teller, Edward , journal =. Equation of State Calculations by Fast Computing Machines , year =. doi:10.1063/1.1699114 , interhash =

work page doi:10.1063/1.1699114

[5] [5]

Hastings, W. K. , journal =. Monte Carlo sampling methods using Markov chains and their applications , year =

work page

[6] [6]

Ieee Software , volume=

Migrating enterprise legacy source code to microservices: on multitenancy, statefulness, and data consistency , author=. Ieee Software , volume=. 2017 , publisher=

work page 2017

[7] [7]

M. J. Betancourt and Mark Girolami , title =. 2013 , archiveprefix =. 1312.0906 , primaryclass =

work page internal anchor Pith review Pith/arXiv arXiv 2013

[8] Daniel Furr. Hierarchical two-parameter logistic item response model.

[9] Bayesian population analysis using WinBUGS: a hierarchical perspective. 2011.

[10] Bayesian Cognitive Modeling. 2013.

[11] Computerized adaptive testing with item cloning. Applied Psychological Measurement, 2003.

[12] Estimating species richness and accumulation by modeling species occurrence and detectability. Ecology, 2006.

[13] Modeling radiocarbon dynamics in soils: SoilR version 1.1. Geoscientific Model Development, 2014.

[14] The Conservation of the Wild Life of Canada. New York.

[15] Peter Howard. 2009.

[16] Principles of physical biology. Baltimore: Waverly.

[17] Fluctuations in the abundance of a species considered mathematically. 1926.

[18] Variazioni e fluttuazioni del numero d'individui in specie animali conviventi. 1927.

[19] Feature selection: A data perspective. ACM Computing Surveys (CSUR), 2018.

[20] Sparsity information and regularization in the horseshoe and other shrinkage priors. Electronic Journal of Statistics, 2017.

[21] Forecasting at scale. The American Statistician, 2018.

[22] Selecting the Metric in Hamiltonian Monte Carlo. arXiv preprint arXiv:1905.11916.

[23] ggplot2: elegant graphics for data analysis. 2016.

[24] Modern applied statistics with S-PLUS. 2013.

[25] Data analysis using regression and multilevel/hierarchical models. 2006.

[26] Bayesian data analysis. 2013.

[27] Estimation in parallel randomized experiments. Journal of Educational Statistics, 1981.

[28] Latent Dirichlet allocation. Journal of Machine Learning Research.

[29] Cholera modeling: challenges to quantitative analysis and predicting the impact of interventions. Epidemiology (Cambridge, Mass.), 2012.

[30] Report 13: Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries.

[31] Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.

[32] Bayesian approach for neural networks - review and case studies. Neural Networks, 2001.

[33] Johan Jonasson and Måns Magnusson. 2021.

[34] Andras Farkas.

[35] Jil. Bird-habitat associations predict population trends in central European forest and farmland birds.

[36] Charles DiMaggio. Small-area spatiotemporal analysis of pedestrian and bicyclist injuries in. 2015.

[37] Fast hierarchical Gaussian processes. Manuscript in preparation.

[38] Euro-barometer 38.1: Consumer protection and perceptions of science and technology, November 1992. 1995.

[39] MRC Biostatistics Unit.

[40] Alex F. Roche, Howard Wainer, and David Thissen. Official Journal of the American Academy of Pediatrics.

[41] Alan E. Gelfand, Susan E. Hills, Amy Racine-Poon, and Adrian F. M. Smith. Journal of the American Statistical Association, 1990. doi:10.1080/01621459.1990.10474968

[42] Martin J. Crowder. Journal of the Royal Statistical Society: Series C (Applied Statistics). doi:10.2307/2346223. https://rss.onlinelibrary.wiley.com/doi/pdf/10.2307/2346223

[43] Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng. Handbook of Markov Chain Monte Carlo.

[44] Gareth O. Roberts and Richard L. Tweedie. Exponential convergence of Langevin distributions and their discrete approximations.

[45] Matthew D. Hoffman and Andrew Gelman. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo.

[46] Bob Carpenter, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. Stan: A Probabilistic Programming Language.

[47] David J. Lunn, Andrew Thomas, Nicky Best, and David Spiegelhalter. WinBUGS - A Bayesian modelling framework: Concepts, structure, and extensibility.

[48] Martyn Plummer and others. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling.

[49] Oriol Abril-Pla, Virgile Andreani, Colin Carroll, Larry Dong, Christopher J. Fonnesbeck, Maxim Kochurov, Ravin Kumar, Junpeng Lao, Christian C. Luhmann, Osvaldo A. Martin, Michael Osthege, Ricardo Vieira, Thomas Wiecki, and Robert Zinkov. 2023.

[50] Du Phan, Neeraj Pradhan, and Martin Jankowiak. Composable Effects for Flexible and Accelerated Probabilistic Programming in NumPyro. 2019. arXiv:1912.11554

[51] Tor Erlend Fjelde, Kai Xu, David Widmann, Mohamed Tarek, Cameron Pfiffer, Martin Trapp, Seth D. Axen, Xianda Sun, Markus Hauru, Penelope Yong, Will Tebbutt, Zoubin Ghahramani, and Hong Ge. Turing.jl: A General-Purpose Probabilistic Programming Language. doi:10.1145/3711897

[52] Michael A. Riegler, Kristoffer H. Hellton, Vajira Thambawita, and Hugo L. Hammer. Using large language models to suggest informative prior distributions in. 2025. doi:10.1038/s41598-025-18425-9

[53] Yongchao Huang. 2025. arXiv:2508.03766

[54] Alexander Capstick, Rahul G. Krishnan, and Payam Barnaghi. 2025. arXiv:2411.17284

[55] Jean Feng, Avni Kothari, Luke Zier, Chandan Singh, and Yan Shuo Tan. 2025. arXiv:2410.15555

[56] Kang Li, Jiawei Miao, Mihai Cucuringu, et al. Proceedings of the 6th ACM International Conference on AI in Finance. 2025. doi:10.1145/3768292.3770437

[57] Yongchao Huang. 2025. doi:10.5281/zenodo.16756724

[58] Huanting Wang, Jingzhi Gong, Huawei Zhang, Jie Xu, and Zheng Wang. AI agentic programming: A survey of techniques, challenges, and opportunities. 2025. arXiv:2508.11126

[59] Yuyao Ge, Lingrui Mei, Zenghao Duan, Tianhao Li, Yujia Zheng, Yiwei Wang, Lexin Wang, Jiayu Yao, Tianyu Liu, Yujun Cai, Baolong Bi, Fangda Guo, Jiafeng Guo, Shenghua Liu, and Xueqi Cheng. 2025. arXiv:2510.12399

[60] Avinash Anand, Akshit Gupta, Nishchay Yadav, and Shaurya Bajaj. 2024. arXiv:2411.07586

[61] Oliver Dürr. 2026. arXiv:2603.27766

[62] Xianda Sun, Andrew D. Gordon, and Hong Ge. Multi-Agent Systems for Traceable

[63] Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, Bob Carpenter, and Aki Vehtari. 2024. arXiv:2407.04967

[64] Aki Vehtari, Andrew Gelman, Daniel Simpson, Bob Carpenter, and Paul-Christian B. Bayesian Analysis, 2021.

[65] Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, and Shuvendu K. Lahiri. LLM-Based Test-Driven Interactive Code Generation: User Study and Empirical Evaluation. 2024. doi:10.1109/TSE.2024.3428972

[66] Aki Vehtari, Daniel Simpson, Andrew Gelman, Yuling Yao, and Jonah Gabry. Pareto smoothed importance sampling.

[67] Keith O'Hara.

[68] Allan Leal.

[69] Joachim Hartung, Guido Knapp, and Bikash K. Sinha. Meta-Regression.

[70] Antonio R. Linero. Generalized Bayesian Additive Regression Trees Models: Beyond Conditional Conjugacy. 2024. doi:10.1080/01621459.2024.2337156

[71] Maojun Sun, Ruijian Han, Binyan Jiang, Houduo Qi, Defeng Sun, Yancheng Yuan, and Jian Huang. LAMBDA: A Large Model Based Data Agent.

[72] Chenyu Zhou, Huacan Chai, Wenteng Chen, Zihan Guo, Rong Shan, Yuanyi Song, Tianyi Xu, Yingxuan Yang, Aofan Yu, Weiming Zhang, Congming Zheng, Jiachen Zhu, Zeyu Zheng, Zhuosheng Zhang, Xingyu Lou, Changwang Zhang, Zhihui Fu, Jun Wang, Weiwen Liu, Jianghao Lin, and Weinan Zhang. Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering. 2026. arXiv.

[73] Linyue Pan, Lexiao Zou, Shuo Guo, Jingchen Ni, and Hai-Tao Zheng. Natural-Language Agent Harnesses. 2026. arXiv:2603.25723

[74] Jiahang Lin, Shichun Liu, Chengjun Pan, Lizhi Lin, Shihan Dou, Xuanjing Huang, Hang Yan, Zhenhua Han, and Tao Gui. Agentic Harness Engineering: Observability-Driven Automatic Evolution of Coding-Agent Harnesses. 2026. arXiv:2604.25850

[75] Antonio R. Linero and Yun Yang. Bayesian Regression Tree Ensembles that Adapt to Smoothness and Sparsity.

[76] Hemant Ishwaran and Lancelot F. James. Gibbs Sampling Methods for Stick-Breaking Priors.