arXiv: 2605.18476 · v1 · pith:FNR3SASSnew · submitted 2026-05-18 · stat.CO · cs.AI · cs.LG

AI4BayesCode: From Natural Language Descriptions to Validated Modular Stateful Bayesian Samplers

Pith reviewed 2026-05-20 01:49 UTC · model grok-4.3

classification: stat.CO · cs.AI · cs.LG
keywords: MCMC · Bayesian modeling · LLM code generation · probabilistic programming · modular sampling · code validation · stateful coding

The pith

AI4BayesCode turns natural-language Bayesian model descriptions into validated modular MCMC samplers

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AI4BayesCode as an LLM-driven system that converts plain-English descriptions of Bayesian models into runnable MCMC sampling code. It works by splitting each model into separate sampling blocks that connect to ready-made components, then checking the input description before generation and the output code afterward for correctness. A new recursively stateful coding method lets these blocks fit together reliably even when they come from separate contributors. A reader would care because this setup could remove the need for users to write or debug complex sampling routines themselves, making advanced Bayesian inference more accessible.
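
One way to picture the check on the input description is a structural test on the model's dependency graph before any code is generated. The sketch below is only an illustration: the `check_model_spec` function, the dictionary format, and the regression variable names are assumptions made here, not the paper's interface. It rejects specifications that reference undeclared variables or that contain a cycle, and otherwise returns an order in which variables can be defined.

```python
from graphlib import CycleError, TopologicalSorter


def check_model_spec(parents):
    """Return a valid definition order, or raise ValueError on a bad spec.

    `parents` maps each variable name to the names it depends on.
    (Hypothetical format, chosen only for this illustration.)
    """
    declared = set(parents)
    for node, deps in parents.items():
        # Every referenced parent must itself be declared in the spec.
        missing = set(deps) - declared
        if missing:
            raise ValueError(f"{node} depends on undeclared {sorted(missing)}")
    try:
        # static_order() raises CycleError when the graph is not a DAG.
        return list(TopologicalSorter(parents).static_order())
    except CycleError as exc:
        raise ValueError(f"model spec contains a cycle: {exc.args[1]}") from exc


# Hypothetical spec for a Gaussian linear regression.
spec = {
    "mu": [], "eta": [], "tau": [],
    "beta": ["mu", "eta"],
    "sigma": ["tau"],
    "y": ["beta", "sigma"],
}
order = check_model_spec(spec)
```

A real pre-generation validator would presumably also check distribution supports and dimensions; this sketch covers only the graph structure.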

Core claim

AI4BayesCode is an extensible LLM-driven system that translates natural-language Bayesian model descriptions into runnable, validated MCMC samplers. Its modular design decomposes each model into sampling blocks and maps each block to a built-in sampling component. Pre-generation validation of the model specification and post-generation validation of the generated sampler code improve reliability. A novel recursively stateful coding paradigm allows modular sampling components to be composed coherently within larger MCMC procedures.
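
To make the idea of composable, stateful sampling blocks concrete, here is a minimal sketch under stated assumptions. The class names, the `step(params, rng)` interface, and the choice of a conjugate Gaussian model are all inventions for this illustration; the paper's actual block interface is not shown in this review. Each block owns its private state (here just an update counter), and a composite block exposes the same interface as its children, so composites can nest.

```python
import math
import random


class MeanBlock:
    """Conjugate Gibbs update for mu in y_i ~ N(mu, sigma2), mu ~ N(0, prior_var)."""

    def __init__(self, y, prior_var=100.0):
        self.y, self.prior_var = y, prior_var
        self.n_updates = 0  # block-local state kept between sweeps

    def step(self, params, rng):
        precision = len(self.y) / params["sigma2"] + 1.0 / self.prior_var
        mean = (sum(self.y) / params["sigma2"]) / precision
        params["mu"] = rng.gauss(mean, math.sqrt(1.0 / precision))
        self.n_updates += 1


class VarianceBlock:
    """Conjugate Gibbs update for sigma2 under an Inverse-Gamma(a, b) prior."""

    def __init__(self, y, a=2.0, b=2.0):
        self.y, self.a, self.b = y, a, b
        self.n_updates = 0

    def step(self, params, rng):
        sse = sum((yi - params["mu"]) ** 2 for yi in self.y)
        shape = self.a + len(self.y) / 2.0
        rate = self.b + sse / 2.0
        # If G ~ Gamma(shape, scale=1/rate), then 1/G ~ Inverse-Gamma(shape, rate).
        params["sigma2"] = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        self.n_updates += 1


class CompositeBlock:
    """A block built from child blocks; it exposes the same step() interface."""

    def __init__(self, children):
        self.children = children

    def step(self, params, rng):
        for child in self.children:
            child.step(params, rng)


def run(block, params, n_iter, rng):
    """Apply the block n_iter times and record a copy of the parameters each sweep."""
    draws = []
    for _ in range(n_iter):
        block.step(params, rng)
        draws.append(dict(params))
    return draws


rng = random.Random(0)
y = [rng.gauss(3.0, 1.0) for _ in range(200)]
mean_block, var_block = MeanBlock(y), VarianceBlock(y)
# Nesting a composite inside a composite works because both share step().
sampler = CompositeBlock([CompositeBlock([mean_block]), var_block])
draws = run(sampler, {"mu": 0.0, "sigma2": 1.0}, 500, rng)
```

The design point is that the outer loop never needs to know what state a block keeps; swapping `MeanBlock` for a gradient-based block with adaptation state would leave `run` unchanged.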

What carries the argument

Modular decomposition of models into sampling blocks mapped to built-in components, reinforced by pre- and post-generation validation and a recursively stateful coding paradigm that enables coherent composition across modules

If this is right

  • Users can implement a wide range of Bayesian models without coding sampling algorithms from scratch
  • New built-in sampling blocks can be added to expand the system's coverage over time
  • Modules developed independently by different contributors compose reliably thanks to the stateful paradigm
  • A dedicated benchmark suite supports systematic evaluation of sampler generation from descriptions
  • Overall performance improves as the underlying LLM advances and more components become available

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Non-experts could apply advanced MCMC methods to their data without first learning low-level implementation details
  • The same modular-plus-validation pattern might transfer to code generation for other inference algorithms
  • Models with unusual dependence structures could expose limits in how well the current blocks handle edge cases

Load-bearing premise

That modular breakdown into built-in sampling components plus validation steps is enough to produce correct and composable samplers for complex models without users writing algorithms themselves

What would settle it

A natural-language description of a hierarchical model whose generated sampler produces posterior samples that diverge from those of a manually verified reference implementation on the same dataset
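
A test of this kind needs a way to say when two sets of posterior draws "diverge". The following is one simple possibility, not a procedure taken from the paper: compare posterior means with a z-score whose standard error comes from non-overlapping batch means, which makes some allowance for autocorrelation in MCMC output. The function names and the synthetic draws are assumptions for illustration.

```python
import math
import random
import statistics


def batch_means_se(draws, n_batches=20):
    """Monte Carlo standard error of the mean via non-overlapping batch means."""
    size = len(draws) // n_batches
    means = [
        statistics.fmean(draws[i * size:(i + 1) * size]) for i in range(n_batches)
    ]
    return statistics.stdev(means) / math.sqrt(n_batches)


def mean_discrepancy_z(generated, reference):
    """z-score for the difference in posterior means of two independent chains."""
    diff = statistics.fmean(generated) - statistics.fmean(reference)
    se = math.hypot(batch_means_se(generated), batch_means_se(reference))
    return diff / se


# Synthetic stand-ins for draws of one scalar parameter.
rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(4000)]
agrees = [rng.gauss(0.0, 1.0) for _ in range(4000)]
biased = [rng.gauss(0.5, 1.0) for _ in range(4000)]

z_ok = mean_discrepancy_z(agrees, reference)    # should be modest in size
z_bad = mean_discrepancy_z(biased, reference)   # should be large and positive
```

A large absolute z-score flags disagreement in the mean only; variances or quantiles would need their own comparisons, and a hierarchical model would require the check for each parameter of interest.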

Figures

Figures reproduced from arXiv: 2605.18476 by Alex Ziyu Jiang, Jungang Zou, Qixuan Chen.

Figure 1. Overview of AI4BayesCode using a Spike-and-Slab regression model as an example. The system translates a natural-language Bayesian model description into a validated MCMC sampler through pre-generation validation, modular block decomposition and lookup, stateful sampler generation, and post-generation validation. DAG: directed acyclic graph. NUTS: no-U-turn sampler. RJMCMC: reversible-jump MCMC.
Figure 2. Three-tier architecture of AI4BayesCode.
Figure 3. [figure image: figures/full_fig_p016_3.png]
Figure 4. Results of Experiment 1. Each row corresponds to a model category, and each point represents a model. Bracketed ratios indicate the validated-sampler rate within each category after at most M = 5 attempts. Models with high R̂ (≥ 1.05) are highlighted in orange; red crosses represent reference samplers, where available. ESS and running time are plotted on log10 x-scales.
Figure 5. Results of Experiment 2. The colored dots represent successful, independently generated AI4BayesCode samplers, and the red crosses represent reference samplers. ESS and running time are plotted on log10 x-scales.
Original abstract

Coding and computation remain major bottlenecks in Markov chain Monte Carlo (MCMC) workflows, especially as modern sampling algorithms have become increasingly complex and existing probabilistic programming systems remain limited in model support, extensibility, and composability. We introduce AI4BayesCode, an extensible LLM-driven system that translates natural-language Bayesian model descriptions into runnable, validated MCMC samplers. To improve reliability, AI4BayesCode adopts a modular design that decomposes models into modular sampling blocks and maps each block to a built-in sampling component, reducing the need to implement complex sampling algorithms from scratch. Reliability is further improved through pre-generation validation of model specifications and post-generation validation of generated sampler code. AI4BayesCode also introduces a novel recursively stateful coding paradigm for MCMC, allowing modular sampling components, potentially developed by different contributors, to be composed coherently within larger MCMC procedures. We develop a benchmark suite to evaluate AI4BayesCode for sampler-generation. Experiments show that AI4BayesCode can implement a wide range of Bayesian models from natural-language descriptions alone. As an open-ended system, its capability can continue to expand with improvements in the underlying AI agent and the addition of new built-in blocks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces AI4BayesCode, an extensible LLM-driven system that translates natural-language Bayesian model descriptions into runnable, validated MCMC samplers. It employs a modular design that decomposes models into sampling blocks mapped to built-in components, incorporates pre- and post-generation validation, and introduces a recursively stateful coding paradigm to enable coherent composition of modular components potentially contributed by different developers. A benchmark suite is developed to evaluate sampler generation, with the claim that experiments demonstrate successful implementation of a wide range of Bayesian models from natural-language descriptions alone.

Significance. If the central claims hold, the work could meaningfully reduce the coding and implementation barriers in MCMC workflows by leveraging LLMs for model-to-sampler translation while emphasizing extensibility through new blocks and improved underlying agents. The modular decomposition and recursively stateful paradigm are presented as addressing composability limitations in existing probabilistic programming systems; these design choices merit credit as they aim to support community-driven expansion without requiring users to implement algorithms from scratch.

major comments (2)
  1. [Abstract] Abstract: the claim that 'experiments show that AI4BayesCode can implement a wide range of Bayesian models from natural-language descriptions alone' is unsupported by any reported quantitative success rates, failure modes, benchmark details, or error analysis, which is load-bearing for the central reliability claim.
  2. [Abstract / validation description] Post-generation validation (described in Abstract and implied in the method): the validation is stated to improve reliability of generated sampler code, yet no indication is given that it incorporates MCMC statistical diagnostics such as Gelman-Rubin convergence checks, effective sample size, or comparison against known posteriors; this leaves open whether generated samplers are merely runnable or actually sample from the target posterior, especially for non-conjugate or complex models.
minor comments (1)
  1. [Abstract] The recursively stateful coding paradigm is introduced as novel but would benefit from an explicit statement of its composition invariants in the main text to clarify how state is preserved across arbitrary module combinations.
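
For readers unfamiliar with the convergence diagnostic named in the second major comment, the classical Gelman-Rubin statistic can be computed as follows. This is a textbook sketch written for this review, not code from the paper, and it implements the original (non-split, non-rank-normalized) form; the 1.05 cutoff mentioned in the comments is the threshold quoted in the Figure 4 caption.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Classical R-hat for a list of equal-length chains of one scalar parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


rng = random.Random(2)
# Four chains targeting the same distribution: R-hat should sit close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Two chains stuck near 0 and two near 3: R-hat should exceed 1.05 by a wide margin.
stuck = [[rng.gauss(m, 1.0) for _ in range(1000)] for m in (0.0, 0.0, 3.0, 3.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

A value near 1 is necessary but not sufficient evidence of correctness: a sampler that consistently targets the wrong distribution can still pass, which is why the comment also asks for comparison against known posteriors.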

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments. We address each major comment below and indicate planned revisions to improve clarity and support for the central claims.

  1. Referee: [Abstract] Abstract: the claim that 'experiments show that AI4BayesCode can implement a wide range of Bayesian models from natural-language descriptions alone' is unsupported by any reported quantitative success rates, failure modes, benchmark details, or error analysis, which is load-bearing for the central reliability claim.

    Authors: We agree that the abstract would be strengthened by including quantitative details. The manuscript describes a benchmark suite and reports experimental results across multiple models; in the revised version we will update the abstract to report key success rates (e.g., fraction of natural-language descriptions that produced executable and validated samplers), note the main failure modes observed, and briefly reference the benchmark design. revision: yes

  2. Referee: [Abstract / validation description] Post-generation validation (described in Abstract and implied in the method): the validation is stated to improve reliability of generated sampler code, yet no indication is given that it incorporates MCMC statistical diagnostics such as Gelman-Rubin convergence checks, effective sample size, or comparison against known posteriors; this leaves open whether generated samplers are merely runnable or actually sample from the target posterior, especially for non-conjugate or complex models.

    Authors: We appreciate this clarification request. The post-generation validation currently checks syntactic correctness, successful execution, and consistency with the modular stateful composition rules. It does not yet perform statistical MCMC diagnostics such as Gelman-Rubin or effective sample size. We will revise the manuscript to explicitly state the current scope of validation and to note that rigorous posterior correctness verification via such diagnostics is an important direction for future extensions, particularly for non-conjugate models. revision: partial
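
    The first two checks listed in this response (syntactic correctness and successful execution) can be sketched in a few lines. Everything here is hypothetical: the function name, the `run_sampler` entry point, and the draw-count check are assumptions made for illustration, and the composition-rule check is omitted because its rules are not described in this review. Note that calling `exec` on generated code is unsafe outside a sandbox.

```python
import ast


def validate_generated_sampler(source, entry_point="run_sampler", n_draws=10):
    """Return (ok, message) after syntax, execution, and output-length checks."""
    try:
        tree = ast.parse(source)  # 1. syntactic correctness
    except SyntaxError as exc:
        return False, f"syntax error: {exc.msg}"
    namespace = {}
    try:
        # 2. the module body executes (sandbox this in any real setting)
        exec(compile(tree, "<generated>", "exec"), namespace)
        # 3. the assumed entry point runs and returns a sized collection
        count = len(namespace[entry_point](n_draws))
    except Exception as exc:
        return False, f"execution failed: {type(exc).__name__}: {exc}"
    if count != n_draws:
        return False, f"expected {n_draws} draws, got {count}"
    return True, "ok"


good = (
    "import random\n"
    "def run_sampler(n):\n"
    "    return [random.gauss(0.0, 1.0) for _ in range(n)]\n"
)
bad_syntax = "def run_sampler(n)\n    return []\n"
bad_runtime = "def run_sampler(n):\n    return [1 / 0 for _ in range(n)]\n"

results = [validate_generated_sampler(s) for s in (good, bad_syntax, bad_runtime)]
```

    Checks of this kind establish only that the code runs, which is the gap the referee's second comment points at.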

Circularity Check

0 steps flagged

No circularity; system claims rest on architecture and external benchmarks

The paper presents an LLM-based system for translating natural-language Bayesian model descriptions into modular MCMC samplers, with pre/post-generation validation and a recursively stateful paradigm. No equations, fitted parameters, or derivations are described that reduce outputs to inputs by construction. The central claims rely on a benchmark suite for evaluation and the extensibility of adding new blocks, which are independent of any self-referential loop. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The skeptic's concern about validation depth addresses empirical correctness rather than circularity in the derivation chain. This is a standard systems paper whose results are falsifiable via the benchmark and do not collapse to tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Based on the abstract alone, the central claim rests on the assumption that LLMs guided by modular structure and validation can reliably produce correct MCMC code, plus the introduction of a new coding paradigm without external falsifiable evidence beyond the system description itself.

axioms (1)
  • domain assumption: LLMs can reliably translate natural language Bayesian model descriptions into correct and composable MCMC sampler code when the model is decomposed into modular blocks with pre- and post-generation validation.
    This assumption underpins the reliability and extensibility claims of the entire system.
invented entities (1)
  • recursively stateful coding paradigm (no independent evidence)
    purpose: Enables coherent composition of modular sampling components, potentially developed by different contributors, within larger MCMC procedures.
    Presented as a novel technical contribution for stateful MCMC composition.

pith-pipeline@v0.9.0 · 5750 in / 1392 out tokens · 73181 ms · 2026-05-20T01:49:31.635866+00:00 · methodology
