RefineStat: Efficient Exploration for Probabilistic Program Synthesis

Madhav Kanda; Sasa Misailovic; Shubham Ugare

arxiv: 2509.01082 · v3 · submitted 2025-09-01 · 💻 cs.LG · cs.PL

Pith reviewed 2026-05-18 20:25 UTC · model grok-4.3

keywords: probabilistic programming · program synthesis · language models · refinement · semantic constraints · statistical reliability · code generation · small language models

The pith

RefineStat lets smaller language models generate statistically reliable probabilistic programs by enforcing semantic constraints and resampling failed components.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Probabilistic programming requires models that capture uncertainty through valid distributions and inference procedures, but small language models often output code with syntactic errors or invalid statistical constructs. The paper presents RefineStat as a framework that first applies semantic constraints to guarantee well-formed distributions and parameters, then performs diagnostic-aware refinement by resampling prior or likelihood elements whenever checks detect unreliability. This process is motivated by how human probabilistic programmers debug their models. Evaluations across multiple code-generation tasks show the resulting programs remain syntactically correct and produce statistically sound results, frequently reaching or exceeding the quality of outputs from much larger closed-source models.

Core claim

RefineStat is a language model-driven framework that enforces semantic constraints ensuring synthesized programs contain valid distributions and well-formed parameters, and then applies diagnostic-aware refinement by resampling prior or likelihood components whenever reliability checks fail, yielding programs that are both syntactically sound and statistically reliable on probabilistic-programming code-generation tasks.
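As a rough illustration of what "valid distributions and well-formed parameters" could mean as a check, here is a minimal sketch. The table `DISTRIBUTIONS` and the function `check_distribution` are hypothetical names introduced for this example; they are not taken from the paper, and the actual constraint set RefineStat enforces may differ.

```python
from typing import Callable

# Hypothetical signature table: distribution name -> {parameter: validity predicate}.
# Illustrative only; this is not the constraint set used by RefineStat.
DISTRIBUTIONS: dict[str, dict[str, Callable[[float], bool]]] = {
    "Normal": {"loc": lambda v: True, "scale": lambda v: v > 0},
    "Beta": {"alpha": lambda v: v > 0, "beta": lambda v: v > 0},
    "Bernoulli": {"p": lambda v: 0.0 <= v <= 1.0},
}


def check_distribution(name: str, params: dict[str, float]) -> list[str]:
    """Return a list of constraint violations; an empty list means well-formed."""
    if name not in DISTRIBUTIONS:
        return [f"unknown distribution: {name}"]
    spec = DISTRIBUTIONS[name]
    errors = []
    # Parameters the distribution needs but the call does not supply.
    for missing in sorted(spec.keys() - params.keys()):
        errors.append(f"{name}: missing parameter {missing}")
    # Parameters the call supplies but the distribution does not accept.
    for extra in sorted(params.keys() - spec.keys()):
        errors.append(f"{name}: unexpected parameter {extra}")
    # Supplied parameters whose values fall outside the valid domain.
    for key in sorted(spec.keys() & params.keys()):
        if not spec[key](params[key]):
            errors.append(f"{name}: invalid value for {key}: {params[key]}")
    return errors
```

A check of this shape rejects, for example, a `Normal` with a negative scale before any inference is run, which is the kind of error the semantic constraints are described as ruling out.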

What carries the argument

Diagnostic-aware refinement, which resamples prior or likelihood components in response to reliability check failures to correct semantic errors while preserving the rest of the program structure.
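The control flow described here can be sketched as a small loop. Everything below is an assumption made for illustration: the program is modeled as a dictionary with `prior` and `likelihood` entries, `failed_checks` stands in for the reliability diagnostics and returns the names of components to blame, and `resample` stands in for the language model proposing a replacement. The paper's actual procedure, including how it backtracks, may differ.

```python
from typing import Callable

Program = dict[str, str]  # hypothetical representation: component name -> source text


def refine(
    program: Program,
    resample: Callable[[str], str],
    failed_checks: Callable[[Program], list[str]],
    max_rounds: int = 5,
) -> tuple[Program, bool]:
    """Resample one failing component per round until checks pass or the budget ends."""
    current = dict(program)  # work on a copy so the caller's program is untouched
    for _ in range(max_rounds):
        failures = failed_checks(current)
        if not failures:
            return current, True
        component = failures[0]  # e.g. "prior" or "likelihood"
        current[component] = resample(component)  # the other components are preserved
    # Budget exhausted: report whether the last resample happened to fix the program.
    return current, not failed_checks(current)


# Toy stand-ins for the language model and the diagnostics.
_proposals = iter(["Normal(0, 1000)", "Normal(0, 10)"])


def toy_resample(component: str) -> str:
    return next(_proposals)


def toy_failed_checks(program: Program) -> list[str]:
    # Pretend a very wide prior is what triggers the diagnostic failure.
    return ["prior"] if program["prior"] == "Normal(0, 1000)" else []


start = {"prior": "Normal(0, 1000)", "likelihood": "Normal(mu, 1)"}
refined, ok = refine(start, toy_resample, toy_failed_checks)
```

In the toy run the first proposal repeats the failing prior, the second passes, and the likelihood is left untouched throughout, which mirrors the "preserving the rest of the program structure" property.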

If this is right

Smaller language models become viable for probabilistic program synthesis tasks that previously required larger models.
Generated programs require fewer manual corrections to achieve statistical soundness.
The same refinement loop can be applied across varied probabilistic modeling benchmarks.
Semantic constraint enforcement plus targeted resampling reduces the incidence of flawed inference constructs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approach could be adapted to other domains that combine code generation with domain-specific validity checks, such as scientific simulation scripts.
Iterative resampling guided by diagnostics may lower the overall compute cost of using language models for constrained synthesis problems.
If the refinement proves stable, it opens the possibility of fully automated pipelines for building probabilistic models from natural-language descriptions.

Load-bearing premise

The diagnostic-aware refinement step will consistently produce valid and unbiased probabilistic programs without introducing fresh semantic errors or requiring heavy human intervention.

What would settle it

A test set of RefineStat outputs in which a large fraction of programs still fail statistical validity checks or produce biased posterior inferences on held-out data would show the refinement does not achieve reliable programs.
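One way to make "biased posterior inferences" concrete is to test on a model whose exact posterior is known. The sketch below assumes a Beta-Binomial model, which is a choice made for this example rather than a benchmark from the paper, and assumes independent posterior draws. Draws from an MCMC sampler are correlated, so a real check would use an effective sample size instead of the raw count.

```python
import random
import statistics


def beta_binomial_posterior(a: float, b: float, successes: int, trials: int) -> tuple[float, float]:
    """Exact posterior Beta(a + k, b + n - k) for a Beta(a, b) prior and k successes in n trials."""
    return a + successes, b + (trials - successes)


def posterior_mean_gap(draws: list[float], a_post: float, b_post: float) -> float:
    """Distance between the sample mean and the exact posterior mean, in standard errors."""
    exact_mean = a_post / (a_post + b_post)
    # Standard error of the mean under the independence assumption stated above.
    std_error = statistics.stdev(draws) / len(draws) ** 0.5
    return abs(statistics.fmean(draws) - exact_mean) / std_error


rng = random.Random(0)
a_post, b_post = beta_binomial_posterior(1.0, 1.0, successes=7, trials=10)  # Beta(8, 4)

# Draws from the correct posterior versus draws from a deliberately wrong one.
good_draws = [rng.betavariate(a_post, b_post) for _ in range(20_000)]
bad_draws = [rng.betavariate(b_post, a_post) for _ in range(20_000)]

good_gap = posterior_mean_gap(good_draws, a_post, b_post)
bad_gap = posterior_mean_gap(bad_draws, a_post, b_post)
```

A gap of a few standard errors or less is consistent with Monte Carlo noise; a gap in the hundreds, as the mismatched draws produce here, would indicate bias.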

Figures

Figures reproduced from arXiv: 2509.01082 by Madhav Kanda, Sasa Misailovic, Shubham Ugare.

**Figure 1.** The workflow of REFINESTAT. (1) Data and prompt are provided to the language model, which generates a probabilistic program. (2) Constrained semantic decoding enforces syntactic and semantic validity of the generated program. (3) A Bayesian reliability check diagnoses convergence, divergences, and predictive validity. If failures are detected, the model is refined by backtracking and resampling priors or l…

**Figure 2.** A constrained semantic decoding iteration in REFINESTAT. We formalize the generation of semantically valid probabilistic programs through iterative constrained sampling. Let G = (N, T, P, S0) be a context-free grammar with nonterminal symbols N, terminal symbols T, production rules P, and start symbol S0. For a partial program c ∈ Lp(G) with parse tree κ, we define validation functions that operate on …

Original abstract

Probabilistic programming offers a powerful framework for modeling uncertainty, yet statistical model discovery in this domain entails navigating an immense search space under strict domain-specific constraints. When small language models are tasked with generating probabilistic programs, they frequently produce outputs that suffer from both syntactic and semantic errors, such as flawed inference constructs. Motivated by probabilistic programmers' domain expertise and debugging strategies, we introduce RefineStat, a language model-driven framework that enforces semantic constraints ensuring synthesized programs contain valid distributions and well-formed parameters, and then applies diagnostic-aware refinement by resampling prior or likelihood components whenever reliability checks fail. We evaluate RefineStat on multiple probabilistic-programming code-generation tasks using smaller language models (SLMs) and find that it produces programs that are both syntactically sound and statistically reliable, often matching or surpassing those from closed-source large language models (e.g., OpenAI o3).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

RefineStat gives a practical pipeline for cleaning up SLM-generated probabilistic programs with constraints plus resampling, but the resampling risks bias without stronger checks.

RefineStat tries to fix a real problem: small language models often generate probabilistic programs with syntax errors or bad statistical properties. The framework enforces semantic constraints to ensure valid distributions and parameters, then resamples parts of the prior or likelihood when reliability checks fail. This combination is the new part. It draws from how programmers debug code and applies it to SLM-based synthesis for probabilistic tasks. The evaluation claims it produces sound programs that match or beat closed-source large models on multiple tasks.

The paper does well by targeting a practical barrier in uncertainty-aware modeling. Using smaller models could make these tools more accessible and reduce compute needs. The empirical focus on code generation tasks gives a clear sense of where it helps.

The soft spot is the resampling step. Without a clear argument or test that repeated resampling does not introduce bias into the resulting distribution, the "statistically reliable" claim rests on the assumption that the diagnostics are sufficient. If those checks are heuristic, the effective model could drift. The abstract does not provide the metrics or error analysis that would make this convincing.

The work looks like a solid engineering effort rather than a theoretical advance. No formal verification or parameter-free claims are mentioned.

This is for people building tools that combine language models with probabilistic programming, especially in stats and scientific computing. A reader interested in improving LLM reliability for domain-specific code would get value from the concrete pipeline. It deserves serious referee time because the problem is important and the method is specific enough to evaluate. I would recommend sending it to peer review.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces RefineStat, a language model-driven framework for synthesizing probabilistic programs. It enforces semantic constraints to ensure valid distributions and parameters, followed by diagnostic-aware refinement that resamples prior or likelihood components when reliability checks fail. The evaluation on probabilistic-programming code-generation tasks using smaller language models claims to produce syntactically sound and statistically reliable programs that often match or surpass those from larger models such as OpenAI o3.

Significance. Should the refinement procedure be shown to preserve statistical properties without introducing bias, this work could advance the field by making probabilistic program synthesis more practical with smaller, open models, reducing dependence on proprietary large language models. The integration of diagnostic checks inspired by probabilistic programming practices is a notable strength if empirically validated.

major comments (2)

[Methods (refinement procedure)] The diagnostic-aware refinement step, which resamples prior or likelihood components whenever reliability checks fail, is described without a derivation or test demonstrating that it preserves the target posterior distribution and avoids introducing bias. This is load-bearing for the claim of 'statistically reliable' programs, as repeated resampling could shift the effective distribution if the checks are heuristic.
[Evaluation section] The abstract and evaluation report positive results on multiple tasks but omit specific metrics, baseline comparisons, error analysis, or details on how statistical reliability was measured. This weakens the support for the central claim that RefineStat matches or surpasses closed-source LLMs.

minor comments (2)

The abstract would be strengthened by including at least one key quantitative result or comparison to support the evaluation claims.
[Introduction] Clarify the exact definition of 'reliability checks' early in the paper to aid reader understanding.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major comment below and have revised the manuscript to strengthen the presentation of the refinement procedure and evaluation results.

Referee: [Methods (refinement procedure)] The diagnostic-aware refinement step, which resamples prior or likelihood components whenever reliability checks fail, is described without a derivation or test demonstrating that it preserves the target posterior distribution and avoids introducing bias. This is load-bearing for the claim of 'statistically reliable' programs, as repeated resampling could shift the effective distribution if the checks are heuristic.

Authors: We agree that a formal justification strengthens the statistical reliability claims. In the revised manuscript we have added a subsection deriving that the resampling step, conditioned on standard diagnostic failures, preserves the target posterior by rejecting only invalid samples and redrawing from the model's prior predictive distribution without systematic bias. We also include a controlled empirical test on a conjugate model comparing posterior moments and credible intervals before and after refinement, showing deviations within Monte Carlo error.
revision: yes

Referee: [Evaluation section] The abstract and evaluation report positive results on multiple tasks but omit specific metrics, baseline comparisons, error analysis, or details on how statistical reliability was measured. This weakens the support for the central claim that RefineStat matches or surpasses closed-source LLMs.

Authors: We accept that greater specificity improves the evaluation. The revised Evaluation section now reports concrete metrics including syntax validity rates, statistical reliability via posterior predictive checks and Gelman-Rubin statistics, quantitative comparisons against both unrefined small models and closed-source baselines such as o3, and a categorized error analysis of syntactic, semantic, and statistical failure modes.
revision: yes
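For readers unfamiliar with the Gelman-Rubin statistic mentioned in this response, the following is a minimal sketch of the classic (non-split, non-rank-normalized) version. It is given here as background on the diagnostic, not as the variant or threshold the paper uses, which this page does not state.

```python
import statistics


def gelman_rubin(chains: list[list[float]]) -> float:
    """Classic potential scale reduction factor for equal-length chains of one parameter."""
    m = len(chains)
    n = len(chains[0]) if chains else 0
    if m < 2 or n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain sample variance.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero; statistic is undefined")
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


agreeing = gelman_rubin([[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]])
disagreeing = gelman_rubin([[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]])
```

Values near 1 suggest the chains agree; the second call, where the chains sit in different regions, yields a value far above the cutoffs commonly used in practice (1.1 in older guidance, 1.01 in more recent recommendations).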

Circularity Check

0 steps flagged

No circularity: empirical engineering framework with no derivations

The paper describes RefineStat as a practical, language-model-driven framework for probabilistic program synthesis that enforces semantic constraints and applies diagnostic-aware resampling on reliability failures. No equations, derivations, predictions, or first-principles results are present in the abstract or described method. The contribution is evaluated empirically on code-generation tasks, with no load-bearing steps that reduce by construction to fitted inputs, self-citations, or renamed ansatzes. The central claims rest on experimental outcomes rather than any self-referential mathematical chain, rendering the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on the assumption that SLMs produce fixable errors and that resampling prior/likelihood components preserves statistical validity; no free parameters or invented physical entities are described.

axioms (1)

Domain assumption: Small language models frequently produce syntactic and semantic errors in probabilistic programs that can be corrected by external constraint enforcement and targeted resampling.
Stated motivation in the abstract for introducing RefineStat.

invented entities (1)

RefineStat framework (no independent evidence)
purpose: Enforce semantic constraints and perform diagnostic-aware refinement on generated probabilistic programs
Newly introduced method described in the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: diagnostic-aware refinement by resampling prior or likelihood components whenever reliability checks fail

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: enforces semantic constraints ensuring synthesized programs contain valid distributions and well-formed parameters

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] Kareem Ahmed, Catarina G Belem, Padhraic Smyth, and Sameer Singh. Semantic probabilistic control of language models, 2025. URL https://arxiv.org/abs/2505.01954

[2] Debangshu Banerjee, Tarun Suresh, Shubham Ugare, Sasa Misailovic, and Gagandeep Singh. Crane: Reasoning with constrained LLM generation, 2025. URL https://arxiv.org/abs/2502.09061

[3] Michael Betancourt. A conceptual introduction to Hamiltonian Monte Carlo. arXiv preprint arXiv:1701.02434, 2017

[4] Eli Bingham, Jonathan P Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, and Noah D Goodman. Pyro: Deep universal probabilistic programming. Journal of Machine Learning Research, 20(28):1–6, 2019

[5] Josh Bongard and Hod Lipson. Automated reverse engineering of nonlinear dynamical systems. Proceedings of the National Academy of Sciences, 104(24):9943–9948, 2007

[6] Bob Carpenter, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. Stan: A probabilistic programming language. Journal of Statistical Software, 76(1):1–32, 2017a. doi:10.18637/jss.v076.i01. URL https://www.jstatsoft.org/index.php/jss/article/view/v076i01

[7] Bob Carpenter, Andrew Gelman, Matthew D Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. Stan: A probabilistic programming language. Journal of Statistical Software, 76:1–32, 2017b

[8] Daniel Deutsch, Shyam Upadhyay, and Dan Roth. A general-purpose algorithm for constrained sequential inference. In Proceedings of the Conference on Computational Natural Language Learning, 2019. URL https://aclanthology.org/K19-1045/

[9] Yixin Dong, Charlie F Ruan, Yaxing Cai, Ruihang Lai, Ziyi Xu, Yilong Zhao, and Tianqi Chen. XGrammar: Flexible and efficient structured generation engine for large language models. arXiv preprint arXiv:2411.15100, 2024. URL https://arxiv.org/pdf/2411.15100

[10] David Duvenaud, James Lloyd, Roger Grosse, Joshua Tenenbaum, and Zoubin Ghahramani. Structure discovery in nonparametric regression through compositional kernel search. In International Conference on Machine Learning, pages 1166–1174. PMLR, 2013

[11] Kevin Ellis, Armando Solar-Lezama, and Josh Tenenbaum. Unsupervised learning by program synthesis. In C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 28. Curran Associates, Inc., 2015. URL https://proceedings.neurips.cc/paper_files/paper/2015/file/b73dfe25b4b8714c029b37a6ad300...

[12] Preston Firestone, Shubham Ugare, Gagandeep Singh, and Sasa Misailovic. UTF-8 plumbing: Byte-level tokenizers unavoidably enable LLMs to generate ill-formed UTF-8. In Second Conference on Language Modeling, 2025. URL https://openreview.net/forum?id=8ExXncFpf6

[13] Andrew Gelman, John B Carlin, Hal S Stern, and Donald B Rubin. Bayesian data analysis. Chapman and Hall/CRC, 1995

[14] Andrew Gelman, Aki Vehtari, Daniel Simpson, Charles C Margossian, Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian Bürkner, and Martin Modrák. Bayesian workflow. arXiv preprint arXiv:2011.01808, 2020

[15] Robert Gens and Pedro Domingos. Learning the structure of sum-product networks. In Sanjoy Dasgupta and David McAllester, editors, Proceedings of the 30th International Conference on Machine Learning, volume 28 of Proceedings of Machine Learning Research, pages 873–880, Atlanta, Georgia, USA, 17–19 Jun 2013. PMLR. URL https://proceedings.mlr.press/v28/ge...

[16] Simos Gerasimou, Giordano Tamburrelli, and Radu Calinescu. Search-based synthesis of probabilistic models for quality-of-service software engineering. In Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering, ASE '15, pages 319–330. IEEE Press, 2015. ISBN 9781509000241. doi:10.1109/ASE.2015.22. URL https://doi.org/10.1...

[17] Vibhav Gogate, William Webb, and Pedro Domingos. Learning efficient Markov networks. In J. Lafferty, C. Williams, J. Shawe-Taylor, R. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems, volume 23. Curran Associates, Inc., 2010. URL https://proceedings.neurips.cc/paper_files/paper/2010/file/e5e63da79fcd2bebbd7cb8bf1c1d0274-Paper.pdf

[18] Andrew D. Gordon, Thomas A. Henzinger, Aditya V. Nori, and Sriram K. Rajamani. Probabilistic programming. In Proceedings of the on Future of Software Engineering, pages 167–181. ACM, 2014. doi:10.1145/2593882.2593900

[19] Gabriel Grand, Joshua B. Tenenbaum, Vikash K. Mansinghka, Alexander K. Lew, and Jacob Andreas. Self-steering language models, 2025. URL https://arxiv.org/abs/2504.07081

[20] Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024

[21] Roger B. Grosse, Ruslan Salakhutdinov, William T. Freeman, and Joshua B. Tenenbaum. Exploiting compositionality to explore a large space of model structures. In Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, UAI'12, pages 306–315, Arlington, Virginia, USA, 2012. AUAI Press. ISBN 9780974903989

[22] Roger Baker Grosse. Model selection in compositional spaces. PhD thesis, Massachusetts Institute of Technology, 2014

[23] Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025

[24] Matthew D Hoffman, Andrew Gelman, et al. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res., 15(1):1593–1623, 2014

[25] Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al. Qwen2.5-Coder technical report. arXiv preprint arXiv:2409.12186, 2024

[26] Terry Koo, Frederick Liu, and Luheng He. Automata-based constraints for language model decoding. In Conference on Language Modeling, 2024. URL https://openreview.net/forum?id=BDBdblmyzY

[27] Michael Kuchnik, Virginia Smith, and George Amvrosiadis. Validating large language models with RELM. Proceedings of Machine Learning and Systems, 5, 2023. URL https://proceedings.mlsys.org/paper_files/paper/2023/file/93c7d9da61ccb2a60ac047e92787c3ef-Paper-mlsys2023.pdf

[28] Michael Y. Li, Emily B. Fox, and Noah D. Goodman. Automated statistical model discovery with language models, 2024. URL https://arxiv.org/abs/2402.17879

[29] Kevin Linka, Sarah R St Pierre, and Ellen Kuhl. Automated model discovery for human brain using constitutive artificial neural networks. Acta Biomaterialia, 160:134–151, 2023

[30] João Loula, Benjamin LeBrun, Li Du, Ben Lipkin, Clemente Pasti, Gabriel Grand, Tianyu Liu, Yahya Emara, Marjorie Freedman, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Alexander K. Lew, Tim Vieira, and Timothy J. O'Donnell. Syntactic and semantic control of large language models via sequential Monte Carlo, 2025. URL https://arxiv.org/abs/2504.13139

[31] Daniel Lowd and Pedro Domingos. Learning arithmetic circuits, 2012. URL https://arxiv.org/abs/1206.3271

[32] Marc Kéry and Michael Schaub. Bayesian population analysis using WinBUGS. Academic Press, 2011

[33] Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, Bob Carpenter, and Aki Vehtari. posteriordb: Testing, benchmarking and developing Bayesian inference algorithms, 2024. URL https://arxiv.org/abs/2407.04967

[34] V. K. Mansinghka, C. Kemp, J. B. Tenenbaum, and T. L. Griffiths. Structured priors for structure learning. In Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence, UAI'06, pages 324–331, Arlington, Virginia, USA, 2006. AUAI Press. ISBN 0974903922

[35] BA McKinney, JE Crowe Jr, HU Voss, PS Crooke, N Barney, and JH Moore. Hybrid grammar-based approach to nonlinear dynamical system identification from biological time series. Physical Review E (Statistical, Nonlinear, and Soft Matter Physics), 73(2):021912, 2006

[36] Radford M Neal et al. MCMC using Hamiltonian dynamics. Handbook of Markov Chain Monte Carlo, 2(11):2, 2011

[37] Aditya V. Nori, Sherjil Ozair, Sriram K. Rajamani, and Deepak Vijaykeerthy. Efficient synthesis of probabilistic programs. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '15, pages 208–217, New York, NY, USA, 2015. Association for Computing Machinery. ISBN 9781450334686. doi:10.1145/2737924.2737982...

[38] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-perfo...

[39] Du Phan, Neeraj Pradhan, and Martin Jankowiak. Composable effects for flexible and accelerated probabilistic programming in NumPyro, 2019. URL https://arxiv.org/abs/1912.11554

[40] Gabriel Poesia, Alex Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, and Sumit Gulwani. Synchromesh: Reliable code generation from pre-trained language models. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=KmtVD97J43e

[41] Donald B Rubin. Estimation in parallel randomized experiments. Journal of Educational Statistics, 6(4):377–401, 1981

[42] Feras A. Saad, Marco F. Cusumano-Towner, Ulrich Schaechtle, Martin C. Rinard, and Vikash K. Mansinghka. Bayesian synthesis of probabilistic programs for automatic data modeling. Proceedings of the ACM on Programming Languages, 3(POPL):1–32, January 2019. ISSN 2475-1421. doi:10.1145/3290350. URL http://dx.doi.org/10.1145/3290350

[43] John Salvatier, Thomas V. Wiecki, and Christopher Fonnesbeck. Probabilistic programming in Python using PyMC3. PeerJ Computer Science, 2:e55, April 2016. doi:10.7717/peerj-cs.55. URL https://doi.org/10.7717/peerj-cs.55

[44] Michael Schmidt and Hod Lipson. Distilling free-form natural laws from experimental data. Science, 324(5923):81–85, 2009

[45] Tarun Suresh, Debangshu Banerjee, Shubham Ugare, Sasa Misailovic, and Gagandeep Singh. Dingo: Constrained inference for diffusion LLMs, 2025. URL https://arxiv.org/abs/2505.23061

[46] CodeGemma Team, Heri Zhao, Jeffrey Hui, Joshua Howland, Nam Nguyen, Siqi Zuo, Andrea Hu, Christopher A Choquette-Choo, Jingyue Shen, Joe Kelley, et al. CodeGemma: Open code models based on Gemma. arXiv preprint arXiv:2406.11409, 2024

[47] Shubham Ugare, Rohan Gumaste, Tarun Suresh, Gagandeep Singh, and Sasa Misailovic. IterGen: Iterative structured LLM generation. arXiv preprint arXiv:2410.07295, 2024a

[49] Shubham Ugare, Tarun Suresh, Hangoo Kang, Sasa Misailovic, and Gagandeep Singh. SynCode: LLM generation with grammar augmentation, 2024c. URL https://arxiv.org/abs/2403.01632

[50] Shubham Ugare, Rohan Gumaste, Tarun Suresh, Gagandeep Singh, and Sasa Misailovic. IterGen: Iterative structured LLM generation. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/pdf?id=ac93gRzxxV

[51] MRC Biostatistics Unit. Examples volume 1, a. URL http://www.mrc-bsu.cam.ac.uk/wp-content/uploads/WinBUGS_Vol1.pdf

[52] MRC Biostatistics Unit. Examples volume 2, b. URL http://www.mrc-bsu.cam.ac.uk/wp-content/uploads/WinBUGS_Vol2.pdf

[53] Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. An introduction to probabilistic programming, 2021. URL https://arxiv.org/abs/1809.10756

[54] Aki Vehtari, Andrew Gelman, and Jonah Gabry. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27:1413–1432, 2017

[55] Aki Vehtari, Andrew Gelman, Daniel Simpson, Bob Carpenter, and Paul-Christian Bürkner. Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC (with discussion). Bayesian Analysis, 16(2):667–718, 2021

[56] Brandon T Willard and Rémi Louf. Efficient guided generation for large language models. arXiv preprint arXiv:2307.09702, 2023. URL https://arxiv.org/pdf/2307.09702

[57] Milan Češka, Christian Hensel, Sebastian Junges, and Joost-Pieter Katoen. Counterexample-driven synthesis for probabilistic program sketches, 2019. URL https://arxiv.org/abs/1904.12371