Position: agentic AI orchestration should be Bayes-consistent
Pith reviewed 2026-05-09 19:26 UTC · model grok-4.3
The pith
Agentic AI orchestration should apply Bayesian decision theory to maintain beliefs and choose actions under uncertainty.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Bayesian decision theory supplies a framework for agentic AI systems by maintaining beliefs over latent quantities, updating those beliefs from agentic and human-AI interactions, and selecting actions that reflect calibrated beliefs and utilities. This framework is most naturally placed at the orchestration layer rather than inside LLM parameters, because making LLMs themselves explicit Bayesian belief-updating engines remains computationally intensive and conceptually nontrivial. The paper articulates concrete properties for such Bayesian control and illustrates them with design patterns that fit modern agentic deployments.
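The core claim can be made concrete with a minimal sketch (not the paper's own design; tool names, costs, and the reward scale are hypothetical): an orchestrator holds conjugate Beta beliefs over each tool's success rate, updates them from observed outcomes, and selects the tool with the highest expected utility.

```python
class BayesianOrchestrator:
    """Minimal sketch: conjugate Beta-Bernoulli beliefs over tool success
    rates, maintained at the control layer rather than inside the LLM."""

    def __init__(self, tools, costs):
        # Beta(1, 1) uniform prior over each tool's success probability.
        self.beliefs = {t: [1.0, 1.0] for t in tools}
        self.costs = costs  # per-invocation cost of each tool

    def update(self, tool, succeeded):
        # Conjugate update: increment alpha on success, beta on failure.
        a, b = self.beliefs[tool]
        self.beliefs[tool] = [a + 1, b] if succeeded else [a, b + 1]

    def expected_utility(self, tool, reward=1.0):
        # Posterior-mean success probability times reward, minus cost.
        a, b = self.beliefs[tool]
        return reward * a / (a + b) - self.costs[tool]

    def choose(self):
        # Utility-aware action selection over the current belief state.
        return max(self.beliefs, key=self.expected_utility)


orch = BayesianOrchestrator(["search", "calculator"],
                            {"search": 0.2, "calculator": 0.05})
orch.update("search", succeeded=False)
orch.update("calculator", succeeded=True)
print(orch.choose())  # calculator: higher posterior mean, lower cost
```

An exploration-aware rule (Thompson sampling, an information-gain bonus) would replace the pure posterior-mean choice in practice; the point is only that belief maintenance and utility-aware action selection live entirely in the control layer, outside the LLM's parameters.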
What carries the argument
Bayesian belief updating and decision-making applied at the orchestration layer of an agentic AI system
If this is right
- Agentic systems can maintain and update coherent beliefs over task-relevant latent quantities across multiple interactions.
- Action selection at the orchestration level becomes utility-aware rather than purely heuristic.
- The approach applies without requiring LLMs themselves to perform explicit Bayesian updating.
- Design patterns become available for integrating calibrated uncertainty handling into existing agentic workflows.
- Human-AI collaboration improves when the control layer can represent and update beliefs about human inputs.
Where Pith is reading between the lines
- Orchestration-layer Bayesian control could be combined with existing LLM uncertainty estimates to create hybrid systems without retraining base models.
- This separation of concerns may scale more readily to multi-agent setups where different agents share a common belief state managed at the orchestration level.
- Empirical tests could compare decision regret or calibration error between Bayesian-orchestrated and standard agentic baselines on benchmark tasks involving tool use and planning.
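The benchmark comparison suggested in the last bullet can be prototyped as a toy simulation (all tool names and success probabilities are hypothetical): a Thompson-sampling orchestrator and an exploit-only heuristic face the same hidden per-tool success rates, and cumulative decision regret is the score.

```python
import random

def simulate(policy, probs, steps, rng):
    """Run a tool-selection policy against hidden per-tool success rates.
    Returns cumulative regret versus always choosing the best tool."""
    best = max(probs.values())
    counts = {t: [1.0, 1.0] for t in probs}  # Beta(1, 1) pseudo-counts
    regret = 0.0
    for _ in range(steps):
        tool = policy(counts, rng)
        success = rng.random() < probs[tool]
        a, b = counts[tool]
        counts[tool] = [a + 1, b] if success else [a, b + 1]
        regret += best - probs[tool]
    return regret

def thompson(counts, rng):
    # Bayesian policy: sample a rate from each posterior, pick the argmax.
    return max(counts, key=lambda t: rng.betavariate(*counts[t]))

def greedy(counts, rng):
    # Exploit-only heuristic: always take the highest posterior mean.
    return max(counts, key=lambda t: counts[t][0] / (counts[t][0] + counts[t][1]))

probs = {"tool_a": 0.3, "tool_b": 0.7}
r_bayes = simulate(thompson, probs, 500, random.Random(0))
r_greedy = simulate(greedy, probs, 500, random.Random(0))
print(f"regret: thompson={r_bayes:.1f} greedy={r_greedy:.1f}")
```

A single seeded run proves nothing on its own; a real test would average over many seeds and tasks and also report calibration error, but the harness shape stays this simple.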
Load-bearing premise
Bayesian belief updating and decision-making can be implemented practically at the orchestration layer of current agentic systems without prohibitive computational cost or conceptual barriers.
What would settle it
A non-Bayesian orchestration layer that matches the calibration and decision quality of a Bayesian one across repeated tool-selection and resource-allocation tasks under uncertainty would falsify the position.
Original abstract
LLMs excel at predictive tasks and complex reasoning tasks, but many high-value deployments rely on decisions under uncertainty, for example, which tool to call, which expert to consult, or how many resources to invest. While the usefulness and feasibility of Bayesian approaches remain unclear for LLM inference, this position paper argues that the control layer of an agentic AI system (that orchestrates LLMs and tools) is a clear case where Bayesian principles should shine. Bayesian decision theory provides a framework for agentic systems that can help to maintain beliefs over task-relevant latent quantities, to update these beliefs from observed agentic and human-AI interactions, and to choose actions. Making LLMs themselves explicitly Bayesian belief-updating engines remains computationally intensive and conceptually nontrivial as a general modeling target. In contrast, this paper argues that coherent decision-making requires Bayesian principles at the orchestration level of the agentic system, not necessarily the LLM agent parameters. This paper articulates practical properties for Bayesian control that fit modern agentic AI systems and human-AI collaboration, and provides concrete examples and design patterns to illustrate how calibrated beliefs and utility-aware policies can improve agentic AI orchestration.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This position paper argues that while applying Bayesian methods directly to LLM inference remains computationally and conceptually challenging, the orchestration layer of agentic AI systems should follow Bayesian decision theory. It claims that this layer should maintain beliefs over task-relevant latent quantities, update them from agentic and human-AI interactions, and select actions using utility-aware policies. The manuscript articulates practical properties for such Bayesian control, contrasts it with LLM-level Bayesianism, and supplies concrete examples and design patterns for modern agentic systems and human-AI collaboration.
Significance. If adopted, the position would provide a pragmatic framework for improving coherence in agentic AI decisions under uncertainty (e.g., tool selection, expert consultation, resource allocation) by leveraging standard Bayesian tools at the control layer rather than inside LLM parameters. The emphasis on design patterns and the explicit separation of orchestration from LLM internals is a constructive contribution that could guide implementations and empirical follow-up work.
major comments (1)
- [practical properties section] In the section articulating practical properties for Bayesian control: the claim that these properties 'fit modern agentic AI systems' without prohibitive computational cost is asserted but unsupported by any analysis, complexity discussion, or reference to existing implementations of belief maintenance at the orchestration layer; this assumption is load-bearing for the central advocacy that orchestration is a 'clear case' where Bayesian principles should shine.
minor comments (1)
- [Abstract] Abstract: the title uses the term 'Bayes-consistent' but the abstract and body primarily employ 'Bayesian principles' and 'Bayesian decision theory' without defining or aligning the terminology; a short clarifying sentence would improve consistency.
Simulated Author's Rebuttal
We thank the referee for the positive summary and significance assessment, as well as for identifying a point that can strengthen the manuscript. We address the major comment below and will revise accordingly.
Point-by-point responses
Referee: [practical properties section] In the section articulating practical properties for Bayesian control: the claim that these properties 'fit modern agentic AI systems' without prohibitive computational cost is asserted but unsupported by any analysis, complexity discussion, or reference to existing implementations of belief maintenance at the orchestration layer; this assumption is load-bearing for the central advocacy that orchestration is a 'clear case' where Bayesian principles should shine.
Authors: We agree that the current draft asserts computational practicality without dedicated analysis or references, which weakens the central claim. In the revision we will add a short subsection (or expanded paragraph) to the practical properties section that (i) sketches the relevant complexity: orchestration-level updates operate over low-dimensional task latents (typically 5–20 variables) using standard approximations such as Kalman filters, small-particle SMC, or variational message passing, each of which is linear or low-polynomial in the number of observations and far cheaper than gradient-based inference over billions of LLM parameters; (ii) cites existing agentic systems that already maintain lightweight beliefs at the control layer (e.g., uncertainty-aware tool-selection modules in LangChain/LlamaIndex extensions and Bayesian optimization loops in Auto-GPT-style agents); and (iii) contrasts this cost with the prohibitive expense of making the LLM itself a full Bayesian engine. These additions will make the "clear case" argument evidence-based rather than purely positional.
Revision: yes
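The rebuttal's complexity sketch for Kalman-style updates can be illustrated with a scalar example (illustrative only; the latency-tracking scenario and all numbers are hypothetical): each observation costs a constant number of arithmetic operations, independent of how long the interaction history is.

```python
def kalman_step(mean, var, obs, obs_var, process_var):
    """One scalar Kalman filter update: a constant number of arithmetic
    operations per observation, regardless of history length."""
    var += process_var            # predict: latent drifts, variance grows
    gain = var / (var + obs_var)  # weight observation by relative precision
    mean += gain * (obs - mean)   # pull belief mean toward the observation
    var *= 1.0 - gain             # posterior variance shrinks
    return mean, var

# Track a latent tool latency (seconds) from noisy observations.
mean, var = 1.0, 1.0  # diffuse prior
for obs in [1.8, 2.1, 1.9, 2.0]:
    mean, var = kalman_step(mean, var, obs, obs_var=0.25, process_var=0.01)
print(round(mean, 2), round(var, 3))
```

With a d-dimensional latent the same update is O(d^2) to O(d^3) per observation, which is the sense in which orchestration-level belief maintenance over a handful of task latents stays cheap relative to inference over LLM parameters.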
Circularity Check
No significant circularity detected
full rationale
This position paper advocates applying standard Bayesian decision theory (belief maintenance, updating, and utility-aware selection) specifically to the orchestration layer of agentic systems, while acknowledging computational barriers for making LLM inference itself Bayesian. It contains no mathematical derivations, equations, fitted parameters, or self-referential definitions that reduce claims to inputs by construction. The central argument rests on established external Bayesian principles applied to a new context, with concrete design patterns offered as illustrations rather than derivations. No self-citation chains, uniqueness theorems, or ansatzes are invoked in a load-bearing way that would create circularity.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Bayesian decision theory supplies a coherent framework for maintaining beliefs and selecting actions under uncertainty.