pith. machine review for the scientific record.

arxiv: 2508.21184 · v3 · submitted 2025-08-28 · 💻 cs.CL · cs.AI · stat.ML

Recognition: unknown

BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design

classification 💻 cs.CL · cs.AI · stat.ML
keywords design · bayesian · bed-llm · experimental · information · llms · approach · external

We propose a general-purpose approach for improving the ability of large language models (LLMs) to intelligently and adaptively gather information from a user or other external source using the framework of sequential Bayesian experimental design (BED). This enables LLMs to act as effective multi-turn conversational agents and interactively interface with external environments. Our approach, which we call BED-LLM (Bayesian experimental design with large language models), is based on iteratively choosing questions or queries that maximize the expected information gain (EIG) with respect to a variable of interest given the responses gathered previously. We show how this EIG can be formulated (and then estimated) in a principled way using a probabilistic model derived from the LLM's predictive distributions and provide detailed insights into key decisions in its construction and updating procedure. We find that BED-LLM achieves substantial gains in performance across a wide range of tests based on the 20 Questions game and using the LLM to actively infer user preferences, compared to purely prompting-based design generation and other adaptive design strategies.
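
The selection rule described in the abstract (pick the question that maximizes expected information gain, then update beliefs with the answer) can be illustrated with a toy sketch. This is not the paper's estimator: BED-LLM derives its probabilistic model from the LLM's predictive distributions, whereas here the hypothesis set, the prior, and the yes/no likelihood tables are hand-written assumptions, and every function name is hypothetical. For a discrete variable of interest θ and a yes/no answer y to question x, the sketch uses the standard identity EIG(x) = H[p(y | x)] − E_θ[H[p(y | θ, x)]].

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, likelihood):
    """EIG of a single yes/no question.

    prior:      dict mapping hypothesis -> p(theta)
    likelihood: dict mapping hypothesis -> p(answer == "yes" | theta, question)
    Both tables are illustrative assumptions, not outputs of an LLM.
    """
    # Marginal probability of a "yes" answer under the current belief.
    p_yes = sum(prior[h] * likelihood[h] for h in prior)
    marginal_entropy = entropy([p_yes, 1.0 - p_yes])
    # Expected entropy of the answer once the hypothesis is known.
    conditional_entropy = sum(
        prior[h] * entropy([likelihood[h], 1.0 - likelihood[h]]) for h in prior
    )
    return marginal_entropy - conditional_entropy


def update(prior, likelihood, answer_yes):
    """Bayes update of the belief after observing a yes/no answer."""
    unnormalized = {
        h: prior[h] * (likelihood[h] if answer_yes else 1.0 - likelihood[h])
        for h in prior
    }
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}


# Toy 20 Questions setup: four equally likely hidden objects.
prior = {"cat": 0.25, "dog": 0.25, "eagle": 0.25, "shark": 0.25}

# Hand-specified answer probabilities for two candidate questions.
questions = {
    "Is it a mammal?": {"cat": 1.0, "dog": 1.0, "eagle": 0.0, "shark": 0.0},
    "Is it a cat?": {"cat": 1.0, "dog": 0.0, "eagle": 0.0, "shark": 0.0},
}

eig = {q: expected_information_gain(prior, lik) for q, lik in questions.items()}
best_question = max(eig, key=eig.get)

# Suppose the user answers "yes" to the selected question.
posterior = update(prior, questions[best_question], answer_yes=True)

print(best_question, eig)
print(posterior)
```

In this toy setting the question that splits the hypotheses evenly scores higher than the direct guess, which is the behavior an EIG-maximizing policy is meant to capture. In a sequential loop, the posterior would become the prior for the next round of question selection.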

This paper has not been read by Pith yet.


Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Uncertainty Propagation in LLM-Based Systems

    cs.SE 2026-04 unverdicted novelty 7.0

    This paper introduces a systems-level conceptual framing and a three-level taxonomy (intra-model, system-level, socio-technical) for uncertainty propagation in compound LLM applications, along with engineering insight...

  2. LLMs are not (consistently) Bayesian: Quantifying internal (in)consistencies of LLMs' probabilistic beliefs

    cs.LG 2026-05 unverdicted novelty 6.0

    LLMs do not consistently perform Bayesian updates on probabilistic beliefs; heuristic approaches often outperform exact Bayesian computation on downstream tasks, indicating misspecified internal models of the world.

  3. The Perceptual Bandwidth Bottleneck in Vision-Language Models: Active Visual Reasoning via Sequential Experimental Design

    cs.CV 2026-05 unverdicted novelty 6.0

    VLMs improve high-resolution reasoning by framing it as sequential Bayesian optimal experimental design, using a coverage-resolution proxy and the FOVEA procedure to acquire task-relevant visual evidence, yielding gai...

  4. The Perceptual Bandwidth Bottleneck in Vision-Language Models: Active Visual Reasoning via Sequential Experimental Design

    cs.CV 2026-05 unverdicted novelty 6.0

    VLMs suffer from a perceptual bandwidth bottleneck; the paper formalizes active visual reasoning as sequential Bayesian optimal experimental design, derives a coverage-resolution proxy objective, and introduces the tr...

  5. Planning to Explore: Curiosity-Driven Planning for LLM Test Generation

    cs.SE 2026-04 unverdicted novelty 6.0

    CovQValue achieves 51-77% higher branch coverage than greedy baselines on TestGenEval Lite by using coverage feedback and LLM-estimated Q-values to select informative test plans.

  6. Statistics, Not Scale: Modular Medical Dialogue with Bayesian Belief Engine

    cs.LG 2026-04 unverdicted novelty 5.0

    BMBE separates LLM language handling from a standalone Bayesian diagnostic engine, producing calibrated selective diagnosis, a performance gap over frontier LLMs, and robustness to adversarial inputs.