A framework elicits discriminative MRF statistics from an LLM and closes the model via maximum entropy to enable zero-shot active feature acquisition, outperforming baselines on IBD patient data especially for hardest cases.
On the self-verification limitations of large language models on reasoning and planning tasks
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
CAPS is a four-stage inference-only cascade that adapts how much of each solution the verifier sees and how comparisons are distributed, halving per-candidate verifier tokens while outperforming uniform pairwise verification on most benchmarks.
Describes a conceptual agentic prototype for AI translation that operationalizes skopos theory and GEMBA-MQM verification into a four-stage cycle with user dialogue and memory for coherence.
Weighted rules extend stable model semantics to support probabilistic reasoning, model ranking, and statistical inference in answer set programs.
Entropy minimization on self-generated outputs elicits strong reasoning in pretrained LLMs, matching or exceeding supervised RL methods on benchmarks.
SPIN enforces DAG-valid plans and prefix-based stopping for LLM agents, cutting executed tasks from 1061 to 623 and tool calls from 11.81 to 6.82 per run on AssetOpsBench while raising success from 0.638 to 0.706.
U-Define improves user control in LLM planning by letting people define hard rules and soft preferences in natural language with matching verification methods, raising usefulness and satisfaction scores.
This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.
citing papers explorer
-
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Entropy minimization on self-generated outputs elicits strong reasoning in pretrained LLMs, matching or exceeding supervised RL methods on benchmarks.
-
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.