arXiv preprint arXiv:2402.02392
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 2
citation-polarity summary: background 2
citing papers explorer
- PriorZero: Bridging Language Priors and World Models for Decision Making
  PriorZero uses root-only LLM prior injection in MCTS and alternating world-model training with LLM fine-tuning to raise exploration efficiency and final performance on Jericho text games and BabyAI gridworlds.
- On the Role of Language Representations in Auto-Bidding: Findings and Implications
  SemBid injects LLM-encoded Task, History, and Strategy semantics as tokens into offline bidding trajectories and uses self-attention to outperform numerical-only baselines in performance, constraint satisfaction, and robustness.
- Understanding the Mechanism of Altruism in Large Language Models
  A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.
- User Detection and Response Patterns of Sycophantic Behavior in Conversational AI
  Reddit analysis shows users detect AI sycophancy through comparisons and consistency checks, apply mitigation prompts, and sometimes seek affirmative responses for support, indicating context-aware design is better than total elimination.
- Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Output Prefilling
  Output prefilling with a structured prefix steers LLMs to produce cleaner first tokens in MCQA, raising accuracy and calibration over standard first-token probability.
- TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning
  TokUR estimates token-level uncertainty via low-rank weight perturbations in LLMs, aggregates signals to correlate with correctness, and uses them to improve reasoning performance on math tasks.
- Large Language Models for Operations Research: A Comprehensive Survey
  A survey compiling roles, applications, benchmarks, challenges, and future directions for large language models in operations research.