Defines canonical credal set oracles as Knaster-Tarski least fixed points of isotone operators on closed credal sets, proving self-consistency and reduction to point estimates when non-performative.
hub Mixed citations
Superintelligent agents pose catastrophic risks: Can scientist ai offer a safer path?
Mixed citation behavior. Most common role is background (50%).
hub tools
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 13representative citing papers
NEO is a probabilistic neural model that induces compositional programs as a learned Language of Thought from non-textual observations and executes them via a shared transition model to enable explanation-driven generalization.
A disinterested Bayesian Predictor trained on contextualized statements has low probability of producing harmful agency because dangerous behaviors require rare coordinated underestimation of harm with no training signal favoring them.
Characterizes spurious correlation mechanisms in preference optimization via mean spurious bias and causal-spurious correlation leakage, demonstrates irreducible vulnerability to distribution shift, and introduces tie training as selective mitigation with validation on log-linear models and empirica
Neighbor-Consistency Belief (NCB) measures LLM belief robustness across conceptual neighborhoods, revealing that high-NCB facts resist contextual interference better, and Structure-Aware Training reduces brittleness by about 30%.
Agentic misalignment in multi-agent systems arises from generic utilities causing posterior collapse; Agentic Evidence Attribution using self-reflection or weak-to-strong generalization provides context-specific evidence to align agent posteriors.
AI value alignment is reconceptualized as a pluralistic governance problem arising along three axes—objectives, information, and principals—making it inherently context-dependent and unsolvable by technical design alone.
ADAPT is an online reweighting framework for LLM training that outperforms offline data selection and mixing methods in cross-benchmark generalization under equal compute.
LLM agents use a Cartesian split between learned prediction and engineered control, enabling modularity but creating sensitivity and bottlenecks unlike integrated biological systems.
AI risks arise from growth-oriented economies, and post-growth concepts such as satisficing, the Doughnut model, and resource caps can reduce those risks while prioritizing tool-like AI over agentic systems.
The paper unifies perspectives on Long CoT in reasoning LLMs by introducing a taxonomy, detailing characteristics of deep reasoning and reflection, and discussing emergence phenomena and future directions.
The paper proposes a four-role framework for LLMs in scientific innovation and reviews methods, benchmarks, and limitations across Assistant, Collaborator, Scientist, and Evaluator roles.
This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.
citing papers explorer
-
Learning to Theorize the World from Observation
NEO is a probabilistic neural model that induces compositional programs as a learned Language of Thought from non-textual observations and executes them via a shared transition model to enable explanation-driven generalization.