Proceedings of the AAAI Conference on Artificial Intelligence

· 2024 · DOI 10.1609/aaai.v38i16.29728

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Don't Blindly Trust It: How Unreliable Feedback Breaks Tool-Using LLM Agents

cs.AI · 2026-06-19 · unverdicted · novelty 6.0

Misleading tool feedback produces value inversion in LLM agents, with performance dropping below matched no-feedback baselines on HotpotQA and similar tasks.

A Systems-Level Analysis of Sensitivity, Robustness, and Stability in Retrieval-Augmented Generation

cs.IR · 2026-05-29 · unverdicted · novelty 4.0

Empirical runs across 56 settings on a fixed 500-question set show non-monotonic downstream scores and preprocessing losses, leading to a call for multi-stage RAG evaluation.

citing papers explorer

Don't Blindly Trust It: How Unreliable Feedback Breaks Tool-Using LLM Agents
cs.AI · 2026-06-19 · unverdicted · none · ref 35

A Systems-Level Analysis of Sensitivity, Robustness, and Stability in Retrieval-Augmented Generation
cs.IR · 2026-05-29 · unverdicted · none · ref 11
