RAFFLES: Reasoning-based attribution of faults for LLM systems. arXiv preprint arXiv:2509.06822

Zhu, C · 2025 · arXiv 2509.06822

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Strained Coherence: A Pre-Failure Signal in Coding Agent Execution Trajectories

cs.LG · 2026-06-05 · unverdicted · novelty 6.0

Strained coherence, as flagged by a Claude judge on 44 coding trajectories, predicts failure (94% vs 46%, p=0.003), with partial replication on a second model.

StepFinder: A Temporal Semantic Framework for Failure Attribution in Multi-Agent Systems

cs.AI · 2026-06-02 · unverdicted · novelty 6.0

StepFinder turns execution logs into temporal semantic sequences via LLMs, then uses temporal modeling plus attention to attribute failures to specific steps, more accurately and 79% faster than direct LLM methods on the Who&When benchmark.

citing papers explorer


StepFinder: A Temporal Semantic Framework for Failure Attribution in Multi-Agent Systems

cs.AI · 2026-06-02 · unverdicted · none · ref 56
