pith. machine review for the scientific record.

Trail: Trace reasoning and agentic issue localization

12 Pith papers cite this work. Polarity classification is still indexing.

citation summary

years: 2026 (10), 2025 (2)

roles: background (3)

polarities: background (3)

representative citing papers

Holistic Evaluation and Failure Diagnosis of AI Agents

cs.AI · 2026-05-14 · unverdicted · novelty 7.0

A span-decomposed evaluation framework for AI agents achieves state-of-the-art results on GAIA and SWE-Bench with up to 3.5x gains in localization accuracy by breaking traces into independent per-span judgments.

Process-Centric Analysis of Agentic Software Systems

cs.SE · 2025-12-02 · unverdicted · novelty 6.0

Graphectory turns stochastic agent trajectories into analyzable graphs, showing that stronger models and successful fixes follow coherent localization-validation steps while failures are chaotic, and online detection plus rollback improves resolution rates by 6.9-23.5%.

Towards Self-Improving Error Diagnosis in Multi-Agent Systems

cs.MA · 2026-04-19 · unverdicted · novelty 5.0

ErrorProbe introduces a self-improving pipeline for attributing semantic failures in LLM multi-agent systems to specific agents and steps via anomaly detection, backward tracing, and tool-grounded validation with verified episodic memory.

A pragmatic approach to regulating AI agents

cs.CY · 2026-04-16 · unverdicted · novelty 5.0

AI agents require distinct regulation as AI systems under the EU AI Act, with orchestration-layer oversight and a risk-based traffic-light authorization system in contract law, to preserve human accountability.

A Survey of Context Engineering for Large Language Models

cs.CL · 2025-07-17 · accept · novelty 4.0

The survey organizes Context Engineering into retrieval, processing, management, and integrated systems such as RAG and multi-agent setups, and identifies an asymmetry: LLMs handle complex inputs well but struggle to produce equally sophisticated long outputs.
