Learning from failure: Integrating negative examples when fine-tuning large language models as agents

Renxi Wang, Haonan Li, Xudong Han, Yixuan Zhang, Timothy Baldwin · 2024 · arXiv 2402.11651

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Reversa: A Reverse Documentation Engineering Framework for Converting Legacy Software into Operational Specifications for AI Agents

cs.SE · 2026-05-18 · conditional · novelty 7.0

Reversa is a reverse documentation engineering framework that deploys a multi-agent pipeline to extract implicit rules from legacy software and produce traceable specifications with confidence scores and explicit gaps for human review.

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

FATE lets LLM agents self-evolve safer behaviors by generating and filtering repairs from their own failure trajectories using verifiers and Pareto optimization.

SpanVLA: Efficient Action Bridging and Learning from Negative-Recovery Samples for Vision-Language-Action Model

cs.CV · 2026-04-21 · unverdicted · novelty 5.0

SpanVLA reduces action generation latency via flow-matching conditioned on history and improves robustness by training on negative-recovery samples with GRPO and a dedicated reasoning dataset.

AdaSwitch: Adaptive Switching between Small and Large Agents for Effective Cloud-Local Collaborative Learning

cs.CL · 2024-10-17 · unverdicted · novelty 5.0

AdaSwitch improves small local LLM performance on reasoning tasks by adaptively switching to a large cloud LLM upon detected errors, sometimes matching cloud results with far less overhead.

citing papers explorer

Showing 4 of 4 citing papers.

Reversa: A Reverse Documentation Engineering Framework for Converting Legacy Software into Operational Specifications for AI Agents cs.SE · 2026-05-18 · conditional · none · ref 35
Reversa is a reverse documentation engineering framework that deploys a multi-agent pipeline to extract implicit rules from legacy software and produce traceable specifications with confidence scores and explicit gaps for human review.
On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment cs.AI · 2026-05-12 · unverdicted · none · ref 40
FATE lets LLM agents self-evolve safer behaviors by generating and filtering repairs from their own failure trajectories using verifiers and Pareto optimization.
SpanVLA: Efficient Action Bridging and Learning from Negative-Recovery Samples for Vision-Language-Action Model cs.CV · 2026-04-21 · unverdicted · none · ref 67
SpanVLA reduces action generation latency via flow-matching conditioned on history and improves robustness by training on negative-recovery samples with GRPO and a dedicated reasoning dataset.
AdaSwitch: Adaptive Switching between Small and Large Agents for Effective Cloud-Local Collaborative Learning cs.CL · 2024-10-17 · unverdicted · none · ref 26
AdaSwitch improves small local LLM performance on reasoning tasks by adaptively switching to a large cloud LLM upon detected errors, sometimes matching cloud results with far less overhead.

Learning from failure: Integrating negative examples when fine-tuning large language models as agents

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer