Teaching large language models to self-debug

Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou · 2023

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

cs.LG · 2025-03-18 · conditional · novelty 6.0

DAPO introduces decoupled clipping and dynamic sampling for LLM RL, achieving 50 on AIME 2024 with Qwen2.5-32B while fully open-sourcing code, data, and the verl-based training system.

Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents

cs.AI · 2023-06-05 · unverdicted · novelty 2.0

The paper introduces a collaborative multi-agent framework for LLMs and applies it conceptually to existing models like Auto-GPT, BabyAGI, and Gorilla through case studies in domains such as courtroom simulations and software development.

citing papers explorer


DAPO: An Open-Source LLM Reinforcement Learning System at Scale cs.LG · 2025-03-18 · conditional · none · ref 36
Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents cs.AI · 2023-06-05 · unverdicted · none · ref 9
