Guoxiong Gao, Haocheng Ju, Jiedong Jiang, Zihan Qin, and Bin Dong

ISSN 1364-6613 · 1999 · DOI 10.1016/s1364-6613(99)01294-2

5 Pith papers cite this work. Polarity classification is still indexing.



representative citing papers

Spectral Unforgetting: Post-Hoc Recovery of Damaged Capabilities Without Retraining

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

DG-Hard uses Donoho-Gavish hard thresholding on the fine-tuning weight delta to separate task-aligned signal from noise-like residual, recovering damaged capabilities while preserving target-task gains.

Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting

cs.LG · 2026-05-04 · unverdicted · novelty 6.0

Sharpness-aware pretraining and related flat-minima interventions reduce catastrophic forgetting by up to 80% after post-training across 20M-150M models and by 31-40% at 1B scale.

Awakening the Sleeping Agent: Lean-Specific Agentic Data Reactivates General Tool Use in Goedel Prover

cs.AI · 2026-04-09 · unverdicted · novelty 6.0

Heavy supervised fine-tuning on formal math suppresses tool-calling in Goedel-Prover-V2 from 89.4% to near 0%, but 100 Lean agentic traces restore it to 83.8% on the Berkeley Function Calling Leaderboard with in-domain gains on ProofNet.

Cyclic Adaptive Private Synthesis for Sharing Real-World Data in Education

cs.CY · 2026-02-09 · unverdicted · novelty 6.0

CAPS provides an iterative differentially private synthesis method that outperforms one-shot baselines on authentic educational real-world data.

ARROW: Augmented Replay for RObust World models

cs.LG · 2026-03-12 · unverdicted · novelty 5.0

ARROW adds a distribution-matching long-term replay buffer to DreamerV3 and shows reduced forgetting versus same-size baselines on Atari and Procgen continual RL benchmarks.

citing papers explorer


Spectral Unforgetting: Post-Hoc Recovery of Damaged Capabilities Without Retraining

cs.LG · 2026-05-19 · unverdicted · none · ref 3

DG-Hard uses Donoho-Gavish hard thresholding on the fine-tuning weight delta to separate task-aligned signal from noise-like residual, recovering damaged capabilities while preserving target-task gains.

Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting

cs.LG · 2026-05-04 · unverdicted · none · ref 59

Sharpness-aware pretraining and related flat-minima interventions reduce catastrophic forgetting by up to 80% after post-training across 20M-150M models and by 31-40% at 1B scale.

Awakening the Sleeping Agent: Lean-Specific Agentic Data Reactivates General Tool Use in Goedel Prover

cs.AI · 2026-04-09 · unverdicted · none · ref 2

Heavy supervised fine-tuning on formal math suppresses tool-calling in Goedel-Prover-V2 from 89.4% to near 0%, but 100 Lean agentic traces restore it to 83.8% on the Berkeley Function Calling Leaderboard with in-domain gains on ProofNet.

Cyclic Adaptive Private Synthesis for Sharing Real-World Data in Education

cs.CY · 2026-02-09 · unverdicted · none · ref 18

CAPS provides an iterative differentially private synthesis method that outperforms one-shot baselines on authentic educational real-world data.

ARROW: Augmented Replay for RObust World models

cs.LG · 2026-03-12 · unverdicted · none · ref 6

ARROW adds a distribution-matching long-term replay buffer to DreamerV3 and shows reduced forgetting versus same-size baselines on Atari and Procgen continual RL benchmarks.
