FINCH is a loss-adaptive learning-rate schedule that reduces forgetting by 93% on average during LLM fine-tuning while matching standard task performance across several benchmarks.
How to alleviate catastrophic forgetting in llms finetuning? hierarchical layer-wise and element-wise regularization
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
DualSFT derives parameter masks and data subsets as row- and column-wise aggregations of one gradient interaction matrix under first- and second-order validation-improvement approximations.
MedXIAOHE is a medical MLLM that claims state-of-the-art benchmark performance through specialized pretraining to cover long-tail diseases and RL-based reasoning training.
citing papers explorer
-
Fine-Tuning Without Forgetting via Loss-Adaptive Learning Rates
FINCH is a loss-adaptive learning-rate schedule that reduces forgetting by 93% on average during LLM fine-tuning while matching standard task performance across several benchmarks.
-
One Algorithm, Two Goals: Dual Scoring for Parameter and Data Selection in LLM Fine-Tuning
DualSFT derives parameter masks and data subsets as row- and column-wise aggregations of one gradient interaction matrix under first- and second-order validation-improvement approximations.
-
MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs
MedXIAOHE is a medical MLLM that claims state-of-the-art benchmark performance through specialized pretraining to cover long-tail diseases and RL-based reasoning training.