Understanding pre-training and fine-tuning from loss landscape perspectives

2025 · arXiv 2505.17646

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Beyond Perplexity: A Geometric and Spectral Study of Low-Rank Pre-Training

cs.LG · 2026-05-13 · unverdicted · novelty 7.0 · 2 refs

Low-rank pre-training methods converge to geometrically and spectrally distinct basins and show diverging activations compared to full-rank training at 60M-350M scales.

RACC: Representation-Aware Coverage Criteria for LLM Safety Testing

cs.SE · 2026-02-02 · unverdicted · novelty 7.0

RACC defines six representation-aware coverage criteria that score jailbreak test suites by measuring activation of safety concepts extracted from LLM hidden states on a calibration set.

You Snooze, You Lose: Automatic Safety Alignment Restoration through Neural Weight Translation

cs.CR · 2026-05-06 · unverdicted · novelty 6.0

NeWTral is a non-linear weight translation framework using MoE routing that reduces average attack success rate from 70% to 13% on unsafe domain adapters across Llama, Mistral, Qwen, and Gemma models up to 72B while retaining 90% knowledge fidelity.

Nexus: Same Pretraining Loss, Better Downstream Generalization via Common Minima

cs.LG · 2026-04-10 · unverdicted · novelty 6.0

Nexus optimizer improves LLM downstream performance by converging to common minima across data sources despite identical pretraining loss.

Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods

cs.LG · 2025-10-12 · unverdicted · novelty 6.0

Preconditioned matrix norms unify steepest descent, quasi-Newton, and adaptive optimizers, revealing SGD, Adam, Muon, KL-Shampoo, SOAP, and SPlus as special cases and enabling new methods MuAdam and MuAdam-SANIA that are competitive in experiments.

Secure LLM Fine-Tuning via Safety-Aware Probing

cs.LG · 2025-05-22 · unverdicted · novelty 6.0

SAP locates safety-correlated directions via contrastive signals and perturbs hidden-state propagation with a lightweight probe to preserve safety while fine-tuning LLMs for task performance.

ReGA: Model-Based Safeguard for LLMs via Representation-Guided Abstraction

cs.CR · 2025-06-02 · unverdicted · novelty 5.0

ReGA uses safety-critical representations to guide abstraction in model-based analysis, enabling scalable detection of harmful LLM inputs with reported AUROC of 0.975 at prompt level.

citing papers explorer

Beyond Perplexity: A Geometric and Spectral Study of Low-Rank Pre-Training

cs.LG · 2026-05-13 · unverdicted · none · ref 2 · 2 links

RACC: Representation-Aware Coverage Criteria for LLM Safety Testing

cs.SE · 2026-02-02 · unverdicted · none · ref 14

You Snooze, You Lose: Automatic Safety Alignment Restoration through Neural Weight Translation

cs.CR · 2026-05-06 · unverdicted · none · ref 24

Nexus: Same Pretraining Loss, Better Downstream Generalization via Common Minima

cs.LG · 2026-04-10 · unverdicted · none · ref 4

Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods

cs.LG · 2025-10-12 · unverdicted · none · ref 6

Secure LLM Fine-Tuning via Safety-Aware Probing

cs.LG · 2025-05-22 · unverdicted · none · ref 12

ReGA: Model-Based Safeguard for LLMs via Representation-Guided Abstraction

cs.CR · 2025-06-02 · unverdicted · none · ref 47
