Empirical test shows top-1 argmax concentration has zero precision as collapse warning in DLM LoRA training due to pre-equilibrium saturation while max gradient norm provides usable but family-specific detection on short-horizon runs.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
When Top-1 Fails: Calibrating LoRA Monitors for Masked Diffusion LMs
Empirical test shows top-1 argmax concentration has zero precision as collapse warning in DLM LoRA training due to pre-equilibrium saturation while max gradient norm provides usable but family-specific detection on short-horizon runs.