arXiv: 2602.18647 · v2 · pith:I2QLG7LCnew · submitted 2026-02-20 · 💻 cs.LG · cs.AI · cs.CV · cs.IT · math.IT

Noise Scheduling as Information-Guided Allocation in Diffusion Training

classification 💻 cs.LG · cs.AI · cs.CV · cs.IT · math.IT
keywords noise · training · infonoise · schedule · allocation · denoising · fixed · profile

We introduce InfoNoise, an online adaptive noise schedule for diffusion training that reallocates optimization effort toward noise levels where denoising is most informative. Together with loss weighting, a noise schedule induces an effective allocation across denoising problems, often fixed before informative noise levels are known. InfoNoise makes this allocation data-adaptive by estimating a conditional-entropy-rate profile from denoising losses during training, without auxiliary models or offline search. Through I--MMSE, this profile identifies where noisy observations rapidly reduce uncertainty about the clean sample and guides adaptation of the training noise distribution. It changes only this distribution, keeping the objective, weighting, and parameterization fixed. On image benchmarks, where schedules have been extensively tuned, InfoNoise matches or slightly exceeds strong baselines and can reach the same quality with fewer updates. On representation, sequence, and modality shifts, including DNA and language generation, InfoNoise improves over fixed and adaptive baselines and reaches target quality with up to $3\times$ less training compute. These results establish the conditional-entropy-rate profile as the data-dependent target for noise schedule design and make online adaptation a practical alternative to manual schedule search.
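The adaptation loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's estimator. It assumes a much simpler proxy: denoising losses are binned by log noise level, smoothed with an exponential moving average, and the training noise distribution is drawn in proportion to that smoothed profile, mixed with a uniform floor so no bin is starved. The I-MMSE relation (for $Y = \sqrt{\gamma}\,X + N$ with standard Gaussian $N$, $\frac{d}{d\gamma} I(X;Y) = \tfrac{1}{2}\,\mathrm{mmse}(\gamma)$) is what motivates reading denoising error as an information rate. The class name `AdaptiveNoiseSampler`, the bin count, the log-sigma range, the decay, and the uniform mixing weight are all hypothetical choices made for illustration.

```python
import numpy as np


class AdaptiveNoiseSampler:
    """Hypothetical sketch: sample noise levels in proportion to a running,
    binned average of denoising loss. Not the estimator from the paper."""

    def __init__(self, log_sigma_min=-4.0, log_sigma_max=4.0, num_bins=32,
                 ema_decay=0.99, uniform_mix=0.1, seed=0):
        # Bin edges over log(sigma); all ranges here are assumed values.
        self.edges = np.linspace(log_sigma_min, log_sigma_max, num_bins + 1)
        self.loss_ema = np.ones(num_bins)   # flat profile before any data
        self.decay = ema_decay
        self.uniform_mix = uniform_mix
        self.rng = np.random.default_rng(seed)

    def probs(self):
        # Normalized profile, mixed with a uniform floor for exploration.
        p = self.loss_ema / self.loss_ema.sum()
        u = np.full_like(p, 1.0 / p.size)
        return (1.0 - self.uniform_mix) * p + self.uniform_mix * u

    def sample(self, batch_size):
        # Pick a bin per example, then draw log(sigma) uniformly inside it.
        bins = self.rng.choice(self.loss_ema.size, size=batch_size, p=self.probs())
        log_sigma = self.rng.uniform(self.edges[bins], self.edges[bins + 1])
        return np.exp(log_sigma)

    def update(self, sigmas, losses):
        # Assign each observed loss to its bin and update that bin's EMA.
        idx = np.searchsorted(self.edges, np.log(sigmas), side="right") - 1
        idx = np.clip(idx, 0, self.loss_ema.size - 1)
        for b in np.unique(idx):
            mean_loss = losses[idx == b].mean()
            self.loss_ema[b] = self.decay * self.loss_ema[b] + (1.0 - self.decay) * mean_loss
```

In a training loop, `sample` would supply the noise levels for each batch and `update` would be called with the per-example losses from that batch. The loss function, weighting, and parameterization are left untouched, consistent with the abstract's statement that only the training noise distribution changes.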

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Towards Closing the Autoregressive Gap in Language Modeling via Entropy-Gated Continuous Bitstream Diffusion

    cs.CL 2026-05 unverdicted novelty 7.0

    A 130M-parameter continuous bitstream diffusion model with entropy-gated Langevin sampling achieves GenPPL 59.76 on LM1B and 27.06 on OWT, closing the gap to autoregressive models at matched entropy with 256 NFEs.

  2. NoiseRater: Meta-Learned Noise Valuation for Diffusion Model Training

    cs.LG 2026-05 unverdicted novelty 6.0

    NoiseRater meta-learns instance-level importance scores for noise in diffusion training via bilevel optimization, then uses a two-stage pipeline to improve efficiency and generation quality on FFHQ and ImageNet.