Cyclic Denoising Reveals Ultrastable Memories in Diffusion Models
Pith reviewed 2026-06-26 08:26 UTC · model grok-4.3
The pith
Cyclic denoising extracts ultrastable attractors that match memorized training images from diffusion models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Cyclic denoising exposes ultrastable attractors in diffusion models that regenerate after near-total corruption and persist through thousands of noising-denoising cycles. Many of these attractors correspond to memorized training images. The protocol works in both latent-space models such as Stable Diffusion v1.4 and pixel-space DDPMs, requires no gradients or conditioning, and displays a yielding-like transition: low noise amplitudes produce trivial fixed points while larger amplitudes produce basin hopping and long-lived trapping in structured memorized basins.
What carries the argument
Cyclic denoising: repeated forward and reverse diffusion at controlled noise amplitudes that drives samples toward attractors with a broad stability spectrum.
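The loop can be illustrated with a toy stand-in, offered as a sketch and not as the paper's implementation. It assumes a variance-preserving forward step, x_t = sqrt(alpha_bar) * x + sqrt(1 - alpha_bar) * eps, and replaces the multi-step reverse sampler with a single-step posterior-mean denoiser, which is exact only when the data distribution is a small set of known training points. The names `posterior_mean_denoiser`, `cyclic_denoise`, and `alpha_bar` are hypothetical; a real audit would call the model's own sampler instead.

```python
import numpy as np

def posterior_mean_denoiser(x_t, train, alpha_bar):
    """E[x0 | x_t] for an empirical data distribution (toy assumption)."""
    a, var = np.sqrt(alpha_bar), 1.0 - alpha_bar
    # Squared distance from x_t to each scaled training point.
    d2 = np.sum((x_t[None, :] - a * train) ** 2, axis=1)
    logits = -d2 / (2.0 * var)
    w = np.exp(logits - logits.max())  # numerically stabilised softmax weights
    w /= w.sum()
    return w @ train

def cyclic_denoise(x0, train, alpha_bar, n_cycles, rng):
    """Alternate forward noising and one-step denoising; return the trajectory."""
    x = np.asarray(x0, dtype=np.float64).copy()
    traj = [x.copy()]
    for _ in range(n_cycles):
        eps = rng.standard_normal(x.shape)
        x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps  # forward step
        x = posterior_mean_denoiser(x_t, train, alpha_bar)             # reverse step
        traj.append(x.copy())
    return np.stack(traj)

# Toy "training set": three well-separated points in 16 dimensions.
train = 10.0 * np.eye(3, 16)
rng = np.random.default_rng(0)
traj = cyclic_denoise(rng.standard_normal(16), train, alpha_bar=0.5, n_cycles=50, rng=rng)
dist_to_train = np.linalg.norm(traj[-1][None, :] - train, axis=1)
print("distance from final state to each training point:", dist_to_train)
```

In this toy setting the final state lands on one of the training points, which is the kind of trapping the protocol is meant to expose. Nothing here speaks to how Stable Diffusion v1.4 or a trained DDPM behaves.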
If this is right
- Ultrastable attractors regenerate after near-total corruption and persist through thousands of cycles.
- Many attractors correspond to memorized training images including stock photographs, brand watermarks, and web-crawl artifacts.
- The attack works fully unconditioned and requires only sampler-level control with no gradients, weights, or prompts.
- Noise amplitude controls a yielding-like transition from trivial fixed points to rearrangements and trapping in memorized basins.
- The recovered attractor set shows hierarchical partial absorption, prompt-stabilized basins, and universality across different initial conditions.
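One way to probe the amplitude dependence listed above is to sweep the noise level and count how often the state changes basin. The sketch below does this in a toy model and is not the paper's measurement: the one-step posterior-mean denoiser, the three-point training set, and the names `cycle_once` and `hop_count` are all assumptions made for illustration. The toy has no trivial fixed points distinct from training points, so it only illustrates the trapping and hopping side of the transition.

```python
import numpy as np

def cycle_once(x, train, alpha_bar, rng):
    """One noising step followed by a one-step posterior-mean denoise (toy model)."""
    a, var = np.sqrt(alpha_bar), 1.0 - alpha_bar
    x_t = a * x + np.sqrt(var) * rng.standard_normal(x.shape)
    logits = -np.sum((x_t[None, :] - a * train) ** 2, axis=1) / (2.0 * var)
    w = np.exp(logits - logits.max())
    return (w / w.sum()) @ train

def hop_count(train, alpha_bar, n_cycles, seed):
    """Count changes of the nearest training point between consecutive cycles."""
    rng = np.random.default_rng(seed)
    # First cycle from random noise sets the initial basin.
    x = cycle_once(rng.standard_normal(train.shape[1]), train, alpha_bar, rng)
    nearest = np.argmin(np.linalg.norm(x[None, :] - train, axis=1))
    hops = 0
    for _ in range(n_cycles):
        x = cycle_once(x, train, alpha_bar, rng)
        new = np.argmin(np.linalg.norm(x[None, :] - train, axis=1))
        hops += int(new != nearest)
        nearest = new
    return hops

train = 10.0 * np.eye(3, 16)
# Smaller alpha_bar means a larger noise amplitude per cycle.
sweep = {ab: hop_count(train, ab, n_cycles=200, seed=0) for ab in (0.99, 0.5, 0.01)}
print(sweep)
```

In this toy the state stays in one basin at the smallest amplitude and hops between basins at the largest. A real audit would need its own operational definition of a basin and of a hop.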
Where Pith is reading between the lines
- Developers could run cyclic denoising on their own models before deployment to locate and remove memorized content.
- The same cycling procedure might serve as a general probe for memorization in other generative architectures beyond diffusion.
- Adding an explicit membership-inference verification step would make the attack more robust against false positives.
- The observed cross-initial-condition universality suggests the memorized basins occupy a sizable fraction of the model's measure.
Load-bearing premise
The recovered ultrastable attractors are verifiably memorized training images rather than model-generated artifacts that happen to resemble training data.
What would settle it
Direct comparison of the extracted attractor images against the full training set shows that none of them match any training example, or the attractors fail to reappear after a second round of near-total corruption.
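A comparison of this kind could be sketched as a nearest-neighbor search with a threshold calibrated on control images. This is an illustration under stated assumptions, not a protocol from the paper: the pixel-space L2 distance is a stand-in for whatever similarity metric an audit would actually use, `controls` are assumed to be images known to lie outside the training set, and every function name here is hypothetical.

```python
import numpy as np

def nearest_train_distance(queries, train):
    """Per-pixel RMS distance from each query to its closest training image."""
    q = queries.reshape(len(queries), -1).astype(np.float64)
    t = train.reshape(len(train), -1).astype(np.float64)
    # Pairwise squared distances via ||q||^2 - 2 q.t + ||t||^2.
    d2 = (q ** 2).sum(1)[:, None] - 2.0 * q @ t.T + (t ** 2).sum(1)[None, :]
    d = np.sqrt(np.maximum(d2, 0.0) / q.shape[1])
    return d.min(axis=1), d.argmin(axis=1)

def calibrate_threshold(controls, train, false_positive_rate=0.01):
    """Distance below which only a small fraction of non-training controls fall."""
    d, _ = nearest_train_distance(controls, train)
    return float(np.quantile(d, false_positive_rate))

def flag_memorized(candidates, train, threshold):
    """Flag candidates whose nearest training image is closer than the threshold."""
    d, idx = nearest_train_distance(candidates, train)
    return d < threshold, idx, d

# Synthetic stand-ins: random "images", not data from any real model.
rng = np.random.default_rng(0)
train = rng.standard_normal((50, 8, 8))
controls = rng.standard_normal((200, 8, 8))
candidates = np.stack([
    train[3] + 0.01 * rng.standard_normal((8, 8)),  # near-copy of training image 3
    train[7] + 0.01 * rng.standard_normal((8, 8)),  # near-copy of training image 7
    np.full((8, 8), 10.0),                          # far from every training image
])
threshold = calibrate_threshold(controls, train)
flags, idx, dist = flag_memorized(candidates, train, threshold)
print(flags, idx[:2])
```

Calibrating the threshold on controls gives the false-positive rate an explicit meaning, which is the piece the referee report identifies as missing.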
Original abstract
We introduce cyclic denoising -- repeated forward and reverse diffusion at controlled noise amplitudes -- as an extraction attack for image diffusion models. Inspired by random organization in disordered solids, cyclic denoising exposes regions of the learned distribution that are largely inaccessible to standard sampling. The dynamics drive samples toward attractors with a broad stability spectrum. The deepest attractors are ultrastable: they regenerate after near-total corruption and persist through thousands of noising-denoising cycles. Many of these attractors correspond to memorized training images, including stock photographs, brand watermarks, and web-crawl artifacts. The attack requires only sampler-level control, with no gradients, weight inspection, prompts, captions, or prior knowledge of the training data. Unlike generate-and-filter attacks, which rely on large-scale prompted generation and post-hoc similarity or membership-inference filtering, our main protocol is fully unconditioned. We demonstrate the phenomenon in Stable Diffusion v1.4 and in a pixel-space DDPM, showing consistent behavior across latent- and pixel-space diffusion models. Across noise amplitudes, we observe a yielding-like transition: low-amplitude cycling produces trivial absorbing fixed points or limit cycles, while larger amplitudes induce rearrangements, basin hopping, and long-lived trapping in structured memorized attractor basins. We also observe hierarchical partial absorption, prompt-stabilized basins, and cross-initial-condition universality of the recovered attractor set. Our results therefore show that cyclic denoising is both a physics-inspired probe of generative landscapes and a practical tool for memorization auditing, with implications for privacy, copyright compliance, and model fingerprinting.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces cyclic denoising—repeated forward and reverse diffusion at controlled noise amplitudes—as an extraction attack that drives diffusion models toward ultrastable attractors. These attractors are claimed to regenerate after near-total corruption, persist through thousands of cycles, and frequently correspond to memorized training images (stock photographs, watermarks, web artifacts). The method is fully unconditioned, requires only sampler control, and is demonstrated on Stable Diffusion v1.4 and a pixel-space DDPM, with observations of yielding-like transitions, hierarchical absorption, and cross-initial-condition universality.
Significance. If the claimed correspondence to memorized training images can be secured with explicit verification protocols, the work would provide a novel physics-inspired probe of generative landscapes and a practical, gradient-free tool for memorization auditing with implications for privacy and copyright compliance.
major comments (2)
- [Abstract] Abstract: the assertion that 'many of these attractors correspond to memorized training images, including stock photographs, brand watermarks, and web-crawl artifacts' is load-bearing for the central claim yet supplies no membership test, exact-match criterion, similarity threshold, or false-positive control (e.g., recovery rate on non-training images).
- [Demonstration sections] Demonstration sections (Stable Diffusion v1.4 and pixel-space DDPM experiments): no quantitative verification statistics, error bars, dataset sizes, or membership-inference results are reported to establish that recovered attractors are verifiably training-set images rather than de-novo model outputs.
minor comments (1)
- [Abstract] Abstract: the phrase 'yielding-like transition' invokes a physics analogy without a precise operational definition or reference to the disordered-solids literature.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below, indicating planned revisions where the manuscript can be strengthened without misrepresenting our results.
- Referee: [Abstract] Abstract: the assertion that 'many of these attractors correspond to memorized training images, including stock photographs, brand watermarks, and web-crawl artifacts' is load-bearing for the central claim yet supplies no membership test, exact-match criterion, similarity threshold, or false-positive control (e.g., recovery rate on non-training images).
  Authors: The current manuscript identifies memorized images through visual inspection of distinctive, recognizable features (e.g., brand watermarks and web artifacts) that standard sampling rarely produces. We agree this falls short of rigorous verification. We will revise the abstract to qualify the language and add a new subsection with quantitative metrics, including image similarity thresholds and controls on non-training images, for the experiments under our control.
  Revision: yes
- Referee: [Demonstration sections] Demonstration sections (Stable Diffusion v1.4 and pixel-space DDPM experiments): no quantitative verification statistics, error bars, dataset sizes, or membership-inference results are reported to establish that recovered attractors are verifiably training-set images rather than de-novo model outputs.
  Authors: We acknowledge the lack of quantitative statistics and error bars in the presented demonstrations. For the pixel-space DDPM (trained in-house), we will add dataset sizes, error bars, and basic verification statistics. For Stable Diffusion v1.4, we will explicitly discuss the limitations imposed by proprietary training data while retaining the qualitative observations.
  Revision: partial
- Full membership-inference testing for Stable Diffusion v1.4 cannot be performed because its training dataset is not publicly available.
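The error bars promised in the rebuttal could, for a match rate, take the form of a binomial confidence interval. The sketch below uses the Wilson score interval as one reasonable choice; the source does not say which statistic the authors intend, and the counts in the example are hypothetical placeholders, not results.

```python
import math

def wilson_interval(matches, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = matches / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical counts, for illustration only (not results from the paper).
low, high = wilson_interval(matches=50, total=100)
print(f"match rate 0.50, interval [{low:.3f}, {high:.3f}]")
```

The Wilson form stays inside [0, 1] and behaves sensibly when the match count is zero or small, which matters if only a handful of attractors are checked.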
Circularity Check
No circularity: empirical protocol with independent observations
The paper introduces cyclic denoising as a sampler-level protocol and reports direct empirical observations of attractor stability and correspondence to training data. No derivation chain, equations, or first-principles predictions are claimed. Results follow from applying the described forward-reverse cycling procedure to existing models without reduction to fitted parameters, self-definitions, or load-bearing self-citations. The method is presented as unconditioned and independent of training-data knowledge.
Forward citations
Cited by 1 Pith paper
- Flow Reasoning Models: Scaling Reasoning Through Iterative Self-Refinement
  Flow models reach 99.2% Sudoku accuracy in 7 passes and 96.1% on out-of-distribution Sudoku-Extreme by selecting dynamically stable candidates and training with self-conditioning plus DPO to avoid failed outputs.