arxiv: 2606.02453 · v1 · pith:JVYRCE6Snew · submitted 2026-06-01 · 💻 cs.CV · cs.AI

Initialization is Half the Battle: Generating Diverse Images from a Guidance Potential Posterior

Pith reviewed 2026-06-28 15:17 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords mode collapse · diversity enhancement · diffusion models · initialization · Langevin dynamics · flow matching · inference-time method · guidance potential

The pith

Selecting initial noise from a guidance potential posterior re-weights diffusion trajectories toward diversity-rich regions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that the standard Gaussian initial noise used by diffusion and flow-matching models is blind to the guidance landscape and therefore drives most trajectories into a few dominant modes. It reframes the initial noise as a draw from a guidance potential posterior that up-weights regions likely to produce varied outputs. Langevin dynamics is used to sample from this posterior efficiently while keeping samples on the data manifold. The resulting DivIn procedure runs at inference time and works with both diffusion and flow-matching models. Because it is orthogonal to methods that intervene later in the trajectory, the two families can be combined to improve the diversity-fidelity frontier.

Core claim

We formulate selecting the initial noise from a guidance potential posterior, which effectively re-weights the prior towards diversity-rich regions. To sample from this distribution efficiently, we introduce Diversity-inducing Initialization (DivIn), which leverages Langevin dynamics to actively navigate the initialization landscape, steering initial noise away from collapsing regions while anchoring them to the valid data manifold. Our method serves as an inference-time diversity enhancement compatible with both diffusion and flow matching models.

What carries the argument

The guidance potential posterior over initial noise, sampled by Langevin dynamics inside DivIn to steer away from mode-collapse basins.
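
The mechanism can be illustrated on a 2D toy problem. The sketch below is not the paper's implementation: it assumes the posterior has the form of a standard Gaussian prior multiplied by exp(-lam * U(x)), and it swaps in a hypothetical analytic potential `potential` (one sharp bump at the origin) for the model-derived guidance potential. The names `langevin_init`, `lam`, `step`, and `n_steps` are illustrative.

```python
import numpy as np

def potential(x, scale=0.5):
    """Hypothetical guidance potential: a single sharp bump at the origin."""
    return np.exp(-np.sum(x**2, axis=-1) / (2.0 * scale**2))

def grad_potential(x, scale=0.5):
    """Analytic gradient of the toy potential above."""
    return -(x / scale**2) * potential(x, scale)[..., None]

def langevin_init(x0, lam=4.0, step=0.01, n_steps=200, seed=0):
    """Unadjusted Langevin sampling of p(x) ∝ N(x; 0, I) * exp(-lam * U(x))."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(n_steps):
        # Score of the assumed posterior: the Gaussian prior term (-x)
        # anchors the sample, the potential term pushes it off the bump.
        score = -x - lam * grad_potential(x)
        noise = rng.standard_normal(x.shape)
        x = x + step * score + np.sqrt(2.0 * step) * noise
    return x

# Start from ordinary Gaussian noise, then refine it with Langevin steps.
x_prior = np.random.default_rng(1).standard_normal((2000, 2))
x_post = langevin_init(x_prior)

mean_u_prior = potential(x_prior).mean()
mean_u_post = potential(x_post).mean()
mean_sq_norm_post = (x_post**2).sum(axis=-1).mean()
print(mean_u_prior, mean_u_post, mean_sq_norm_post)
```

In this toy setting the refined samples sit at lower average potential than the raw Gaussian draws, while their mean squared norm stays close to the value of 2 expected for a 2D standard Gaussian. The injected noise is what separates this from plain gradient descent on the potential.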

If this is right

  • DivIn raises diversity scores in both class-conditional and text-conditional image generation while preserving sample quality.
  • The method applies unchanged to both diffusion and flow-matching generators.
  • Because DivIn acts only at initialization, it can be stacked with any trajectory-level diversity technique to expand the achievable diversity-quality frontier.
  • The re-weighting effect is achieved without retraining the underlying generative model.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the guidance potential can be estimated from a small number of forward passes, DivIn could be made cheaper for very large models.
  • The same initialization logic might apply to other iterative generative processes whose early steps determine later mode selection.
  • Measuring the correlation between guidance potential values and empirical mode frequencies would give a direct diagnostic for when the method is most needed.
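
The diagnostic in the last bullet could be prototyped along these lines. This is a sketch on synthetic numbers only: `potential_vals` and `diversity_vals` are made-up stand-ins for per-prompt average potential and a per-prompt diversity score, and the helper names `spearman` and `binned_mean` are hypothetical.

```python
import numpy as np

def spearman(a, b):
    """Rank correlation; assumes no ties, which holds for continuous data."""
    rank_a = np.argsort(np.argsort(a)).astype(float)
    rank_b = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

def binned_mean(x, y, n_bins=5):
    """Mean of y inside equal-count bins of x (a binned trend line)."""
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    return np.array([y[idx == b].mean() for b in range(n_bins)])

# Synthetic stand-in data: diversity falls as potential rises, plus noise.
rng = np.random.default_rng(0)
potential_vals = rng.uniform(0.0, 1.0, size=1000)
diversity_vals = 5.0 - 3.0 * potential_vals + 0.3 * rng.standard_normal(1000)

rho = spearman(potential_vals, diversity_vals)
trend = binned_mean(potential_vals, diversity_vals)
print(rho, trend)
```

A strongly negative `rho` together with a monotonically falling `trend` would indicate that high-potential initializations are the ones collapsing, which is the situation where re-weighting the initial noise should matter most.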

Load-bearing premise

Standard Gaussian initialization is agnostic to the guidance potential landscape and therefore drives trajectories into a small number of dominant modes.

What would settle it

Generate a large set of images with DivIn and with the baseline Gaussian initialization on the same model and prompts; if the number of distinct modes or the recall metric shows no reliable increase while fidelity metrics stay comparable, the central claim is falsified.
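
One way to score the diversity side of such a comparison is the Vendi score, which the figure captions on this page also mention. The sketch below assumes a cosine-similarity kernel over feature vectors and uses tiny hand-made feature matrices in place of real image embeddings; `vendi_score` is an illustrative helper, not code from the paper.

```python
import numpy as np

def vendi_score(features):
    """Exponential of the Shannon entropy of the eigenvalues of K / n,
    where K is the cosine-similarity kernel of the feature rows."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    kernel = x @ x.T
    eigvals = np.linalg.eigvalsh(kernel / len(x))
    eigvals = eigvals[eigvals > 1e-12]  # drop numerically zero eigenvalues
    return float(np.exp(-np.sum(eigvals * np.log(eigvals))))

# Toy stand-ins for embeddings of four generated images.
collapsed = np.ones((4, 3))  # four identical samples
spread = np.eye(4)           # four mutually orthogonal samples

score_collapsed = vendi_score(collapsed)
score_spread = vendi_score(spread)
print(score_collapsed, score_spread)
```

The score behaves like an effective number of distinct samples: it is about 1 for the collapsed batch and about 4 for the spread batch. In a real test the same function would be applied per prompt to embeddings of the baseline batch and of the DivIn batch.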

Figures

Figures reproduced from arXiv: 2606.02453 by Dianbo Liu, Kenji Kawaguchi, Xiang Li.

Figure 1
Figure 1: Standard prior (blue) often lands in sharp, high potential hills that attract trajectories into a single dominant mode (mode collapse). Without changing the generation process, our proposed DivIn (green) leverages Langevin dynamics to sample from flat, low-potential basins as a diversity posterior, while staying anchored to the valid Gaussian distribution to enhance the generation diversity-quality trade-…
Figure 2
Figure 2: Qualitative comparison on various prompts and different architectures. By sampling initial noise from regions with low guidance potential, DivIn (bottom row) discovers more modes and generates diverse images. […] verge, leading to the discovery of diverse modes. However, we observe that deterministic seed optimization methods are inherently mode-seeking and prior-breaking, and thus inevitably collapse the init…
Figure 3
Figure 3: Comparison of mode discovery on a 2D toy distribution with 9 modes. (b) Standard Gaussian initialization (black “x”) concentrates samples in the high-potential region (dark red) driven by the central dominant mode, leading to (d) mode collapse where only 5 modes are recovered. (c) Our proposed initialization (x*_T, green “o”) disperses samples across the low-potential landscape, successfully recovering (…
Figure 4
Figure 4: We analyze the initialization potential against generative diversity. We sample 10 independent initial noises for each of the 1000 distinct prompts and calculate the Vendi score on a kernel of 10 and average potential across the 10 latents for each prompt. The binned mean (red line) reveals a strong negative correlation that initial noises with high guidance potential lead to mode collapse, while those wit…
Figure 5
Figure 5: (a) Deterministic optimization (SAIL). (b) Stochastic Langevin dynamics (Ours). […] while SAIL’s deterministic updates drive all latents to collapse into the sharp local minimum, DivIn successfully disperses samples across the low-potential basin.
Figure 6
Figure 6: Diversity-fidelity Pareto frontiers. The plots illustrate how the hyperparameters of different methods affect the trade-off between image quality (x-axis) and diversity (y-axis) on ImageNet (left) and general prompts (right). Baselines with DivIn integrated (solid lines) consistently expand the Pareto frontiers of the baselines alone (dashed lines). Full results in Appendix D.
Figure 7
Figure 7: Visualizing the orthogonality of DivIn to baseline models. We compare standard sampling (Base Model), trajectory-based methods (PG, CADS, IG), and their combinations with DivIn on the prompt “Van Gogh painting”. While baselines tend to collapse to a single dominant copy (first column), plugging in DivIn (right four columns) restores diversity across all methods, introducing distinct variations.
Figure 8
Figure 8: We track the guidance potential/sharpness (orange, lower is better) and latent Gaussianity (green, mean latent norm ||x_T||_2) over 10 iteration steps. (a) DivIn successfully minimizes potential while maintaining Gaussianity. (b) SAIL optimizes sharpness but drifts the latent norm off-manifold, leading to artifacts. (c) Similar to SAIL, without noise injection, the latent collapses. (d) Without the prior t…
Figure 11
Figure 11: In addition to […]
Figure 12
Figure 12: Diversity and prompt alignment for prompts of varying length. We split the general prompts into 10 categories based on the number of characters. We find that DivIn achieves a consistently higher diversity than the base model for both short and long prompts, while keeping the same CLIP score as the base model. An example short prompt is “A whimsical candy shop”. An example long prompt is “cinema 4d colorfu…
Figure 13
Figure 13: Impact of Classifier-Free Guidance (CFG) weight. We evaluate diversity (Recall, Vendi, Coverage) and quality (Precision, Density, FID, FDDINOv2) metrics on ImageNet using SD v1.4 while varying the guidance scale. DivIn (orange line) consistently outperforms the Base Model (blue line) in diversity metrics across all guidance weights, effectively increasing diversity across varying guidance scales.
Figure 14
Figure 14: Images generated with Stable Diffusion v3.5 Medium with and without DivIn for general prompt “The Eiffel Tower”. Each row is a batch of four images with the same seed. (a) Base model without DivIn (b) Base model + DivIn
Figure 15
Figure 15: Images generated with Stable Diffusion v3.5 Medium with and without DivIn for general prompt “A hidden cave with glowing crystals”. Each row is a batch of four images with the same seed.
Figure 16
Figure 16: Images generated with Stable Diffusion v3.5 Medium with and without DivIn for general prompt “A traditional Mongolian yurt in the steppe”. Each row is a batch of four images with the same seed. (a) Base model without DivIn (b) Base model + DivIn
Figure 17
Figure 17: Images generated with Stable Diffusion v3.5 Medium with and without DivIn for general prompt “portrait of Sporty Spice as a Poison Ivy. intricate artwork. by wlop, octane render, trending on artstation, very coherent symmetrical artwork. cinematic, hyper realism, high detail, octane render, 8k”. Each row is a batch of four images with the same seed.
Figure 18
Figure 18: Images generated with Stable Diffusion v1.4 with and without DivIn for general prompt “a photo of a greenhouse”. Each row is a batch of five images with the same seed. (a) Base model without DivIn (b) Base model + DivIn
Figure 19
Figure 19: Images generated with Stable Diffusion v1.4 with and without DivIn for general prompt “a photo of a peacock”. Each row is a batch of five images with the same seed.
Figure 20
Figure 20: Images generated with Stable Diffusion v1.4 with and without DivIn for general prompt “a photo of a boxer”. Each row is a batch of five images with the same seed. (a) Base model without DivIn (b) Base model + DivIn
Figure 21
Figure 21: Images generated with Stable Diffusion v1.4 with and without DivIn for general prompt “VAN GOGH CAFE TERASSE copy.jpg”. Each row is a batch of five images with the same seed.
Original abstract

Despite the remarkable fidelity of generative models, they frequently suffer from mode collapse. Existing strategies for enhancing diversity predominantly focus on intervening during the generation trajectory. We identify a critical oversight that the standard Gaussian initialization often causes trajectories to collapse into dominant modes because it is agnostic to the guidance potential landscape. In this work, we formulate selecting the initial noise from a guidance potential posterior, which effectively re-weights the prior towards diversity-rich regions. To sample from this distribution efficiently, we introduce Diversity-inducing Initialization (DivIn), which leverages Langevin dynamics to actively navigate the initialization landscape, steering initial noise away from collapsing regions while anchoring them to the valid data manifold. Our method serves as an inference-time diversity enhancement compatible with both diffusion and flow matching models. Extensive experiments show that DivIn exhibits a superior performance in both class-to-image and text-to-image scenarios. Furthermore, we highlight that as DivIn is orthogonal to trajectory-based methods, combining them significantly expands the diversity-quality Pareto frontier beyond what either achieves in isolation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper claims that standard Gaussian initialization in diffusion and flow matching models leads to mode collapse because it ignores the guidance potential landscape. It proposes sampling initial noise from a guidance potential posterior via Langevin dynamics in the DivIn method to re-weight the prior toward diversity-rich regions while remaining on the data manifold. The approach is presented as an inference-time technique orthogonal to trajectory-based diversity methods, compatible with both diffusion and flow matching, and supported by experiments showing superior performance in class-to-image and text-to-image generation, with combinations expanding the diversity-quality Pareto frontier.

Significance. If the empirical results hold, this provides a practical, orthogonal inference-time lever for diversity that can be combined with existing trajectory interventions. The compatibility with both diffusion and flow matching frameworks and the explicit focus on initialization as a source of mode collapse are strengths. The use of Langevin dynamics for navigating the initialization landscape is a coherent application of established sampling tools.

minor comments (2)
  1. [Abstract] The high-level description of the guidance potential posterior and DivIn includes no equation or pseudocode sketch; even a brief one would help readers immediately grasp the re-weighting mechanism.
  2. The manuscript would benefit from an explicit statement of the computational cost of the Langevin sampling step relative to standard initialization, as this is relevant for practical adoption.
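
For concreteness, here is one form the brief equation requested in comment 1 could take. It is an assumption inferred from the abstract's wording (a Gaussian prior re-weighted by a guidance potential and sampled with Langevin dynamics), not a formula taken from the paper; the symbols U_c, λ, η and the step index k are illustrative.

```latex
p(x_T \mid c) \;\propto\; \mathcal{N}(x_T;\, 0, I)\,\exp\!\bigl(-\lambda\, U_c(x_T)\bigr),
\qquad
x_T^{(k+1)} = x_T^{(k)} + \eta\,\bigl(-x_T^{(k)} - \lambda\,\nabla U_c(x_T^{(k)})\bigr) + \sqrt{2\eta}\,\xi^{(k)},
\quad \xi^{(k)} \sim \mathcal{N}(0, I).
```

Under this assumed form, each Langevin step requires one gradient of U_c, so K steps add roughly K gradient evaluations before generation begins. That overhead is the quantity comment 2 asks the authors to report.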

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their thoughtful summary and positive assessment of our work, including recognition of DivIn as an orthogonal inference-time approach compatible with both diffusion and flow matching. We appreciate the recommendation for minor revision.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper presents a methodological proposal for re-weighting initial noise via a guidance potential posterior sampled by Langevin dynamics (DivIn). No equations, fitted parameters, or derivations are visible in the provided text that reduce the claimed diversity improvement to a self-definition, a renamed input, or a self-citation chain. The central claim is framed as an empirical inference-time technique whose validity rests on external experiments rather than tautological reduction to its own inputs. The derivation chain is therefore self-contained against the listed circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities beyond the high-level concepts named; ledger is therefore empty pending full text.

pith-pipeline@v0.9.1-grok · 5704 in / 1106 out tokens · 22718 ms · 2026-06-28T15:17:16.909733+00:00 · methodology

