CoFi-UCGen: Coarse-to-Fine Unsupervised Conditional Generation without Label Priors
Pith reviewed 2026-06-28 02:20 UTC · model grok-4.3
The pith
CoFi-UCGen disentangles global and fine-grained semantics to enable label-free conditional image generation at both coarse and fine levels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that an adversarial semantic reciprocal learning theory can ensure semantic consistency and completeness between images and latent spaces, permitting bit-codes to encode distinct global semantics independently of noise sampling, which in turn supports building a fine-grained semantic basis and hierarchical modulation in diffusion models for layer-wise control of fine attributes.
What carries the argument
Bit-codes for structured coarse-grained latent space combined with hierarchical modulation mechanism in diffusion models.
If this is right
- Achieves both coarse- and fine-grained conditional generation without labels.
- Consistently outperforms prior UCGen methods on image quality, semantic consistency, and control accuracy.
- Maintains independent noise sampling while capturing global semantics.
- Enables layer-wise injection from coarse conditions to fine attributes in diffusion models.
- Works without pre-trained feature extractors or label priors.
Where Pith is reading between the lines
- The method might generalize to video or 3D generation tasks where semantic hierarchies are important.
- It suggests that explicit decomposition of semantics can improve controllability even in unsupervised settings.
- Applications could include more accessible creative tools that do not require annotated training data.
- Further work could explore the scalability of bit-codes to higher numbers of semantic classes.
Load-bearing premise
The adversarial semantic reciprocal learning theory ensures semantic consistency and completeness between images and latent spaces so that bit-codes can capture distinct global semantics independently of noise.
What would settle it
Running the generation process with fixed noise but varied bit-codes and checking if the resulting images exhibit clearly distinct global semantic attributes without any label supervision during training.
Figures
read the original abstract
Unsupervised conditional image generation (UCGen) aims to control generation without relying on manually annotated labels, yet remains challenging due to unstructured semantic representations across granularities. To address this, we propose a novel coarse-to-fine UCGen framework (CoFi-UCGen) that explicitly disentangles global semantics from fine-grained variations, which to the best of our knowledge, sets out the first successful attempt for both coarse- and fine-grained conditional generation without any labels. More specifically, we first propose the adversarial semantic reciprocal learning theory to ensure the semantic consistency and completeness between images and latent spaces. Based on the consistency, we propose the bit-codes to learn a structured coarse-grained latent space, and further prove distinct global semantics inherent from our bit-codes while preserving independent noise sampling for generation. Building upon these bit-codes, we establish a fine-grained semantic basis and introduce a hierarchical modulation mechanism in diffusion models, by enabling layer-wise injection from coarse conditions to progressively control fine-grained attributes during generation. Extensive experiments demonstrate that without any label priors or pre-trained feature extractors, our CoFi-UCGen consistently outperforms existing UCGen methods in terms of image quality, semantic consistency, and control accuracy, verifying the effectiveness of explicit coarse-to-fine semantic decomposition for the challenging UCGen task.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CoFi-UCGen, a coarse-to-fine framework for unsupervised conditional image generation (UCGen) that disentangles global semantics from fine-grained variations without any labels or pre-trained extractors. It introduces an adversarial semantic reciprocal learning theory to enforce consistency and completeness between images and latent spaces, derives bit-codes to structure a coarse-grained latent space (claimed to prove distinct global semantics while allowing independent noise sampling), and uses these to drive a hierarchical modulation mechanism inside diffusion models for progressive fine-grained control. Experiments are said to show consistent outperformance over prior UCGen methods on image quality, semantic consistency, and control accuracy, positioning the work as the first successful attempt at both coarse- and fine-grained label-free conditional generation.
Significance. If the central construction and empirical claims hold, the result would be significant: it would supply the first explicit mechanism for multi-granularity control in UCGen without label priors, potentially enabling more structured generation in annotation-scarce domains. The combination of a reciprocal-learning theory, bit-code discretization, and diffusion-based hierarchical modulation is a coherent architectural response to the unstructured semantics problem highlighted in the abstract.
major comments (2)
- [Abstract] Abstract: the claim that bit-codes 'further prove distinct global semantics inherent from our bit-codes while preserving independent noise sampling' is load-bearing for the coarse-control contribution, yet the provided text supplies no derivation, theorem, or bound (e.g., mutual-information lower bound, injectivity argument, or separation guarantee) showing that the bit-code space must factor into distinct globals rather than merely correlating with some unsupervised factors. The 'proof' therefore reduces to the empirical behavior of the proposed losses, which is insufficient to support the stated guarantee.
- [Abstract] Abstract: the manuscript asserts both 'proofs' and consistent outperformance, but the abstract contains no equations, loss definitions, or experimental protocol details; without these, the central claims cannot be assessed for internal consistency or reproducibility from the given material.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback on the abstract. We respond to each major comment below.
read point-by-point responses
-
Referee: [Abstract] the claim that bit-codes 'further prove distinct global semantics inherent from our bit-codes while preserving independent noise sampling' is load-bearing, yet the provided text supplies no derivation, theorem, or bound showing that the bit-code space must factor into distinct globals. The 'proof' therefore reduces to the empirical behavior of the proposed losses.
Authors: We agree the abstract itself contains no derivation, as is conventional due to length constraints. The full theoretical argument—including the mutual-information lower bound, injectivity argument, and separation guarantee derived from the adversarial semantic reciprocal learning theory—is given in Section 3.2 of the manuscript. We will revise the abstract to explicitly reference this section so readers know the guarantee is not merely empirical. revision: yes
-
Referee: [Abstract] the manuscript asserts both 'proofs' and consistent outperformance, but the abstract contains no equations, loss definitions, or experimental protocol details; without these, the central claims cannot be assessed for internal consistency or reproducibility from the given material.
Authors: Abstracts are intentionally free of equations and protocol details to remain concise and accessible; all loss definitions, theoretical derivations, and experimental protocols appear in Sections 3 and 4. The outperformance claims are substantiated by the quantitative results and ablation studies reported in the main text. No change to the abstract format is required. revision: no
Circularity Check
No significant circularity; derivation presented as novel proposal without reduction to inputs by construction
full rationale
The provided abstract and description introduce a new 'adversarial semantic reciprocal learning theory' as a proposal, followed by bit-codes derived from it and a claim to 'further prove' properties. No equations, self-citations, fitted parameters renamed as predictions, or uniqueness theorems from prior author work are quoted that would reduce the central claims to the inputs by definition. The derivation chain is therefore self-contained as a sequence of novel constructions rather than tautological or load-bearing self-references.
Axiom & Free-Parameter Ledger
invented entities (1)
-
bit-codes
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Auto-Encoding Variational Bayes
Diederik P Kingma and Max Welling, “Auto-encoding variational bayes,”arXiv preprint arXiv:1312.6114, 2013
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[2]
Generative adversarial nets,
Ian J Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, “Generative adversarial nets,” inAdvances in Neural Inf. Process. Syst., 2014, vol. 27
2014
-
[3]
High-resolution image synthesis with latent diffu- sion models,
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj ¨orn Ommer, “High-resolution image synthesis with latent diffu- sion models,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2022, pp. 10684–10695
2022
-
[4]
Ganspace: Discovering interpretable gan controls,
Erik H ¨ark¨onen, Aaron Hertzmann, Jaakko Lehtinen, and Sylvain Paris, “Ganspace: Discovering interpretable gan controls,” inAdvances in Neural Inf. Process. Syst., 2020, vol. 33, pp. 9841–9850
2020
-
[5]
Diffusion models already have a semantic latent space,
Mingi Kwon, Jaeseok Jeong, and Youngjung Uh, “Diffusion models already have a semantic latent space,” inProc. Int. Conf. Learn. Representations, 2023
2023
-
[6]
Multi-label condi- tional generation from pre-trained models,
Magdalena Proszewska, Maciej Wołczyk, Maciej Zieba, Patryk Wielopolski, Łukasz Maziarka, and Marek ´Smieja, “Multi-label condi- tional generation from pre-trained models,”IEEE Trans. Pattern Anal. Mach. Intell., vol. 46, no. 9, pp. 6185–6198, 2024
2024
-
[7]
Conditional Generative Adversarial Nets
Mehdi Mirza and Simon Osindero, “Conditional generative adversarial nets,”arXiv preprint arXiv:1411.1784, 2014
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[8]
Conditional image synthesis with auxiliary classifier gans,
Augustus Odena, Christopher Olah, and Jonathon Shlens, “Conditional image synthesis with auxiliary classifier gans,” inProc. Int. Conf. Mach. Learn.PMLR, 2017, pp. 2642–2651
2017
-
[9]
Diffusion models beat gans on image synthesis,
Prafulla Dhariwal and Alexander Nichol, “Diffusion models beat gans on image synthesis,” inAdvances in Neural Inf. Process. Syst., 2021, vol. 34, pp. 8780–8794
2021
-
[10]
Neural characteristic function learning for conditional image generation,
Shengxi Li, Jialu Zhang, Yifei Li, Mai Xu, Xin Deng, and Li Li, “Neural characteristic function learning for conditional image generation,” in Proc. IEEE Int. Conf. Comp. Vis., 2023, pp. 7204–7214
2023
-
[11]
Generative attribute controller with conditional filtered generative adversarial net- works,
Takuhiro Kaneko, Kaoru Hiramatsu, and Kunio Kashino, “Generative attribute controller with conditional filtered generative adversarial net- works,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2017, pp. 6089– 6098
2017
-
[12]
Attribute-guided face generation using conditional cyclegan,
Yongyi Lu, Yu-Wing Tai, and Chi-Keung Tang, “Attribute-guided face generation using conditional cyclegan,” inProc. Eur. Conf. Comp. Vis., 2018, pp. 282–297
2018
-
[13]
Conditional gans with auxiliary discriminative classifier,
Liang Hou, Qi Cao, Huawei Shen, Siyuan Pan, Xiaoshuang Li, and Xueqi Cheng, “Conditional gans with auxiliary discriminative classifier,” inProc. Int. Conf. Mach. Learn.PMLR, 2022, pp. 8888–8902
2022
-
[14]
TAC-GAN - Text Conditioned Auxiliary Classifier Generative Adversarial Network
Ayushman Dash, John Cristian Borges Gamboa, Sheraz Ahmed, Marcus Liwicki, and Muhammad Zeshan Afzal, “Tac-gan-text conditioned auxiliary classifier generative adversarial network,”arXiv preprint arXiv:1703.06412, 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[15]
Text to image generation with semantic-spatial aware gan,
Wentong Liao, Kai Hu, Michael Ying Yang, and Bodo Rosenhahn, “Text to image generation with semantic-spatial aware gan,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2022, pp. 18187–18196
2022
-
[16]
Motiondiffuse: Text-driven human motion generation with diffusion model,
Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, and Ziwei Liu, “Motiondiffuse: Text-driven human motion generation with diffusion model,”IEEE Trans. Pattern Anal. Mach. Intell., vol. 46, no. 6, pp. 4115–4128, 2024
2024
-
[17]
Drb-gan: A dynamic resblock generative adversarial network for artistic style transfer,
Wenju Xu, Chengjiang Long, Ruisheng Wang, and Guanghui Wang, “Drb-gan: A dynamic resblock generative adversarial network for artistic style transfer,” inProc. IEEE Int. Conf. Comp. Vis., 2021, pp. 6383– 6392
2021
-
[18]
Stylediffusion: Controllable disentangled style transfer via diffusion models,
Zhizhong Wang, Lei Zhao, and Wei Xing, “Stylediffusion: Controllable disentangled style transfer via diffusion models,” inProc. IEEE Int. Conf. Comp. Vis., 2023, pp. 7677–7689
2023
-
[19]
Sketchygan: Towards diverse and realistic sketch to image synthesis,
Wengling Chen and James Hays, “Sketchygan: Towards diverse and realistic sketch to image synthesis,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2018, pp. 9416–9425
2018
-
[20]
Image generation from sketch constraint using contextual gan,
Yongyi Lu, Shangzhe Wu, Yu-Wing Tai, and Chi-Keung Tang, “Image generation from sketch constraint using contextual gan,” inProc. Eur. Conf. Comp. Vis., 2018, pp. 205–220
2018
-
[21]
Conditional image generation with pixelcnn decoders,
Aaron Van den Oord, Nal Kalchbrenner, Lasse Espeholt, Oriol Vinyals, Alex Graves, et al., “Conditional image generation with pixelcnn decoders,” inAdvances in Neural Inf. Process. Syst., 2016, vol. 29
2016
-
[23]
At- tribute2image: Conditional image generation from visual attributes,
Xinchen Yan, Jimei Yang, Kihyuk Sohn, and Honglak Lee, “At- tribute2image: Conditional image generation from visual attributes,” in Proc. Eur. Conf. Comp. Vis.Springer, 2016, pp. 776–791
2016
-
[24]
A simple framework for contrastive learning of visual representations,
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton, “A simple framework for contrastive learning of visual representations,” inProc. Int. Conf. Mach. Learn.PmLR, 2020, pp. 1597–1607
2020
-
[25]
Emerging properties in self-supervised vision transformers,
Mathilde Caron, Hugo Touvron, Ishan Misra, Herv ´e J ´egou, Julien Mairal, Piotr Bojanowski, and Armand Joulin, “Emerging properties in self-supervised vision transformers,” inProc. IEEE Int. Conf. Comp. Vis., 2021, pp. 9650–9660
2021
-
[26]
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab, Timoth ´ee Darcet, Th ´eo Moutakanni, Huy V o, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al., “Dinov2: Learning robust visual features without supervision,”arXiv preprint arXiv:2304.07193, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[27]
Warpedganspace: Finding non-linear rbf paths in gan latent space,
Christos Tzelepis, Georgios Tzimiropoulos, and Ioannis Patras, “Warpedganspace: Finding non-linear rbf paths in gan latent space,” inProc. IEEE Int. Conf. Comp. Vis., 2021, pp. 6393–6402
2021
-
[28]
Clustergan: Latent space clustering in generative adversarial networks,
Sudipto Mukherjee, Himanshu Asnani, Eugene Lin, and Sreeram Kan- nan, “Clustergan: Latent space clustering in generative adversarial networks,” inProc. Conf. AAAI, 2019, pp. 4610–4617
2019
-
[29]
Diverse Image Generation via Self-Conditioned GANs,
Steven Liu, Tongzhou Wang, David Bau, Jun-Yan Zhu, and Antonio Torralba, “Diverse Image Generation via Self-Conditioned GANs,” in Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2020, pp. 14286–14295
2020
-
[30]
Matan Ben-Yosef and Daphna Weinshall, “Gaussian mixture generative adversarial networks for diverse datasets, and the unsupervised clustering of images,”arXiv preprint arXiv:1808.10356, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[31]
Unsuper- vised image generation with infinite generative adversarial networks,
Hui Ying, He Wang, Tianjia Shao, Yin Yang, and Kun Zhou, “Unsuper- vised image generation with infinite generative adversarial networks,” inProc. IEEE Int. Conf. Comp. Vis., 2021, pp. 14284–14293
2021
-
[32]
A style-based generator architecture for generative adversarial networks,
Tero Karras, Samuli Laine, and Timo Aila, “A style-based generator architecture for generative adversarial networks,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2019, pp. 4401–4410
2019
-
[33]
Gan dissection: Visualizing and understanding generative adversarial networks,
David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B Tenenbaum, William T Freeman, and Antonio Torralba, “Gan dissection: Visualizing and understanding generative adversarial networks,” inProc. Int. Conf. Learn. Representations, 2019
2019
-
[34]
Label-efficient semantic segmentation with diffusion models,
Dmitry Baranchuk, Andrey V oynov, Ivan Rubachev, Valentin Khrulkov, and Artem Babenko, “Label-efficient semantic segmentation with diffusion models,” inProc. Int. Conf. Learn. Representations, 2022
2022
-
[35]
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets,
Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel, “InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets,” inAdvances in Neural Inf. Process. Syst., 2016
2016
-
[36]
Challenging common assumptions in the unsupervised learning of disentangled representations,
Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Raetsch, Syl- vain Gelly, Bernhard Sch ¨olkopf, and Olivier Bachem, “Challenging common assumptions in the unsupervised learning of disentangled representations,” inProc. Int. Conf. Mach. Learn.PMLR, 2019, pp. 4114–4124
2019
-
[37]
Posterior collapse and latent variable non-identifiability,
Yixin Wang, David Blei, and John P Cunningham, “Posterior collapse and latent variable non-identifiability,”Advances in Neural Inf. Process. Syst., vol. 34, pp. 5443–5455, 2021
2021
-
[38]
Self-guided diffusion models,
Vincent Tao Hu, David W Zhang, Yuki M Asano, Gertjan J Burghouts, and Cees GM Snoek, “Self-guided diffusion models,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2023, pp. 18413–18422
2023
-
[39]
Instance-conditioned gan,
Arantxa Casanova, Marlene Careil, Jakob Verbeek, Michal Drozdzal, and Adriana Romero Soriano, “Instance-conditioned gan,” inAdvances in Neural Inf. Process. Syst., 2021. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 17
2021
-
[40]
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu, Sangkyung Kwak, Huiwon Jang, Jongheon Jeong, Jonathan Huang, Jinwoo Shin, and Saining Xie, “Representation alignment for generation: Training diffusion transformers is easier than you think,” arXiv preprint arXiv:2410.06940, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[41]
Closed-form factorization of latent semantics in gans,
Yujun Shen and Bolei Zhou, “Closed-form factorization of latent semantics in gans,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2021, pp. 1532–1540
2021
-
[42]
Neural discrete representa- tion learning,
Aaron Van Den Oord, Oriol Vinyals, et al., “Neural discrete representa- tion learning,” inAdvances in Neural Inf. Process. Syst., 2017, vol. 30
2017
-
[43]
Taming transformers for high-resolution image synthesis,
Patrick Esser, Robin Rombach, and Bjorn Ommer, “Taming transformers for high-resolution image synthesis,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2021, pp. 12873–12883
2021
-
[44]
Towards principled methods for training generative adversarial networks,
Martin Arjovsky and Leon Bottou, “Towards principled methods for training generative adversarial networks,” inProc. Int. Conf. Learn. Representations, 2017
2017
-
[45]
arXiv preprint arXiv:1606.05908 , year=
Carl Doersch, “Tutorial on variational autoencoders,”arXiv preprint arXiv:1606.05908, 2016
-
[46]
Gan inversion: A survey,
Weihao Xia, Yulun Zhang, Yujiu Yang, Jing-Hao Xue, Bolei Zhou, and Ming-Hsuan Yang, “Gan inversion: A survey,”IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 3, pp. 3121–3138, 2022
2022
-
[47]
Pivotal tuning for latent-based editing of real images,
Daniel Roich, Ron Mokady, Amit H Bermano, and Daniel Cohen- Or, “Pivotal tuning for latent-based editing of real images,”ACM Transactions on graphics (TOG), vol. 42, no. 1, pp. 1–13, 2022
2022
-
[48]
Null-text inversion for editing real images using guided diffusion models,
Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, and Daniel Cohen- Or, “Null-text inversion for editing real images using guided diffusion models,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn., 2023, pp. 6038– 6047
2023
-
[49]
Reciprocal gan through characteristic functions (rcf-gan),
Shengxi Li, Zeyang Yu, Min Xiang, and Danilo Mandic, “Reciprocal gan through characteristic functions (rcf-gan),”IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 2, pp. 2246–2263, 2022
2022
-
[50]
Contragan: Contrastive learning for conditional image generation,
Minguk Kang and Jaesik Park, “Contragan: Contrastive learning for conditional image generation,” inAdvances in Neural Inf. Process. Syst., 2020, vol. 33, pp. 21357–21369
2020
-
[51]
Training gans with stronger augmen- tations via contrastive discriminator,
Jongheon Jeong and Jinwoo Shin, “Training gans with stronger augmen- tations via contrastive discriminator,”arXiv preprint arXiv:2103.09742, 2021
-
[52]
3d object representations for fine-grained categorization,
Jonathan Krause, Michael Stark, Jia Deng, and Li Fei-Fei, “3d object representations for fine-grained categorization,” inProceedings of the IEEE international conference on computer vision workshops, 2013, pp. 554–561
2013
-
[53]
Age progression/regression by conditional adversarial autoencoder,
Zhifei Zhang, Yang Song, and Hairong Qi, “Age progression/regression by conditional adversarial autoencoder,” inProc. IEEE Conf. Comp. Vis. Patt. Recogn.IEEE, 2017
2017
-
[54]
The caltech-ucsd birds-200-2011 dataset,
Catherine Wah, Steve Branson, Peter Welinder, Pietro Perona, and Serge Belongie, “The caltech-ucsd birds-200-2011 dataset,” 2011
2011
-
[55]
Automated flower classification over a large number of classes,
M-E. Nilsback and A. Zisserman, “Automated flower classification over a large number of classes,” inProceedings of the Indian Conference on Computer Vision, Graphics and Image Processing, Dec 2008
2008
-
[56]
Fine-grained image analysis with deep learning: A survey,
Xiu-Shen Wei, Yi-Zhe Song, Oisin Mac Aodha, Jianxin Wu, Yuxin Peng, Jinhui Tang, Jian Yang, and Serge Belongie, “Fine-grained image analysis with deep learning: A survey,”IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 12, pp. 8927–8948, 2021
2021
-
[57]
Bilinear cnn models for fine-grained visual recognition,
Tsung-Yu Lin, Aruni RoyChowdhury, and Subhransu Maji, “Bilinear cnn models for fine-grained visual recognition,” inProc. IEEE Int. Conf. Comp. Vis., 2015, pp. 1449–1457
2015
-
[58]
Gans trained by a two time-scale update rule converge to a local nash equilibrium,
Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter, “Gans trained by a two time-scale update rule converge to a local nash equilibrium,” inAdvances in Neural Inf. Process. Syst., 2017, vol. 30
2017
-
[59]
Improved techniques for training gans,
Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen, “Improved techniques for training gans,” in Advances in Neural Inf. Process. Syst., 2016, vol. 29
2016
-
[60]
39, Cambridge University Press Cambridge, 2008
Hinrich Sch ¨utze, Christopher D Manning, and Prabhakar Raghavan, Introduction to information retrieval, vol. 39, Cambridge University Press Cambridge, 2008
2008
-
[61]
Cluster ensembles—a knowledge reuse framework for combining multiple partitions,
Alexander Strehl and Joydeep Ghosh, “Cluster ensembles—a knowledge reuse framework for combining multiple partitions,”Journal of machine learning research, vol. 3, no. Dec, pp. 583–617, 2002
2002
-
[62]
Improved precision and recall metric for assessing generative models,
Tuomas Kynk ¨a¨anniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, and Timo Aila, “Improved precision and recall metric for assessing generative models,” inAdvances in Neural Inf. Process. Syst., 2019, vol. 32
2019
-
[63]
Reliable fidelity and diversity metrics for generative models,
Muhammad Ferjad Naeem, Seong Joon Oh, Yunjey Young, Sungjin Baek, and Donghyeon Kim, “Reliable fidelity and diversity metrics for generative models,” inProc. Int. Conf. Mach. Learn.PMLR, 2020, pp. 7176–7185
2020
-
[64]
Masked autoencoders are scalable vision learners,
Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Doll ´ar, and Ross Girshick, “Masked autoencoders are scalable vision learners,” in Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2022, pp. 16000–16009
2022
-
[65]
Masked siamese networks for label-efficient learning,
Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Florian Bordes, Pascal Vincent, Armand Joulin, Mike Rabbat, and Nicolas Ballas, “Masked siamese networks for label-efficient learning,” inProc. Eur. Conf. Comp. Vis.Springer, 2022, pp. 456–473. Shengxi Li(Member, IEEE) received the Ph.D. degree in electrical and electronic engineering from Imp...
2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.