Cross-Resolution Diffusion Models via Network Pruning
Pith reviewed 2026-05-10 19:57 UTC · model grok-4.3
The pith
Pruning certain weights in diffusion models restores image quality at resolutions not seen during training.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The core discovery is that resolution shifts cause certain weights in the UNet of diffusion models to become adverse, weakening semantic alignment and causing instability. By selectively pruning these adverse weights in a block-wise manner and amplifying the pruned predictions, CR-Diff achieves improved perceptual fidelity and semantic coherence at unseen resolutions across various backbones while preserving default performance and enabling prompt-specific refinements.
What carries the argument
Block-wise pruning of resolution-dependent adverse weights in the diffusion UNet, followed by pruned output amplification to purify predictions.
Load-bearing premise
That the degradation at different resolutions stems mainly from identifiable adverse weights removable by block-wise pruning without causing new problems in the model.
What would settle it
Run the unpruned and pruned models on the same set of prompts at a shifted resolution, then check whether the pruned version shows measurably higher perceptual quality and fewer structural artifacts. A failure to improve would challenge the claim.
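The check above amounts to a paired comparison over prompts. The following is a minimal sketch, assuming a per-prompt quality score (higher is better) has already been computed for both models with some metric of choice. The function name `paired_bootstrap` and the toy scores are hypothetical and do not come from the paper.

```python
import random
from statistics import mean


def paired_bootstrap(dense_scores, pruned_scores, n_resamples=2000, seed=0):
    """Return (mean per-prompt gain, share of resamples with a positive mean gain)."""
    if len(dense_scores) != len(pruned_scores):
        raise ValueError("scores must be paired per prompt")
    # Per-prompt difference: positive means the pruned model scored higher.
    diffs = [p - d for d, p in zip(dense_scores, pruned_scores)]
    rng = random.Random(seed)
    n = len(diffs)
    wins = 0
    for _ in range(n_resamples):
        # Resample prompts with replacement and check the sign of the mean gain.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if mean(sample) > 0:
            wins += 1
    return mean(diffs), wins / n_resamples


# Toy, made-up scores for five prompts (not results from the paper).
dense = [0.61, 0.58, 0.70, 0.52, 0.66]
pruned = [0.65, 0.60, 0.71, 0.59, 0.69]
mean_gain, support = paired_bootstrap(dense, pruned)
print(f"mean gain = {mean_gain:.3f}, bootstrap support = {support:.2f}")
```

A support value near 0.5, or a mean gain near zero, would be the "failure to improve" case described above.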
Original abstract
Diffusion models have demonstrated impressive image synthesis performance, yet many UNet-based models are trained at certain fixed resolutions. Their quality tends to degrade when generating images at out-of-training resolutions. We trace this issue to resolution-dependent parameter behaviors, where weights that function well at the default resolution can become adverse when spatial scales shift, weakening semantic alignment and causing structural instability in the UNet architecture. Based on this analysis, this paper introduces CR-Diff, a novel method that improves the cross-resolution visual consistency by pruning some parameters of the diffusion model. Specifically, CR-Diff has two stages. It first performs block-wise pruning to selectively eliminate adverse weights. Then, a pruned output amplification is conducted to further purify the pruned predictions. Empirically, extensive experiments suggest that CR-Diff can improve perceptual fidelity and semantic coherence across various diffusion backbones and unseen resolutions, while largely preserving the performance at default resolutions. Additionally, CR-Diff supports prompt-specific refinement, enabling quality enhancement on demand.
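To make the first stage concrete, here is a minimal sketch of block-wise pruning with a separate ratio per block group. It rests on loud assumptions: the page does not state how adverse weights are scored, so smallest-magnitude is used purely as a stand-in criterion, and the names `prune_block`, `prune_blockwise`, and the toy weights are hypothetical.

```python
def prune_block(weights, ratio):
    """Zero out the `ratio` fraction of entries with the smallest magnitude.

    NOTE: magnitude is a placeholder; the paper's adverse-weight criterion
    is not described on this page.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    n_prune = int(len(weights) * ratio)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def prune_blockwise(model, ratios):
    """Apply a separate pruning ratio to each named block group."""
    return {name: prune_block(w, ratios.get(name, 0.0)) for name, w in model.items()}


# Toy stand-in for a UNet: one flat weight list per block group.
toy_unet = {
    "down": [0.9, -0.05, 0.4, 0.01],
    "mid": [0.3, -0.7],
    "up": [0.02, 0.8, -0.6, 0.1],
}
ratios = {"down": 0.5, "mid": 0.0, "up": 0.25}
pruned = prune_blockwise(toy_unet, ratios)
print(pruned)
```

Keeping the ratios in a per-group dictionary reflects the idea that different parts of the UNet may tolerate different amounts of removal. A real implementation would operate on tensors rather than flat lists.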
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CR-Diff, a two-stage post-training procedure for UNet-based diffusion models. It first performs block-wise pruning to remove weights that exhibit adverse behavior at out-of-training resolutions (identified via analysis of resolution-dependent parameter effects), then applies pruned output amplification to refine the predictions. The central claim is that this selectively improves perceptual fidelity and semantic coherence at unseen resolutions across multiple backbones while largely preserving performance at the default training resolution, and that it additionally supports prompt-specific refinement.
Significance. If the hypothesized causal mechanism is isolated and the empirical gains are reproducible, the work would provide a lightweight, training-free adaptation strategy for increasing resolution flexibility in pre-trained diffusion models. This addresses a practical limitation in current generative pipelines without the cost of full retraining or architectural changes.
major comments (2)
- [Method (two-stage procedure)] The central claim requires that resolution-dependent adverse weights can be reliably identified and that their removal (plus amplification) produces gains beyond generic pruning effects. No ablation is described that compares the block-wise adverse-weight criterion against random pruning, magnitude-based pruning, or pruning without the subsequent amplification stage; without such controls, improvements could be explained by sparsity-induced regularization rather than the proposed mechanism.
- [Abstract and Experiments] The abstract asserts that 'extensive experiments suggest' improvements in perceptual fidelity and semantic coherence, yet supplies no quantitative metrics, tables of FID/LPIPS scores, ablation tables, or error bars at specific unseen resolutions. This absence makes it impossible to assess effect sizes or consistency of the cross-resolution gains.
minor comments (2)
- [Method] The term 'pruned output amplification' is introduced without a formal definition, equation, or pseudocode; a precise formulation of the amplification operator would improve reproducibility.
- [Method] The manuscript should clarify whether the block-wise pruning decisions are made once per backbone or recomputed per prompt, as the prompt-specific refinement claim implies the latter but the description is ambiguous.
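To illustrate what a formal definition of the amplification operator could look like, here is one candidate form. It is purely an assumption for illustration: the page quotes only a factor "k>1", and the symbols below (the dense prediction, the pruned prediction, and the linear extrapolation between them) are hypothetical rather than the paper's stated operator.

```latex
% Hypothetical formulation, not taken from the paper:
% \epsilon_{\mathrm{dense}}  : noise prediction of the unpruned model
% \epsilon_{\mathrm{pruned}} : noise prediction of the pruned model
\hat{\epsilon}_{\mathrm{amp}}
  = \epsilon_{\mathrm{dense}}
  + k \,\bigl(\epsilon_{\mathrm{pruned}} - \epsilon_{\mathrm{dense}}\bigr),
  \qquad k > 1
```

Under this assumed form, k = 1 would return the pruned prediction unchanged, and larger k would push further along the direction in which pruning changed the output. Whether the paper uses this or a different operator is exactly what the minor comment asks the authors to state.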
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to incorporate the requested controls and quantitative results.
- Referee: [Method (two-stage procedure)] The central claim requires that resolution-dependent adverse weights can be reliably identified and that their removal (plus amplification) produces gains beyond generic pruning effects. No ablation is described that compares the block-wise adverse-weight criterion against random pruning, magnitude-based pruning, or pruning without the subsequent amplification stage; without such controls, improvements could be explained by sparsity-induced regularization rather than the proposed mechanism.
  Authors: We agree that the manuscript would be strengthened by explicit ablations isolating the block-wise adverse-weight criterion. In the revised version we will add comparisons to random pruning, magnitude-based pruning, and the pruning stage without amplification. These controls will be reported with the same evaluation protocol to demonstrate that the observed gains exceed generic sparsity effects and arise from the resolution-dependent analysis.
  Revision: yes
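The promised controls only isolate the criterion if every baseline removes the same number of weights. A minimal sketch of matched-sparsity baseline masks follows; the helper names (`magnitude_mask`, `random_mask`, `apply_mask`) and the toy weights are hypothetical, and the paper's own criterion would be added as a third mask built the same way.

```python
import random


def magnitude_mask(weights, sparsity):
    """Keep-mask that drops the smallest-magnitude fraction of weights."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [i not in drop for i in range(len(weights))]


def random_mask(weights, sparsity, seed=0):
    """Keep-mask that drops a uniformly random subset of the same size."""
    n_prune = int(len(weights) * sparsity)
    drop = set(random.Random(seed).sample(range(len(weights)), n_prune))
    return [i not in drop for i in range(len(weights))]


def apply_mask(weights, mask):
    """Zero every weight whose mask entry is False."""
    return [w if keep else 0.0 for w, keep in zip(weights, mask)]


# Toy weights (made up); both baselines are built at the same sparsity.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, 0.03, -0.3]
sparsity = 0.5
masks = {
    "magnitude": magnitude_mask(weights, sparsity),
    "random": random_mask(weights, sparsity),
}
for name, mask in masks.items():
    print(name, sum(mask), apply_mask(weights, mask))
```

Because both masks keep the same number of weights, any quality difference between the resulting models can be attributed to which weights were removed rather than how many.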
- Referee: [Abstract and Experiments] The abstract asserts that 'extensive experiments suggest' improvements in perceptual fidelity and semantic coherence, yet supplies no quantitative metrics, tables of FID/LPIPS scores, ablation tables, or error bars at specific unseen resolutions. This absence makes it impossible to assess effect sizes or consistency of the cross-resolution gains.
  Authors: We acknowledge that the abstract and main text currently lack the requested quantitative tables and error bars. The revised manuscript will expand the abstract to reference key metrics and will include new tables reporting FID, LPIPS, and other scores with standard deviations across multiple unseen resolutions and backbones. This will allow direct assessment of effect sizes and reproducibility.
  Revision: yes
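A table of the kind promised here can be assembled from repeated runs. The sketch below assumes one score per random seed for each (backbone, resolution, method) cell. All labels and numbers are placeholders invented for illustration and are not results from the paper.

```python
from statistics import mean, stdev

# Placeholder per-seed scores keyed by (backbone, resolution, method).
# These values are made up; substitute real metric outputs.
runs = {
    ("backbone-A", "768x768", "dense"): [31.2, 30.8, 31.5],
    ("backbone-A", "768x768", "pruned"): [28.9, 29.4, 29.1],
}


def summarize(runs):
    """Return one row per cell: labels, mean, sample std, and run count."""
    rows = []
    for (backbone, res, method), scores in sorted(runs.items()):
        # stdev needs at least two runs; a single run has no error bar.
        rows.append((backbone, res, method, mean(scores), stdev(scores), len(scores)))
    return rows


for backbone, res, method, mu, sigma, n in summarize(runs):
    print(f"{backbone:12s} {res:9s} {method:7s} {mu:6.2f} +/- {sigma:4.2f} (n={n})")
```

Reporting the run count next to each standard deviation lets a reader judge how much weight the error bars can carry.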
Circularity Check
No significant circularity detected
The paper traces degradation to resolution-dependent adverse weights via empirical analysis, then applies block-wise pruning followed by output amplification as a two-stage procedure. This chain does not reduce any central claim to a self-defined quantity, a fitted parameter renamed as prediction, or a load-bearing self-citation; the method is presented as an external, experimentally validated intervention whose effectiveness is tested on multiple backbones and unseen resolutions rather than derived tautologically from its inputs.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: We trace this issue to resolution-dependent parameter behaviors, where weights that function well at the default resolution can become adverse when spatial scales shift... block-wise pruning to selectively eliminate adverse weights. Then, a pruned output amplification...
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: CR-Diff has two stages. It first performs block-wise pruning... pruned output amplification... k>1
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
[1] Alireza Aghasi, Afshin Abdi, Nam Nguyen, and Justin Romberg. Net-trim: Convex pruning of deep neural networks with performance guarantee. In NeurIPS, 2017.
[2] Black Forest Labs. Flux. https://blackforestlabs.ai/, 2024. Accessed: 2025-09-
[3] Thibault Castells, Hyoung-Kyu Song, Bo-Kyeong Kim, and Shinkook Choi. Ld-pruner: Efficient pruning of latent diffusion models using task-agnostic insights. In CVPR, 2024.
[4] Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, and Zhenguo Li. Pixart-σ: Weak-to-strong training of diffusion transformer for 4k text-to-image generation. In ECCV, 2024.
[5] Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, and Zhenguo Li. Pixart-σ: Weak-to-strong training of diffusion transformer for 4k text-to-image generation. In ECCV. Springer, 2024.
[6] Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Zhongdao Wang, James T Kwok, Ping Luo, Huchuan Lu, and Zhenguo Li. Pixart-α: Fast training of diffusion transformer for photorealistic text-to-image synthesis. In ICLR, 2024.
[7] Junsong Chen, Shuchen Xue, Yuyang Zhao, Jincheng Yu, Sayak Paul, Junyu Chen, Han Cai, Song Han, and Enze Xie. Sana-sprint: One-step diffusion with continuous-time consistency distillation. In ICCV, 2025.
[8] Prafulla Dhariwal and Alexander Nichol. Diffusion models beat gans on image synthesis. In NeurIPS, 2021.
[9] Alexey Dosovitskiy. An image is worth 16x16 words: Transformers for image recognition at scale. In ICLR,
[10] Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling rectified flow transformers for high-resolution image synthesis. In ICML, 2024.
[11] Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, and Xinchao Wang. Depgraph: Towards any structural pruning. In CVPR, 2023.
[12] Gongfan Fang, Xinyin Ma, and Xinchao Wang. Structural pruning for diffusion models. In NeurIPS, 2023.
[13] Gongfan Fang, Kunjun Li, Xinyin Ma, and Xinchao Wang. Tinyfusion: Diffusion transformers learned shallow. In CVPR, 2025.
[14] Sicheng Feng, Keda Tao, and Huan Wang. Is oracle pruning the true oracle? arXiv preprint arXiv:2412.00143,
[15] Elias Frantar and Dan Alistarh. Optimal brain compression: A framework for accurate post-training quantization and pruning. In NeurIPS, 2022.
[16] Elias Frantar and Dan Alistarh. Sparsegpt: Massive language models can be accurately pruned in one-shot. In ICML, 2023.
[17] Song Han, Jeff Pool, John Tran, and William Dally. Learning both weights and connections for efficient neural network. In NeurIPS, 2015.
[18] Song Han, Huizi Mao, and William J Dally. Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding. In ICLR,
[19] Babak Hassibi, David G Stork, and Gregory J Wolff. Optimal brain surgeon and general network pruning. In NeurIPS, 1992.
[20] Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi. Clipscore: A reference-free evaluation metric for image captioning. In EMNLP. Association for Computational Linguistics, 2021.
[21] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale update rule converge to a local nash equilibrium. In NeurIPS, 2017.
[22] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In NeurIPS, 2020.
[23] Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, and Shinkook Choi. Bk-sdm: A lightweight, fast, and cheap version of stable diffusion. In ECCV, 2024.
[24] Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, and Omer Levy. Pick-a-pic: An open dataset of user preferences for text-to-image generation. In NeurIPS, 2023.
[25] Yann LeCun, John Denker, and Sara Solla. Optimal brain damage. In NeurIPS, 1989.
[26] Yanyu Li, Huan Wang, Qing Jin, Ju Hu, Pavlo Chemerys, Yun Fu, Yanzhi Wang, Sergey Tulyakov, and Jian Ren. Snapfusion: Text-to-image diffusion model on mobile devices within two seconds. In NeurIPS, 2023.
[27] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In ECCV, 2014.
[28] Gui Ling, Ziyang Wang, and Qingwen Liu. Slimgpt: Layer-wise structured pruning for large language models. In NeurIPS, 2024.
[29] Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, and Jan Kautz. Importance estimation for neural network pruning. In CVPR, 2019.
[30] Alexander Quinn Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob Mcgrew, Ilya Sutskever, and Mark Chen. Glide: Towards photorealistic image generation and editing with text-guided diffusion models. In ICML, 2022.
[31] NovelAI. NovelAI improvements on Stable Diffusion. https://blog.novelai.net/novelai-improvements-on-stable-diffusion-e10d38db82ac, 2022.
[32] William Peebles and Saining Xie. Scalable diffusion models with transformers. In ICCV, 2023.
[33] Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. Sdxl: Improving latent diffusion models for high-resolution image synthesis. In ICLR, 2024.
[34] Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical text-conditional image generation with clip latents. arXiv preprint arXiv:2204.06125, 2022.
[35] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In CVPR, 2022.
[36] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In MICCAI, 2015.
[37] Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L Denton, Kamyar Ghasemipour, Raphael Gontijo Lopes, Burcu Karagol Ayan, Tim Salimans, et al. Photorealistic text-to-image diffusion models with deep language understanding. In NeurIPS, 2022.
[38] Tim Salimans and Jonathan Ho. Progressive distillation for fast sampling of diffusion models. In ICLR, 2022.
[39] Xuan Shen, Hangyu Zheng, Yifan Gong, Zhenglun Kong, Changdi Yang, Zheng Zhan, Yushu Wu, Xue Lin, Yanzhi Wang, Pu Zhao, et al. Sparse learning for state space models on mobile. In ICLR, 2025.
[40] Ibne Farabi Shihab, Sanjeda Akter, and Anuj Sharma. Efficient unstructured pruning of mamba state-space models for resource-constrained environments. In EMNLP,
[41] Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In ICML, 2015.
[42] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In ICLR, 2021.
[43] Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. In NeurIPS, 2019.
[44] Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In ICLR, 2021.
[45] Mingjie Sun, Zhuang Liu, Anna Bair, and J Zico Kolter. A simple and effective pruning approach for large language models. In ICLR, 2024.
[46] Kaiwen Tuo and Huan Wang. Sparsessm: Efficient selective structured state space models can be pruned in one-shot. arXiv preprint arXiv:2506.09613, 2025.
[47] Huan Wang and Yun Fu. Trainability preserving neural pruning. In ICLR, 2023.
[48] Huan Wang, Can Qin, Yulun Zhang, and Yun Fu. Neural pruning via growing regularization. In ICLR, 2021.
[49] Jiateng Wei, Quan Lu, Ning Jiang, Siqi Li, Jingyang Xiang, Jun Chen, and Yong Liu. Structured optimal brain pruning for large language models. In EMNLP, 2024.
[50] Enze Xie, Junsong Chen, Junyu Chen, Han Cai, Haotian Tang, Yujun Lin, Zhekai Zhang, Muyang Li, Ligeng Zhu, Yao Lu, et al. Sana: Efficient high-resolution image synthesis with linear diffusion transformers. In ICLR, 2025.
[51] Enze Xie, Junsong Chen, Yuyang Zhao, Jincheng YU, Ligeng Zhu, Yujun Lin, Zhekai Zhang, Muyang Li, Junyu Chen, Han Cai, et al. Sana 1.5: Efficient scaling of training-time and inference-time compute in linear diffusion transformer. In ICML, 2025.
[52] Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, and Yuxiao Dong. Imagereward: Learning and evaluating human preferences for text-to-image generation. In NeurIPS, 2023.
[53] Dingkun Zhang, Sijia Li, Chen Chen, Qingsong Xie, and Haonan Lu. Laptop-diff: Layer pruning and normalized distillation for compressing diffusion models. arXiv preprint arXiv:2404.11098, 2024.
[54] Yang Zhang, Er Jin, Yanfei Dong, Ashkan Khakzar, Philip Torr, Johannes Stegmaier, and Kenji Kawaguchi. Effortless efficiency: Low-cost pruning of diffusion models. arXiv preprint arXiv:2412.02852, 2024.
[55] Yang Zhao, Yanwu Xu, Zhisheng Xiao, Haolin Jia, and Tingbo Hou. Mobilediffusion: Instant text-to-image generation on mobile devices. In ECCV, 2024.
[56] Junhan Zhu, Hesong Wang, Mingluo Su, Zefang Wang, and Huan Wang. Obs-diff: Accurate pruning for diffusion models in one-shot. arXiv preprint arXiv:2510.06751, 2025.
Supplementary Material
Block-wise Pruning Ratio Configurations
As discussed in Section 3.1, the UNet architecture comprises downsampling, middle, and upsampling blocks, which differ in redundancy and tolerance to parameter removal. This is further supported by our pruning ratio search experiments across multiple diffusion model families and sampling resolutions, with the re...
Full Ablation Study of POA
To more comprehensively illustrate the effect of the pruned output amplification (POA) mechanism, we provide the full ablation results across models and resolutions in Table 7, which were omitted from the main paper due to space constraints. This output-level refinement consistently improves generative quality across archit...
Simulated Annealing (SA) Algorithm
Algorithm 1 summarizes the simulated annealing (SA) routine used to search for the optimal pruning ratio configuration r = (r_down, r_mid, r_up). The hyperparameters include the initial temperature T_init, cooling rate α, iteration budget N_iter, a set of candidate seeds S_seeds, and a restart limit R_max. Starting from the be...
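Since the description of Algorithm 1 is cut off here, the following is only a generic simulated annealing sketch over a three-ratio configuration, not a reconstruction of the paper's routine. It uses an initial temperature, a geometric cooling rate, and an iteration budget as named above; the candidate seeds and restart limit are omitted. The step size, the `ratio_cap` bound, and the toy objective `toy_score` are hypothetical stand-ins for an image-quality evaluation.

```python
import math
import random


def simulated_annealing(score, r_init, t_init=1.0, alpha=0.95, n_iter=500,
                        step=0.05, ratio_cap=0.9, seed=0):
    """Maximize score(r) over r = (r_down, r_mid, r_up) with standard SA."""
    rng = random.Random(seed)
    current = tuple(r_init)
    current_score = score(current)
    best, best_score = current, current_score
    temp = t_init
    for _ in range(n_iter):
        # Propose: perturb one randomly chosen ratio and clamp it to [0, ratio_cap].
        idx = rng.randrange(len(current))
        cand = list(current)
        cand[idx] = min(ratio_cap, max(0.0, cand[idx] + rng.uniform(-step, step)))
        cand = tuple(cand)
        cand_score = score(cand)
        delta = cand_score - current_score
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = cand, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
        temp *= alpha  # geometric cooling
    return best, best_score


def toy_score(r):
    """Made-up objective peaking at (0.3, 0.1, 0.2); a real one would score images."""
    target = (0.3, 0.1, 0.2)
    return -sum((a - b) ** 2 for a, b in zip(r, target))


best_r, best_s = simulated_annealing(toy_score, (0.0, 0.0, 0.0))
print(best_r, best_s)
```

In practice each call to the objective would require generating and scoring images, so the iteration budget dominates the cost of the search.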
Analyses on Unseen Resolutions
Beyond the detailed analysis in Section 4.2, which demonstrates consistent improvements under CR-Diff at unseen resolutions, we provide additional analyses at higher resolutions for SDXL. SDXL, natively trained at 1024×1024 with a resampler and high-resolution cross-attention, effectively internalizes dense object structur...
Expanded Qualitative Analyses
Representative Teaser Results. In Figures 9 and 10, we present additional representative teaser examples following the style of Figure 1, further illustrating the effectiveness of CR-Diff in enhancing cross-resolution visual consistency over the dense SDXL [33].
Results on the 5K Dataset. In Figures 11, 12, and 13, we pre...