Rethinking Token Reduction for Diffusion Models via Output-Similarity-Awareness
Pith reviewed 2026-05-22 07:27 UTC · model grok-4.3
The pith
Diffusion transformers can reduce tokens by matching them on output similarities from prior steps, used as proxies, rather than on input similarities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DiTo shifts token reduction to an output-centric view, treating the preservation of output similarities across adjacent timesteps as a reliable proxy. Similarities from the prior step establish token correspondences at a Matching timestep, and those correspondences are then applied unchanged across multiple subsequent Reduction timesteps. Pair Match Ratio (PMR) scheduling sets the reuse interval, and a selection-frequency penalty corrects for localized errors.
What carries the argument
Preservation of output token similarity across adjacent timesteps, used as a proxy to set token correspondences at matching steps for reuse in reduction steps.
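The match-then-reuse idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the fixed destination set `dst_idx`, and the drop-and-copy form of reduction are all hypothetical choices made for the example (the paper may merge tokens differently), and the model call is replaced by the identity.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def match_from_prior_outputs(prev_outputs, dst_idx):
    """Matching step (assumed form): map every non-destination token to the
    destination token it is most similar to, judged on the PREVIOUS step's outputs."""
    dst = set(dst_idx)
    mapping = {}
    for i, tok in enumerate(prev_outputs):
        if i not in dst:
            mapping[i] = max(dst_idx, key=lambda j: cosine(tok, prev_outputs[j]))
    return mapping

def reduce_tokens(tokens, dst_idx):
    """Reduction step (assumed form): keep only the destination tokens."""
    return [tokens[j] for j in dst_idx]

def restore_tokens(reduced_outputs, dst_idx, mapping, n_tokens):
    """Rebuild a full-length output by copying each destination's output
    to every token that was mapped onto it."""
    pos = {j: k for k, j in enumerate(dst_idx)}
    full = [None] * n_tokens
    for j in dst_idx:
        full[j] = reduced_outputs[pos[j]]
    for i, j in mapping.items():
        full[i] = reduced_outputs[pos[j]]
    return full

# Toy usage: one Matching step, then the same mapping reused for two Reduction steps.
outputs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
dst_idx = [0, 2]
mapping = match_from_prior_outputs(outputs, dst_idx)
for _ in range(2):
    kept = reduce_tokens(outputs, dst_idx)  # a real model would process `kept` here
    outputs = restore_tokens(kept, dst_idx, mapping, len(outputs))
```

The point of the sketch is only that the similarity computation happens once, at the Matching step, while each Reduction step pays for an index lookup and a copy.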
If this is right
- Image quality measured by PSNR rises 1.6 to 3.9 dB above existing token-reduction baselines at matching speedup factors.
- The quality-speed tradeoff curve improves, placing the method on a better Pareto frontier than prior approaches.
- Penalizing frequently selected tokens keeps repeated reuse of the same correspondences from creating blocking artifacts.
- The overall schedule balances matching cost against reduction savings through the pair-match-ratio metric.
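One way a selection-frequency penalty could work is sketched below. The exact form of the penalty is not given in this summary, so the linear penalty, the `penalty` weight, and the function name `penalized_match` are assumptions made purely for illustration.

```python
def penalized_match(sim_rows, dst_idx, counts, penalty=0.1):
    """Greedy matching with a linear selection-frequency penalty (assumed form).

    sim_rows: {source token index: {destination index: similarity}}
    counts:   {destination index: times selected so far}, updated in place
    Each source picks the destination maximizing similarity - penalty * count.
    """
    mapping = {}
    for src, row in sim_rows.items():
        best = max(dst_idx, key=lambda j: row[j] - penalty * counts.get(j, 0))
        mapping[src] = best
        counts[best] = counts.get(best, 0) + 1
    return mapping

# Three sources that all slightly prefer destination 0 on raw similarity.
sim_rows = {
    1: {0: 0.95, 2: 0.90},
    3: {0: 0.94, 2: 0.90},
    4: {0: 0.93, 2: 0.90},
}
unpenalized = penalized_match(sim_rows, [0, 2], {}, penalty=0.0)
penalized = penalized_match(sim_rows, [0, 2], {}, penalty=0.1)
```

In this toy case the unpenalized match sends all three sources to destination 0, while the penalized match spreads them across both destinations. That spreading is the intuition behind using such a penalty to limit localized error from repeated reuse.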
Where Pith is reading between the lines
- The same output-similarity proxy idea could be tested in other iterative generative processes such as flow-based or autoregressive models.
- If output similarity patterns prove consistent in video or 3D generation, the reuse schedule might extend to those domains with only minor changes.
- The frequency penalty suggests a general way to control approximation error when any similarity measure is reused across steps.
Load-bearing premise
Output token similarities stay stable enough from one timestep to the next that earlier measurements can stand in for current ones without large errors.
What would settle it
Direct computation of output token similarities at consecutive timesteps shows large changes in which tokens are similar, causing the proxy correspondences to produce visible quality drops.
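A check of this kind could look like the sketch below. It assumes output tokens are available as lists of floats at two consecutive timesteps, and it uses the Pearson correlation between off-diagonal cosine similarities as the stability score. Both the score and the function names are illustrative assumptions, not the paper's protocol.

```python
import math

def cosine_matrix(tokens):
    """Pairwise cosine similarities; assumes every token has a non-zero norm."""
    norms = [math.sqrt(sum(x * x for x in t)) for t in tokens]
    n = len(tokens)
    return [[sum(a * b for a, b in zip(tokens[i], tokens[j])) / (norms[i] * norms[j])
             for j in range(n)] for i in range(n)]

def pearson(xs, ys):
    """Pearson correlation; undefined (raises ZeroDivisionError) if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def similarity_stability(outputs_t, outputs_next):
    """Correlation between the upper-triangle entries of the two similarity matrices.
    Values near 1 mean the similarity structure carried over to the next timestep."""
    a, b = cosine_matrix(outputs_t), cosine_matrix(outputs_next)
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return pearson([a[i][j] for i, j in pairs], [b[i][j] for i, j in pairs])

# Toy data: identical outputs versus outputs whose similarity structure is shuffled.
step_t = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
shuffled = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
stable_score = similarity_stability(step_t, step_t)
unstable_score = similarity_stability(step_t, shuffled)
```

A score that stays high across the whole sampling trajectory would support the premise. A score that drops sharply at some timesteps would indicate where reused correspondences are most at risk.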
Abstract
Diffusion Transformers (DiTs) achieve superior image generation quality but suffer from quadratic computational complexity relative to token count. While various token reduction (TR) methods have been proposed to mitigate this cost, they overlook the primary objective of generative models: minimizing recovery error, which requires reflecting output token similarity. They rely solely on input token similarity inherited from reduction-only ViT paradigms, leading to a fundamental misalignment with this objective. To bridge this gap, we propose DiTo, a novel TR paradigm that shifts the focus toward output-centric token reduction. Based on the observation that output token similarity is consistently preserved across adjacent timesteps, DiTo utilizes prior-step similarities as an effective proxy to establish token correspondences at a Matching timestep, which are then reused across multiple subsequent Reduction timesteps. To optimize this interleaved scheduling, we propose Pair Match Ratio (PMR)-guided Interval Scheduling to determine the optimal matching frequency. Furthermore, to mitigate localized approximation errors and resulting blocking artifacts caused by repeated reuse, we propose Frequency-aware Token Matching by incorporating a selection-frequency penalty. Extensive experiments demonstrate that DiTo consistently outperforms existing TR methods with 1.6-3.9 dB higher PSNR at comparable speedups, achieving a superior Pareto frontier.
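For a sense of scale, the standard PSNR definition converts the reported dB gains into mean-squared-error ratios. The conversion below is a worked illustration for a single image scored against the same reference with the same peak value; the subscripts are labels introduced here, and the ratios are arithmetic consequences of the definition, not results reported by the paper.

```latex
\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}},
\qquad
\Delta\mathrm{PSNR}
  = \mathrm{PSNR}_{\text{DiTo}} - \mathrm{PSNR}_{\text{base}}
  = 10 \log_{10} \frac{\mathrm{MSE}_{\text{base}}}{\mathrm{MSE}_{\text{DiTo}}}

\Delta\mathrm{PSNR} = 1.6\ \text{dB}
  \;\Rightarrow\;
  \frac{\mathrm{MSE}_{\text{base}}}{\mathrm{MSE}_{\text{DiTo}}} = 10^{0.16} \approx 1.45,
\qquad
\Delta\mathrm{PSNR} = 3.9\ \text{dB}
  \;\Rightarrow\;
  \frac{\mathrm{MSE}_{\text{base}}}{\mathrm{MSE}_{\text{DiTo}}} = 10^{0.39} \approx 2.45
```

Under that single-image reading, the claimed range corresponds to roughly 31% to 59% lower mean squared error. PSNR averaged over a dataset does not convert this cleanly.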
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes DiTo, an output-centric token reduction paradigm for Diffusion Transformers. It observes that output token similarities are preserved across adjacent timesteps and reuses prior-step similarity matrices as proxies to establish token correspondences at a Matching timestep, which are then applied over multiple Reduction timesteps. PMR-guided Interval Scheduling determines matching frequency, while Frequency-aware Token Matching adds a selection-frequency penalty to reduce blocking artifacts. The central claim is that DiTo achieves 1.6-3.9 dB higher PSNR than prior TR methods at comparable speedups and a superior Pareto frontier.
Significance. If the performance claims and underlying similarity-preservation assumption hold under rigorous testing, DiTo would represent a meaningful shift from input-similarity-based TR methods toward alignment with the generative objective of minimizing recovery error. This could improve practical efficiency of DiT inference for high-resolution synthesis without proportional quality loss.
major comments (2)
- [Abstract / DiTo design paragraph] Abstract and DiTo design description: the core assumption that 'output token similarity is consistently preserved across adjacent timesteps' so that prior-step similarities serve as reliable proxies is stated without any supporting quantification (e.g., per-timestep cosine similarity statistics, correlation coefficients between consecutive output similarity matrices, or ablation on proxy misalignment). This assumption is load-bearing for the interleaved Matching/Reduction schedule and the claimed PSNR gains; without it the reuse mechanism risks accumulating correspondence errors, especially in later timesteps or high-frequency regions.
- [Experiments] Experimental results section: the reported 1.6-3.9 dB PSNR improvements and superior Pareto frontier are presented without reference to specific baselines, number of random seeds, statistical significance tests, exact model configurations (e.g., DiT-XL/2 at 512×512), or failure-case analysis. This makes it impossible to verify whether the gains are robust or whether the frequency-aware penalty actually mitigates the artifacts predicted by the proxy-reuse hypothesis.
minor comments (2)
- [Method] Notation for PMR and the interval-scheduling rule is introduced without an explicit equation or pseudocode; a compact definition would improve reproducibility.
- [Figures] Figure captions and axis labels for the Pareto curves should explicitly state the exact speedup metric (e.g., tokens per second or FLOPs reduction) and the reference full-token baseline.
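To make the Method comment concrete, here is one hypothetical reading of PMR and of an interval rule built on it. The source does not define either, so the definition (the fraction of source tokens that keep the same matched partner), the `threshold` value, and both function names are assumptions for illustration only.

```python
def pair_match_ratio(ref_mapping, cur_mapping):
    """Assumed PMR: the share of source tokens in `ref_mapping` whose matched
    destination is unchanged in `cur_mapping`. Returns 1.0 for an empty reference."""
    if not ref_mapping:
        return 1.0
    same = sum(1 for src, dst in ref_mapping.items() if cur_mapping.get(src) == dst)
    return same / len(ref_mapping)

def reuse_interval(mappings, threshold=0.8):
    """Given mappings profiled at consecutive timesteps, count how many of the
    following steps could reuse mappings[0] before the assumed PMR drops below
    `threshold`."""
    steps = 0
    for cur in mappings[1:]:
        if pair_match_ratio(mappings[0], cur) < threshold:
            break
        steps += 1
    return steps

# Toy profile: the matching is unchanged for one step, then one of four pairs flips.
m0 = {1: 0, 3: 2, 4: 0, 5: 2}
m1 = {1: 0, 3: 2, 4: 0, 5: 2}
m2 = {1: 0, 3: 2, 4: 0, 5: 0}
interval = reuse_interval([m0, m1, m2], threshold=0.8)
```

With these toy mappings the ratio is 1.0 at the first following step and 0.75 at the second, so the rule would allow one Reduction step before rematching. An explicit definition in the paper would settle whether PMR is measured this way.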
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to strengthen the presentation of the core assumption and experimental reporting.
- Referee: [Abstract / DiTo design paragraph] Abstract and DiTo design description: the core assumption that 'output token similarity is consistently preserved across adjacent timesteps' so that prior-step similarities serve as reliable proxies is stated without any supporting quantification (e.g., per-timestep cosine similarity statistics, correlation coefficients between consecutive output similarity matrices, or ablation on proxy misalignment). This assumption is load-bearing for the interleaved Matching/Reduction schedule and the claimed PSNR gains; without it the reuse mechanism risks accumulating correspondence errors, especially in later timesteps or high-frequency regions.
Authors: We agree that explicit quantification would strengthen the manuscript. While the similarity-preservation observation underpins the PMR-guided scheduling and reuse mechanism, the initial submission did not include per-timestep statistics. In the revision we will add cosine-similarity statistics across adjacent timesteps, correlation coefficients between consecutive output similarity matrices, and an ablation on proxy misalignment to show that correspondence errors remain limited and do not materially degrade the reported PSNR gains.
Revision: yes
- Referee: [Experiments] Experimental results section: the reported 1.6-3.9 dB PSNR improvements and superior Pareto frontier are presented without reference to specific baselines, number of random seeds, statistical significance tests, exact model configurations (e.g., DiT-XL/2 at 512×512), or failure-case analysis. This makes it impossible to verify whether the gains are robust or whether the frequency-aware penalty actually mitigates the artifacts predicted by the proxy-reuse hypothesis.
Authors: We acknowledge that additional experimental details are required for full reproducibility and verification. In the revised manuscript we will explicitly name the baseline TR methods, report results averaged over multiple random seeds, include statistical significance tests where applicable, specify the exact DiT configurations and resolutions (including DiT-XL/2 at 512×512), and provide a failure-case analysis demonstrating how the frequency-aware penalty reduces blocking artifacts arising from repeated proxy reuse.
Revision: yes
Circularity Check
No significant circularity: design choices are independent heuristics validated empirically.
The paper states an observation about output token similarity preservation across timesteps and uses it to motivate two new algorithmic components (PMR-guided Interval Scheduling and Frequency-aware Token Matching with penalty). These are presented as design decisions rather than derived equations. No load-bearing step reduces by construction to a fitted parameter, self-referential definition, or unverified self-citation chain. The central claim (PSNR gains on a Pareto frontier) rests on experimental comparisons, not on any internal equivalence that would make the result tautological. This is the common case of a heuristic method with external empirical grounding.
Axiom & Free-Parameter Ledger
free parameters (1)
- PMR interval parameters
axioms (1)
- domain assumption: Output token similarity is consistently preserved across adjacent timesteps