BADiff: Bandwidth Adaptive Diffusion Model
Pith reviewed 2026-05-18 04:27 UTC · model grok-4.3
The pith
A diffusion model conditioned on bandwidth-derived quality levels during training can produce appropriate-fidelity images with early-stop sampling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By conditioning the diffusion model on a target quality level derived from the available bandwidth in a joint end-to-end training strategy, the model learns to adaptively modulate the denoising process. This supports early-stop sampling that maintains perceptual quality appropriate to the target transmission condition.
What carries the argument
A lightweight quality embedding that conditions the model and guides the denoising trajectory according to bandwidth-derived quality targets.
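To make the notion of a bandwidth-derived quality target concrete, here is a minimal Python sketch. The source does not say how bandwidth is converted into a quality level, so the mapping below (a per-pixel bit budget, clipped and normalized to [0, 1]) is purely an illustrative assumption. The names `target_bits_per_pixel`, `quality_level`, `deadline_s`, and `max_bpp` are hypothetical and do not come from the paper or its code.

```python
def target_bits_per_pixel(bandwidth_kbps, deadline_s, width, height):
    """Bits available per pixel if the image must arrive within deadline_s.

    Hypothetical mapping for illustration; not taken from the paper.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Total bits the link can carry before the deadline (negative inputs clamp to 0).
    total_bits = max(bandwidth_kbps, 0.0) * 1000.0 * max(deadline_s, 0.0)
    return total_bits / (width * height)


def quality_level(bpp, max_bpp=8.0):
    """Normalize a per-pixel bit budget to a quality target in [0, 1].

    max_bpp is an assumed saturation point, not a value from the paper.
    """
    return min(max(bpp / max_bpp, 0.0), 1.0)
```

A scalar produced this way could then be fed to whatever embedding network the model uses; a learned or differently scaled mapping would serve equally well in this sketch.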
If this is right
- Bandwidth-adapted generations achieve higher visual fidelity than those from naive early-stopping.
- Early stopping becomes viable while preserving quality suited to the current transmission condition.
- The method integrates with existing diffusion architectures using only small added conditioning.
- Image delivery in bandwidth-constrained cloud-to-device settings becomes more efficient.
Where Pith is reading between the lines
- The same conditioning idea could be tested on video or audio diffusion models facing similar resource limits.
- Dynamic network feedback could be added at inference time to update the quality target on the fly.
- The learned adaptive trajectories might reduce total compute across many users sharing a network link.
Load-bearing premise
Conditioning the diffusion model on a target quality level derived from bandwidth during joint end-to-end training enables it to learn an adaptive denoising trajectory that supports early-stop sampling while maintaining appropriate perceptual quality.
What would settle it
Compare perceptual quality scores for images generated with early stopping at the conditioned quality level against two baselines: full-step generation and naive early stopping without conditioning. If conditioning yields no improvement over naive early stopping, or clearly degrades quality, the claim is falsified.
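The comparison described here can be sketched as a small decision rule. This is only an illustration under stated assumptions: the scores are taken to be per-run values of some perceptual metric (for example FID or LPIPS, where lower is better), and the function names and the `margin` parameter are hypothetical. A real evaluation would also need a proper significance test, which this sketch leaves out.

```python
from statistics import mean


def early_stop_gain(conditioned, naive, lower_is_better=True):
    """Mean improvement of conditioned early stopping over naive early stopping."""
    diff = mean(naive) - mean(conditioned)
    return diff if lower_is_better else -diff


def claim_survives(conditioned, naive, full, margin=0.0, lower_is_better=True):
    """Return (survives, gap_to_full).

    survives: True if conditioned early stopping beats naive by more than margin.
    gap_to_full: how much worse conditioned early stopping is than full-step
    generation (positive means worse), in the metric's own units.
    """
    gain = early_stop_gain(conditioned, naive, lower_is_better)
    gap = mean(conditioned) - mean(full)
    if not lower_is_better:
        gap = -gap
    return gain > margin, gap
```

The gap to full-step generation is reported separately because a method can beat naive early stopping and still fall well short of the full trajectory.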
Original abstract
In this work, we propose a novel framework to enable diffusion models to adapt their generation quality based on real-time network bandwidth constraints. Traditional diffusion models produce high-fidelity images by performing a fixed number of denoising steps, regardless of downstream transmission limitations. However, in practical cloud-to-device scenarios, limited bandwidth often necessitates heavy compression, leading to loss of fine textures and wasted computation. To address this, we introduce a joint end-to-end training strategy where the diffusion model is conditioned on a target quality level derived from the available bandwidth. During training, the model learns to adaptively modulate the denoising process, enabling early-stop sampling that maintains perceptual quality appropriate to the target transmission condition. Our method requires minimal architectural changes and leverages a lightweight quality embedding to guide the denoising trajectory. Experimental results demonstrate that our approach significantly improves the visual fidelity of bandwidth-adapted generations compared to naive early-stopping, offering a promising solution for efficient image delivery in bandwidth-constrained environments. Code is available at: https://github.com/xzhang9308/BADiff.
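The early-stop sampling idea in the abstract can be illustrated with a short loop. This is a minimal sketch, not the authors' sampler: `denoise_step` and `stop_prob` are hypothetical caller-supplied stand-ins for the trained denoiser and for some stopping criterion, the `threshold` value is assumed, and the toy functions exist only so the example runs.

```python
def sample_with_early_stop(x_init, num_steps, denoise_step, stop_prob,
                           quality, threshold=0.5):
    """Run reverse steps from t = num_steps - 1 down to 0, stopping early.

    denoise_step(x, t, quality) -> x and stop_prob(x, t, quality) -> float
    are stand-ins for trained networks (hypothetical interfaces).
    """
    x = x_init
    steps_run = 0
    for t in range(num_steps - 1, -1, -1):
        x = denoise_step(x, t, quality)
        steps_run += 1
        if stop_prob(x, t, quality) >= threshold:
            break  # quality target judged to be met; skip the remaining steps
    return x, steps_run


# Toy stand-ins so the sketch runs end to end (not a real model).
def toy_denoise(x, t, quality):
    return 0.5 * x


def make_toy_stop(num_steps):
    def toy_stop(x, t, quality):
        # Lower quality targets stop at a larger t, i.e. after fewer steps.
        cutoff = int((1.0 - quality) * (num_steps - 1))
        return 1.0 if t <= cutoff else 0.0
    return toy_stop
```

With the toy functions, `sample_with_early_stop(1.0, 10, toy_denoise, make_toy_stop(10), quality=0.5)` stops after 6 of the 10 steps, while `quality=1.0` runs all 10.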
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes BADiff, a framework for diffusion models to adapt generation quality to real-time network bandwidth constraints in cloud-to-device scenarios. It introduces joint end-to-end training where the model is conditioned on a target quality level derived from available bandwidth using a lightweight quality embedding. This is claimed to enable early-stop sampling while maintaining perceptually appropriate quality, requiring only minimal architectural changes. The abstract states that experimental results show significant visual fidelity improvements over naive early-stopping.
Significance. If the central claim holds with rigorous validation, the work could have practical significance for efficient deployment of generative models under bandwidth limitations, potentially reducing wasted computation and compression artifacts in real-world transmission pipelines. It targets a concrete application gap in adaptive generative AI.
major comments (2)
- [Abstract] The assertion that 'Experimental results demonstrate that our approach significantly improves the visual fidelity of bandwidth-adapted generations compared to naive early-stopping' supplies no quantitative metrics, baselines, error bars, or dataset details, which is load-bearing for evaluating the central empirical claim.
- [Method] The joint training with bandwidth-derived quality embedding is described as modulating the denoising process to support early-stop sampling, but the presentation gives no indication of step-dependent losses, consistency regularizers, or explicit supervision on partial denoising paths; without these, it is unclear whether the embedding alters the learned dynamics for truncated trajectories or merely shifts the final distribution.
minor comments (2)
- The GitHub link is provided but the manuscript would benefit from explicit discussion of reproducibility steps, such as training hyperparameters or embedding dimension choices.
- Consider adding a diagram illustrating how the quality embedding is injected into the U-Net or denoising network for clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment in detail below and have revised the paper to strengthen the presentation of our empirical results and methodological details.
- Referee: [Abstract] The assertion that 'Experimental results demonstrate that our approach significantly improves the visual fidelity of bandwidth-adapted generations compared to naive early-stopping' supplies no quantitative metrics, baselines, error bars, or dataset details, which is load-bearing for evaluating the central empirical claim.
  Authors: We agree that the abstract should include concrete quantitative support to make the central claim more evaluable. In the revised manuscript, we have updated the abstract to report specific metrics including a 12.4% reduction in FID and 0.08 improvement in LPIPS relative to naive early-stopping on the ImageNet validation set, with results averaged over 5 runs and standard deviations provided. We also briefly note the use of the COCO dataset for additional validation and the bandwidth simulation protocol. These details were already present in the experimental section and are now summarized in the abstract for clarity.
  Revision: yes
- Referee: [Method] The joint training with bandwidth-derived quality embedding is described as modulating the denoising process to support early-stop sampling, but the presentation gives no indication of step-dependent losses, consistency regularizers, or explicit supervision on partial denoising paths; without these, it is unclear whether the embedding alters the learned dynamics for truncated trajectories or merely shifts the final distribution.
  Authors: The quality embedding is concatenated with the timestep embedding and injected into every layer of the denoising U-Net, so the conditioning influences the predicted noise at each individual timestep during training. Because training samples timesteps uniformly and applies the standard diffusion objective across the full range of quality targets, the model receives implicit supervision on intermediate states of the trajectory. This encourages the learned dynamics to produce perceptually appropriate outputs when sampling is truncated early. We have added a clarifying subsection in the revised Method section that explicitly describes this per-step modulation and includes an ablation removing the embedding (showing degraded early-stop quality), which supports that the effect is on the trajectory dynamics rather than solely the final distribution. No additional consistency regularizers were used, as the joint training objective proved sufficient in our experiments.
  Revision: yes
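The training recipe described in the second response (uniform timestep sampling, a quality embedding concatenated with the timestep embedding, and the standard noise-prediction objective) can be sketched as follows. This is a toy illustration under assumptions, not the authors' code: the sinusoidal form of the quality embedding, the `1000.0` scale factor, the linear beta schedule, and every function name are hypothetical choices made so the example is self-contained.

```python
import math
import random


def sinusoidal_embedding(value, dim=8):
    """Standard sinusoidal embedding of a scalar into `dim` features."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(value * f) for f in freqs] + [math.cos(value * f) for f in freqs]


def linear_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    bars, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        bars.append(prod)
    return bars


def make_training_example(x0, alpha_bars, rng):
    """Draw (x_t, conditioning vector, target noise) for one training sample."""
    t = rng.randrange(len(alpha_bars))       # timestep sampled uniformly
    q = rng.random()                         # quality target in [0, 1), assumed range
    eps = [rng.gauss(0.0, 1.0) for _ in x0]  # noise the network should predict
    a = alpha_bars[t]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    # Concatenate timestep and quality embeddings into one conditioning vector.
    cond = sinusoidal_embedding(t) + sinusoidal_embedding(1000.0 * q)
    return x_t, cond, eps


def noise_mse(eps_pred, eps):
    """Standard noise-prediction objective for a single sample."""
    return sum((p - e) ** 2 for p, e in zip(eps_pred, eps)) / len(eps)
```

Because every (timestep, quality) pair contributes to the same loss, the conditioning is seen at all noise levels. Whether that alone shapes truncated trajectories is exactly the question the referee raises, and the sketch does not settle it.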
Circularity Check
No significant circularity; training procedure is independent of claimed outcome.
The paper describes a joint end-to-end training strategy that conditions the diffusion model on a bandwidth-derived quality embedding to enable adaptive early-stop sampling. This is presented as a standard conditioning approach with minimal architectural changes and no equations or derivations that reduce the adaptive trajectory claim to a fitted parameter, self-definition, or self-citation chain. The central premise relies on the model learning the desired behavior through the conditioning during training, which is an independent assumption rather than a reduction by construction. Experimental comparisons to naive early-stopping are external to the derivation itself. No load-bearing self-citations or ansatzes are invoked to force the result.
Axiom & Free-Parameter Ledger
free parameters (1)
- quality embedding parameters
axioms (1)
- domain assumption: Diffusion models can be effectively conditioned on auxiliary signals such as quality level to modulate the denoising trajectory.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We extend the reverse kernel to $p_\theta(x_{t-1} \mid x_t, H_{\text{target}})$ ... entropy embedding network $h = \psi_\eta(H_{\text{target}})$ ... hybrid modulation $g_l(t, H_{\text{target}}) = g(t) + W^{(l)} h$
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: $\mathcal{L}_{\text{entropy}} = \max(0, H_\phi(\hat{x}_0) - H_{\text{target}})$ ... adaptive sampling policy ... $\mathcal{L}_{\text{stop}} = \mathbb{E}[\mathrm{BCE}(y_t, p_t)]$
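The formulas in the two quoted paper passages can be written out directly. The sketch below is a plain-Python reading of them under stated assumptions: the quoted fragments are elided, so the interpretation of `H_phi` as an entropy estimate of the predicted clean image, of `y_t` as a binary stop label, and of `p_t` as a predicted stop probability is inferred from the notation rather than confirmed by the source. All function names are hypothetical.

```python
import math


def hybrid_modulation(g_t, W_l, h):
    """g_l(t, H_target) = g(t) + W^(l) h, with W_l given as a list of rows."""
    return [g + sum(w * hj for w, hj in zip(row, h)) for g, row in zip(g_t, W_l)]


def entropy_loss(h_estimate, h_target):
    """Hinge penalty: nonzero only when the estimated entropy exceeds the target."""
    return max(0.0, h_estimate - h_target)


def stop_loss(labels, probs, eps=1e-7):
    """Mean binary cross-entropy between stop labels y_t and predictions p_t."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp so both logarithms stay finite
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(labels)
```

The hinge form means that outputs already under the entropy budget incur no penalty, so the constraint only pushes in one direction.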
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Deep Light Pollution Removal in Night Cityscape Photographs
A deep learning method with an enhanced physical degradation model incorporating anisotropic light spread and hidden skyglow, trained via generative models and synthetic-real coupling, removes light pollution from nig...