{"total":20,"items":[{"citing_arxiv_id":"2605.21573","ref_index":61,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models","primary_cat":"cs.CV","submitted_at":"2026-05-20T17:59:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Lens is a 3.8B-parameter text-to-image model that reaches competitive or superior performance to >6B-parameter systems using 19.3% of the training compute of Z-Image through a densely captioned 800M dataset, multi-resolution batching, semantic VAE, strong language encoder, RL fine-tuning, and 4-step","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.20780","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning to Think in Physics: Breaking Shortcut Learning in Scientific Diffusion via Representation Alignment","primary_cat":"cs.LG","submitted_at":"2026-05-20T06:22:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"REPA-P aligns intermediate representations in diffusion models with physical states using first-principles PDE residuals to accelerate convergence and boost out-of-distribution robustness on PDE tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17546","ref_index":25,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Accelerating Redshift-Conditioned Galaxy Image Synthesis with One-step Generative Modeling","primary_cat":"astro-ph.IM","submitted_at":"2026-05-17T17:00:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"One-step pixel-MeanFlow models recover key galaxy morphology statistics at 
orders-of-magnitude lower computational cost than standard DDPM sampling while remaining weaker on fine-grained structure.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.07907","ref_index":4,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Consistency Regularised Gradient Flows for Inverse Problems","primary_cat":"stat.ML","submitted_at":"2026-05-08T15:45:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A consistency-regularized Euclidean-Wasserstein-2 gradient flow performs joint posterior sampling and prompt optimization in latent space for efficient low-NFE inverse problem solving with diffusion models.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"minimises a function f : Rd → R in the Euclidean space, we seek an analogue of this ﬂow applicable to F[c,µ], which features gradients obtained by endowing C with the Euclidean geometry and P2(Z) with the Wasserstein-2 one (cf. Kuntz et al. [2023] and Appendix A.1): { ˙cτ = −∇cR[cτ,µτ] ˙µτ = −gradW2 R[cτ,µτ] − gradW2 L[µτ]. (10) (11) where gradW2F is the Wasserstein-2 ( W2) gradient of a functional F [Figalli and Glaudo, 2021, Chapter 4] deﬁned via the ﬁrst variation [ Santambrogio, 2015, Section 7.2]: gradW2 F[µ](z) = −∇ · (µ∇z(δF/δµ)(z)). Along the ﬂow (cτ,µτ), the value of F[cτ,µτ] is non-increasing and we recover the optimal solution (c⋆,µ⋆) at stationarity [ Kuntz et al. , 2023, Theorem 2] (cf. Appendix A.2 for the proof.) Theorem 1 (Optimality Conditions) . 
Under the loss (9), the gradient ∇F[c,µ] :="},{"citing_arxiv_id":"2604.26244","ref_index":16,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MetaSR: Content-Adaptive Metadata Orchestration for Generative Super-Resolution","primary_cat":"cs.CV","submitted_at":"2026-04-29T02:58:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MetaSR adaptively orchestrates metadata in a DiT-based generative SR model to deliver up to 1 dB PSNR gains and 50% bitrate savings across diverse content and degradations.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Training-free acceleration of multi-step diffusion sampling. Standard diffusion sampling is slow due to many sequential denoising steps. A large class of methods accelerates inference without retraining by changing the sampler. DDIM introduces non-Markovian sampling trajectories that reduce steps [26]. DPM-Solver and UniPC propose higher-order solvers/predictor-corrector frameworks for fast sampling [16,44]. These techniques are complementary to MetaSR: they speed up diffusion but generally remain multi-step and do not address the draft's sender-side metadata transmission problem. Distillation and consistency training for few-step / one-step generation. Another direction trains models to intrinsically generate in very few steps. 
Progressive Distillation iteratively distills multi-step teachers into fewer-step students [22]."},{"citing_arxiv_id":"2604.24622","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"CF-VLA: Efficient Coarse-to-Fine Action Generation for Vision-Language-Action Policies","primary_cat":"cs.CV","submitted_at":"2026-04-27T15:51:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"CF-VLA uses a coarse initialization over endpoint velocity followed by single-step refinement to achieve strong performance with low inference steps on CALVIN, LIBERO, and real-robot tasks.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"NinA (MLP) [35] NeurIPS WS, 2025 - 1 87.898.290.292.890.9 EveryDayVLA [6] ICRA subm., 2026 Plan+7B 196.895.691.082.091.4 NFE = 8/10 methods DreamVLA [47] NeurIPS, 2025 WM 10 97.5 94.0 89.5 89.5 92.6 𝜋 ∗ 0.5 [11] CoRL, 2025 3B 10 98.4 97.4 96.2 90.6 95.7 InstructVLA [41] ICLR, 2026 - 10 97.3 99.6 96.5 89.8 95.8 MemoryVLA [34] ICLR, 2026 7B 10 98.4 98.4 96.4 93.4 96.5 Flower VLA [30] CoRL, 2025 - 8 97.5 99.1 96.1 94.9 96.9 𝜋0.5 [11] CoRL, 2025 3B 1098.898.2 98.0 92.4 96.9 X-VLA [49] ICLR, 2026 - 10 98.2 98.6 97.897.698.1 Cosmos Policy [16] ICLR, 2026 Plan+WM 10 98.1100.0 98.2 97.6 98.5 NFE = 20/100 methods Dita [8] ICCV, 2025 - 20 84.2 96.3 85.4 63.8 82.4 CronusVLA [20] AAAI, 2026 7B 10097.3 99.6 96.9 94.0 97.0 NFE = 2 methods"},{"citing_arxiv_id":"2604.23536","ref_index":25,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models","primary_cat":"cs.CV","submitted_at":"2026-04-26T05:16:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware 
semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a directional derivative penalty.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Sora: A review on background, technology, limitations, and opportunities of large vision models, 2024. URL https://arxiv.org/abs/2402.17177. [24] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. Dpm-solver: A fast ode solver for diffusion probabilistic model sampling in around 10 steps, 2022. URL https://arxiv.org/abs/2206.00927. [25] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. Dpm-solver++: Fast solver for guided sampling of diffusion probabilistic models. Machine Intelligence Research, 22(4):730-751, June 2025. ISSN 2731-5398. doi: 10.1007/s11633-025-1562-4. URL http://dx.doi.org/10.1007/s11633-025-1562-4. [26] Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, and Luc Van"},{"citing_arxiv_id":"2602.13357","ref_index":14,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AdaCorrection: Adaptive Offset Cache Correction for Accurate Diffusion Transformers","primary_cat":"cs.CV","submitted_at":"2026-02-13T08:11:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AdaCorrection adaptively corrects offset caches in DiT inference via on-the-fly spatio-temporal validity checks to maintain near-original FID with moderate acceleration.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.02340","ref_index":20,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language 
Models","primary_cat":"cs.LG","submitted_at":"2026-02-04T13:04:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Early and late denoising steps in masked diffusion LMs are robust to smaller-model replacement, enabling 17% FLOPs reduction with modest generative quality loss.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2511.00124","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Cross-fluctuation phase transitions reveal sampling dynamics in diffusion models","primary_cat":"cs.LG","submitted_at":"2025-10-31T09:40:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Cross-fluctuations detect discrete phase transitions during diffusion sampling trajectories, yielding efficiency gains in generation and zero-shot tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.16344","ref_index":41,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Diff-ANO: Towards Fast High-Resolution Ultrasound Computed Tomography via Conditional Consistency Models and Adjoint Neural Operators","primary_cat":"math.NA","submitted_at":"2025-07-22T08:24:22+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Diff-ANO uses conditional consistency models and adjoint neural operator surrogates to enable fast, high-quality USCT reconstructions under sparse and partial views by replacing slow PDE solvers and enabling few-step sampling.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2504.03468","ref_index":23,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"D-Garment: Physically Grounded Latent 
Diffusion for Dynamic Garment Deformations","primary_cat":"cs.CV","submitted_at":"2025-04-04T14:18:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"D-Garment is a template-specific latent diffusion model that generates dynamic 3D garment deformations conditioned on body shape, motion, and physical material properties using training data from a physics-based simulator.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2412.15689","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization","primary_cat":"cs.CV","submitted_at":"2024-12-20T09:07:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DOLLAR combines variational score and consistency distillation for few-step video generation plus latent reward optimization, reporting 82.57 VBench score and up to 278x speedup over the teacher diffusion model for 128-frame 10-second videos.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2311.04938","ref_index":13,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Improved DDIM Sampling with Moment Matching Gaussian Mixtures","primary_cat":"cs.CV","submitted_at":"2023-11-08T00:24:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Moment-matched GMM kernels in DDIM yield lower FID and higher IS than Gaussian kernels at small sampling steps on CelebA-HQ, FFHQ, ImageNet, and Stable Diffusion 
tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2310.04378","ref_index":69,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference","primary_cat":"cs.CV","submitted_at":"2023-10-06T17:11:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2303.01469","ref_index":38,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Consistency Models","primary_cat":"cs.LG","submitted_at":"2023-03-02T18:30:16+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2211.15089","ref_index":56,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Continuous diffusion for categorical data","primary_cat":"cs.CL","submitted_at":"2022-11-28T06:08:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"The paper proposes CDCD, a continuous-time and continuous-space diffusion framework for categorical data, and reports results on language modeling 
tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2209.14687","ref_index":147,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Diffusion Posterior Sampling for General Noisy Inverse Problems","primary_cat":"stat.ML","submitted_at":"2022-09-29T11:12:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Diffusion models solve noisy (non)linear inverse problems via approximated posterior sampling that blends diffusion steps with manifold gradients without strict consistency projection.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2209.03003","ref_index":46,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow","primary_cat":"cs.LG","submitted_at":"2022-09-07T08:59:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"improve the likelihood, sampling speed, and generation quality. [29] systematic exams the design space of diffusion generative models with empirical studies and identifies a number of training and inference recipes for better generative quality with fewer sampling steps. [94] proposes a diffusion exponential integrator sampler for fast sampling of diffusion models. [46] provides a customized high order solver for PF-ODEs. [5] provides an analytic estimate of the optimal diffusion coefficient. • Combination with other methods. 
Another direction is to speed up diffusion models by combining them with GANs and other generative models. DDPM Distillation [47] accelerates the inference speed by distilling the trajectories of a diffusion model into a series of conditional GANs."},{"citing_arxiv_id":"2208.06193","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning","primary_cat":"cs.LG","submitted_at":"2022-08-12T09:54:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Diffusion-QL uses conditional diffusion models as expressive policies in offline RL by coupling behavior cloning with Q-value maximization, achieving SOTA on most D4RL tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}