Controlling vision-language models for multi-task image restoration

Controlling vision-language models for universal image restoration · 2023 · arXiv 2310.01018

9 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Degradation-Aware Adaptive Context Gating for Unified Image Restoration

cs.CV · 2026-05-02 · unverdicted · novelty 7.0

DACG-IR adds a lightweight degradation-aware module that generates prompts to adaptively gate attention temperature, output features, and spatial-channel fusion in an encoder-decoder network for unified image restoration.

PhySe-RPO: Physics and Semantics Guided Relative Policy Optimization for Diffusion-Based Surgical Smoke Removal

cs.AI · 2026-03-24 · unverdicted · novelty 7.0

PhySe-RPO enables diffusion-based surgical smoke removal by converting restoration into a stochastic policy optimized with physics consistency and CLIP semantic rewards under limited supervision.

Toward Generalizable Forgery Detection and Reasoning

cs.CV · 2025-03-27 · unverdicted · novelty 7.0

FakeReasoning is an MLLM-based framework for unified forgery detection and reasoning on AI-generated images, supported by the new MMFR-Dataset of 120K images and 378K annotations across 10 generators.

EvoIR-Agent: Self-Evolving Image Restoration Agentic System via Experience-Driven Learning

cs.CV · 2026-05-21 · unverdicted · novelty 6.0

EvoIR-Agent formulates experience components into a hierarchical pool with a self-evolving update mechanism to improve performance and efficiency of training-free MLLM image restoration agents over prior paradigms.

Degradation Frequency Curve: An Explicit Frequency-Quantified Representation for All-in-One Image Restoration

cs.CV · 2026-05-17 · unverdicted · novelty 6.0

The paper proposes the Degradation Frequency Curve (DFC) as an explicit spectral representation for quantifying degradations and develops a DFC-guided multi-scale restorer that achieves state-of-the-art performance on composite and real-world benchmarks.

Adapting Large VLMs with Iterative and Manual Instructions for Generative Low-light Enhancement

cs.CV · 2025-07-24 · conditional · novelty 6.0

VLM-IMI adapts VLMs with iterative and manual instructions plus a learnable fusion module to guide diffusion-based generative low-light image enhancement, outperforming prior methods in perceptual quality.

TPGDiff: Hierarchical Triple-Prior Guided Diffusion for Image Restoration

cs.CV · 2026-01-28 · unverdicted · novelty 5.0

TPGDiff introduces hierarchical triple-prior guidance in a diffusion network, placing degradation priors throughout, structural priors in shallow layers, and semantic priors in deep layers for improved all-in-one image restoration.

Q-Agent: Quality-Driven Chain-of-Thought Image Restoration Agent through Robust Multimodal Large Language Model

eess.IV · 2025-04-09 · unverdicted · novelty 5.0

Q-Agent uses CoT decomposition on a fine-tuned MLLM for multi-degradation perception, plus IQA-driven greedy selection of restoration algorithms, and claims better performance than All-in-One IR models.

SCRIPT: Scalable Diffusion Policy with Multi-stage Training for Language-driven Physics-based Humanoid Control

cs.GR · 2026-05-21

citing papers explorer

Degradation-Aware Adaptive Context Gating for Unified Image Restoration cs.CV · 2026-05-02 · unverdicted · none · ref 43
PhySe-RPO: Physics and Semantics Guided Relative Policy Optimization for Diffusion-Based Surgical Smoke Removal cs.AI · 2026-03-24 · unverdicted · none · ref 20
Toward Generalizable Forgery Detection and Reasoning cs.CV · 2025-03-27 · unverdicted · none · ref 72
EvoIR-Agent: Self-Evolving Image Restoration Agentic System via Experience-Driven Learning cs.CV · 2026-05-21 · unverdicted · none · ref 31
Degradation Frequency Curve: An Explicit Frequency-Quantified Representation for All-in-One Image Restoration cs.CV · 2026-05-17 · unverdicted · none · ref 21
Adapting Large VLMs with Iterative and Manual Instructions for Generative Low-light Enhancement cs.CV · 2025-07-24 · conditional · none · ref 38
TPGDiff: Hierarchical Triple-Prior Guided Diffusion for Image Restoration cs.CV · 2026-01-28 · unverdicted · none · ref 54
Q-Agent: Quality-Driven Chain-of-Thought Image Restoration Agent through Robust Multimodal Large Language Model eess.IV · 2025-04-09 · unverdicted · none · ref 20
SCRIPT: Scalable Diffusion Policy with Multi-stage Training for Language-driven Physics-based Humanoid Control cs.GR · 2026-05-21 · unreviewed · ref 57
