Exploring CLIP for Assessing the Look and Feel of Images

Jianyi Wang, Kelvin CK Chan, Chen Change Loy · 2023

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

SenseBench: A Benchmark for Remote Sensing Low-Level Visual Perception and Description in Large Vision-Language Models

cs.CV · 2026-05-11 · unverdicted · novelty 8.0

SenseBench is the first physics-based benchmark with 10K+ instances and dual protocols to evaluate VLMs on remote sensing low-level perception and diagnostic description, revealing domain bias and specific failure modes.

Bringing Multimodal Large Language Models to Infrared-Visible Image Fusion Quality Assessment

cs.CV · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

FuScore uses MLLMs to output continuous quality scores for IVIF images, constructs per-image soft labels from four sub-dimensions, and applies a tripartite objective with Thurstone fidelity to achieve higher correlation with human preferences than prior metrics.

Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

cs.CV · 2025-05-24 · unverdicted · novelty 7.0

Chain-of-Zoom factorizes extreme super-resolution into an autoregressive sequence of intermediate scales using a reused backbone model plus GRPO-tuned multi-scale VLM prompts.

SEGA: Spectral-Energy Guided Attention for Resolution Extrapolation in Diffusion Transformers

cs.CV · 2026-05-21 · unverdicted · novelty 6.0

SEGA adaptively scales RoPE attention components using spectral-energy guidance from the latent to improve structural coherence and fine details in high-resolution DiT synthesis.

DiRotQ: Rotation-Aware Quantization for 4-bit Diffusion Transformers

cs.CV · 2026-05-16 · unverdicted · novelty 6.0

DiRotQ uses PCA-based rotation-aware activation quantization combined with GPTQ to achieve better FID and PSNR in 4-bit diffusion transformers than prior methods like SVDQuant.

OPERA: An Agent for Image Restoration with End-to-End Joint Planning-Execution Optimization

cs.CV · 2026-05-21 · unverdicted · novelty 5.0

OPERA jointly optimizes restoration planning via RL over tool compositions and execution via agent-guided co-training of tools, claiming consistent gains over all-in-one models and prior agent methods on multi-degradation benchmarks.

Adding Thermal Awareness to Visual Systems in Real-Time via Distilled Diffusion Models

cs.CV · 2026-05-07 · unverdicted · novelty 5.0

FusionProxy is a distilled diffusion-based fusion module that adds thermal awareness to RGB vision systems in real time as an independent plug-and-play component.

EvoIR-Agent: Self-Evolving Image Restoration Agentic System via Experience-Driven Learning

cs.CV · 2026-05-21

citing papers explorer

SenseBench: A Benchmark for Remote Sensing Low-Level Visual Perception and Description in Large Vision-Language Models
cs.CV · 2026-05-11 · unverdicted · none · ref 7

Bringing Multimodal Large Language Models to Infrared-Visible Image Fusion Quality Assessment
cs.CV · 2026-05-07 · unverdicted · none · ref 21 · 2 links

Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
cs.CV · 2025-05-24 · unverdicted · none · ref 42

SEGA: Spectral-Energy Guided Attention for Resolution Extrapolation in Diffusion Transformers
cs.CV · 2026-05-21 · unverdicted · none · ref 41

DiRotQ: Rotation-Aware Quantization for 4-bit Diffusion Transformers
cs.CV · 2026-05-16 · unverdicted · none · ref 69

OPERA: An Agent for Image Restoration with End-to-End Joint Planning-Execution Optimization
cs.CV · 2026-05-21 · unverdicted · none · ref 33

Adding Thermal Awareness to Visual Systems in Real-Time via Distilled Diffusion Models
cs.CV · 2026-05-07 · unverdicted · none · ref 28

EvoIR-Agent: Self-Evolving Image Restoration Agentic System via Experience-Driven Learning
cs.CV · 2026-05-21 · unreviewed · ref 56
