EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods
6 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV (6)
citing papers explorer
- What Semantics Survive the Connector? Diagnosing VLM-to-DiT Alignment in Video Editing
  VLM-to-DiT alignment in video editing models acts as a semantic bottleneck that degrades fine-grained structural semantics, demonstrated via a new diagnostic dataset and protocol on relation-based edits.
- UniEditBench: A Unified and Cost-Effective Benchmark for Image and Video Editing via Distilled MLLMs
  UniEditBench unifies image and video editing evaluation with a nine-plus-eight operation taxonomy and cost-effective 4B/8B distilled MLLM evaluators that align with human judgments.
- Making Image Editing Easier via Adaptive Task Reformulation with Agentic Executions
  An MLLM agent reformulates image editing tasks into executable operation sequences to improve reliability on challenging cases across existing generative backbones.
- Redefining Quality Criteria and Distance-Aware Score Modeling for Image Editing Assessment
  DS-IEQA jointly learns evaluation criteria via feedback-driven prompt optimization and continuous score modeling via token-decoupled distance regression, ranking 4th in the 2026 NTIRE X-AIGC Quality Assessment Track 2 without extra training data.
- VDE Bench: Evaluating The Capability of Image Editing Models to Modify Visual Documents
  VDE Bench is a new human-annotated dataset and OCR-based evaluation framework for measuring image editing model performance on bilingual dense visual documents.
- ImgEdit: A Unified Image Editing Dataset and Benchmark
  ImgEdit supplies 1.2 million curated edit pairs and a three-part benchmark that let a VLM-based model outperform prior open-source editors on adherence, quality, and detail preservation.