PICABench: How Far Are We from Physically Realistic Image Editing?

Yuandong Pu, Le Zhuo, Songhao Han, Jinbo Xing, Kaiwen Zhu, Shuo Cao, Bin Fu, Si Liu, Hongsheng Li, Yu Qiao, Wenlong Zhang, Xi Chen, Yihao Liu · 2025 · arXiv 2510.17681

7 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

baseline: 1 · method: 1

citation-polarity summary

baseline: 1 · use method: 1

representative citing papers

Do Image Editing Models Understand Lighting?

cs.CV · 2026-06-25 · unverdicted · novelty 7.0

New 3DLP benchmark with real-world 1K HDR pairs shows state-of-the-art image editing models vary in physical lighting consistency, with best models close to reality but error-prone in low-light regions.

Editor's Choice: Evaluating Abstract Intent in Image Editing through Atomic Entity Analysis

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

Presents Entity-Rubrics and AbstractEdit benchmark to measure image editing models on abstract intent, finding standard models struggle to balance edit intent with image preservation.

Do-Undo Bench: Reversibility for Action Understanding in Image Generation

cs.CV · 2025-12-15 · unverdicted · novelty 7.0

Do-Undo Bench is a new evaluation task and dataset that forces models to simulate forward action effects and then undo them to measure genuine action understanding in image generation.

Taming I2V models for Image HOI Editing: A Cognitive Benchmark and Agentic Self-Correcting Framework

cs.CV · 2026-06-17 · unverdicted · novelty 6.0

Introduces HOI-Edit benchmark with HOI-Eval metric and SCPE self-correcting framework leveraging I2V models for competitive HOI image editing performance.

LithoBench: Benchmarking Large Multimodal Models for Remote-Sensing Lithology Interpretation

cs.CV · 2026-05-08 · conditional · novelty 6.0

LithoBench is a new multi-level benchmark showing that existing large multimodal models have substantial limitations in geological semantic understanding for remote sensing lithology interpretation.

DataEvolver: Let Your Data Build and Improve Itself via Goal-Driven Loop Agents

cs.AI · 2026-05-03 · unverdicted · novelty 5.0

DataEvolver introduces a reusable framework with generation-time self-correction and validation-time self-expansion loops that improves visual datasets, shown to outperform baselines on an object-rotation task.

Making Image Editing Easier via Adaptive Task Reformulation with Agentic Executions

cs.CV · 2026-04-17

citing papers explorer

Showing 1 of 1 citing paper after filters.

LithoBench: Benchmarking Large Multimodal Models for Remote-Sensing Lithology Interpretation

cs.CV · 2026-05-08 · conditional · none · ref 38
