Do-Undo Bench is a new evaluation task and dataset that forces models to simulate forward action effects and then undo them to measure genuine action understanding in image generation.
PICABench: How far are we from physically realistic image editing?
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
representative citing papers
LithoBench is a new multi-level benchmark showing that existing large multimodal models have substantial limitations in geological semantic understanding for remote sensing lithology interpretation.
An MLLM agent reformulates image editing tasks into executable operation sequences to improve reliability on challenging cases across existing generative backbones.
DataEvolver introduces a reusable framework with generation-time self-correction and validation-time self-expansion loops that improves visual datasets, shown to outperform baselines on an object-rotation task.
citing papers explorer
-
Do-Undo Bench: Reversibility for Action Understanding in Image Generation
Do-Undo Bench is a new evaluation task and dataset that forces models to simulate forward action effects and then undo them to measure genuine action understanding in image generation.
-
LithoBench: Benchmarking Large Multimodal Models for Remote-Sensing Lithology Interpretation
LithoBench is a new multi-level benchmark showing that existing large multimodal models have substantial limitations in geological semantic understanding for remote sensing lithology interpretation.
-
Making Image Editing Easier via Adaptive Task Reformulation with Agentic Executions
An MLLM agent reformulates image editing tasks into executable operation sequences to improve reliability on challenging cases across existing generative backbones.
-
DataEvolver: Let Your Data Build and Improve Itself via Goal-Driven Loop Agents
DataEvolver introduces a reusable framework with generation-time self-correction and validation-time self-expansion loops that improves visual datasets, shown to outperform baselines on an object-rotation task.