arXiv preprint arXiv:2505.16707 (2025)
5 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (5)
verdicts: UNVERDICTED (5)
citing papers explorer
- EditRefiner: A Human-Aligned Agentic Framework for Image Editing Refinement
  EditRefiner uses a perception-reasoning-action-evaluation agent loop and the EditFHF-15K human feedback dataset to refine text-guided image edits more accurately than prior methods.
- CAMEO: A Conditional and Quality-Aware Multi-Agent Image Editing Orchestrator
  CAMEO uses coordinated agents for planning, prompting, generation, and quality feedback to achieve higher structural reliability in conditional image editing than single-step models.
- DLEBench: Evaluating Small-scale Object Editing Ability for Instruction-based Image Editing Model
  DLEBench is the first benchmark for small-scale object editing in instruction-based image editing models, using 1889 samples, seven instruction types, and a dual-mode evaluation protocol to reveal performance gaps in 10 tested models.
- DDA-Thinker: Decoupled Dual-Atomic Reinforcement Learning for Reasoning-Driven Image Editing
  DDA-Thinker decouples planning from generation and applies dual-atomic RL with checklist-based rewards to boost reasoning in image editing, yielding competitive results on RISE-Bench and KRIS-Bench.
- Emerging Properties in Unified Multimodal Pretraining
  BAGEL is a unified decoder-only model that develops emerging complex multimodal reasoning abilities after pretraining on large-scale interleaved data and outperforms prior open-source unified models.