EditRefiner uses a perception-reasoning-action-evaluation agent loop and the EditFHF-15K human feedback dataset to refine text-guided image edits more accurately than prior methods.
arXiv preprint arXiv:2410.17809 (2024) 14
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 6representative citing papers
ForenAgent lets MLLMs create and iteratively improve low-level Python tools for image forgery detection via a two-stage training pipeline and a new 100k-image benchmark dataset.
LumiVideo deploys an LLM-based agent with RAG and Tree of Thoughts to generate ASC-CDL parameters and 3D LUTs for automatic cinematic color grading from raw log video, approaching expert quality.
Q-DeepSight proposes a think-with-image multimodal CoT framework trained via RL with perceptual curriculum rewards and evidence gradient filtering to achieve SOTA IQA performance and enable training-free perceptual refinement in image generation.
TPGDiff introduces hierarchical triple-prior guidance in a diffusion network, placing degradation priors throughout, structural priors in shallow layers, and semantic priors in deep layers for improved all-in-one image restoration.
A survey categorizing weather restoration methods for smart transportation into traditional priors, CNNs, transformers, diffusion models, and VLMs, with discussions on datasets, night-time challenges, and future directions.
citing papers explorer
-
EditRefiner: A Human-Aligned Agentic Framework for Image Editing Refinement
EditRefiner uses a perception-reasoning-action-evaluation agent loop and the EditFHF-15K human feedback dataset to refine text-guided image edits more accurately than prior methods.
-
Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection
ForenAgent lets MLLMs create and iteratively improve low-level Python tools for image forgery detection via a two-stage training pipeline and a new 100k-image benchmark dataset.
-
LumiVideo: An Intelligent Agentic System for Video Color Grading
LumiVideo deploys an LLM-based agent with RAG and Tree of Thoughts to generate ASC-CDL parameters and 3D LUTs for automatic cinematic color grading from raw log video, approaching expert quality.
-
Q-DeepSight: Incentivizing Thinking with Images for Image Quality Assessment and Refinement
Q-DeepSight proposes a think-with-image multimodal CoT framework trained via RL with perceptual curriculum rewards and evidence gradient filtering to achieve SOTA IQA performance and enable training-free perceptual refinement in image generation.
-
TPGDiff: Hierarchical Triple-Prior Guided Diffusion for Image Restoration
TPGDiff introduces hierarchical triple-prior guidance in a diffusion network, placing degradation priors throughout, structural priors in shallow layers, and semantic priors in deep layers for improved all-in-one image restoration.
-
Clear Roads, Clear Vision: Advancements in Multi-Weather Restoration for Smart Transportation
A survey categorizing weather restoration methods for smart transportation into traditional priors, CNNs, transformers, diffusion models, and VLMs, with discussions on datasets, night-time challenges, and future directions.