MIGE: A unified framework for multimodal instruction-based image generation and editing
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2025 (2)
verdicts: UNVERDICTED (2)
representative citing papers
OmniGen2 introduces a unified generative model with two distinct decoding pathways and a decoupled image tokenizer that achieves competitive results on text-to-image and editing benchmarks plus state-of-the-art consistency among open-source models on the new OmniContext benchmark.
citing papers explorer
- Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning
Presents Reason50K dataset and ReasonBrain framework for hypothetical instruction-based image editing that requires physical, temporal, causal, and story reasoning.
- OmniGen2: Towards Instruction-Aligned Multimodal Generation
OmniGen2 introduces a unified generative model with two distinct decoding pathways and a decoupled image tokenizer that achieves competitive results on text-to-image and editing benchmarks plus state-of-the-art consistency among open-source models on the new OmniContext benchmark.