pith. sign in

hub

ACE++: Instruction-Based Image Creation and Editing via Context- Aware Content Filling

12 Pith papers cite this work. Polarity classification is still indexing.

12 Pith papers citing it

hub tools

citation-role summary

background 4

citation-polarity summary

fields

cs.CV 12

years

2026 7 2025 5

verdicts

UNVERDICTED 12

roles

background 4

polarities

background 4

clear filters

representative citing papers

VACE: All-in-One Video Creation and Editing

cs.CV · 2025-03-10 · unverdicted · novelty 7.0

VACE unifies reference-to-video generation, video-to-video editing, and masked video-to-video editing in one Diffusion Transformer framework using a Video Condition Unit for inputs and a Context Adapter for task injection.

OmniGen2: Towards Instruction-Aligned Multimodal Generation

cs.CV · 2025-06-23 · unverdicted · novelty 5.0

OmniGen2 introduces a unified generative model with two distinct decoding pathways and a decoupled image tokenizer that achieves competitive results on text-to-image and editing benchmarks plus state-of-the-art consistency among open-source models on the new OmniContext benchmark.

Wan: Open and Advanced Large-Scale Video Generative Models

cs.CV · 2025-03-26 · unverdicted · novelty 5.0

Wan releases open 1.3B and 14B video diffusion models claiming superior performance over open-source and commercial baselines across multiple tasks with consumer-grade efficiency.

Step1X-Edit: A Practical Framework for General Image Editing

cs.CV · 2025-04-24 · unverdicted · novelty 4.0

Step1X-Edit integrates a multimodal LLM with a diffusion decoder, trained on a custom high-quality dataset, to deliver image editing performance that surpasses open-source baselines and approaches proprietary models on the new GEdit-Bench.

citing papers explorer

Showing 5 of 5 citing papers after filters.

  • In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer cs.CV · 2025-04-29 · unverdicted · none · ref 7

    ICEdit achieves state-of-the-art instructional image editing in Diffusion Transformers via in-context generation, requiring only 0.1% of prior training data and 1% trainable parameters.

  • VACE: All-in-One Video Creation and Editing cs.CV · 2025-03-10 · unverdicted · none · ref 41

    VACE unifies reference-to-video generation, video-to-video editing, and masked video-to-video editing in one Diffusion Transformer framework using a Video Condition Unit for inputs and a Context Adapter for task injection.

  • OmniGen2: Towards Instruction-Aligned Multimodal Generation cs.CV · 2025-06-23 · unverdicted · none · ref 48

    OmniGen2 introduces a unified generative model with two distinct decoding pathways and a decoupled image tokenizer that achieves competitive results on text-to-image and editing benchmarks plus state-of-the-art consistency among open-source models on the new OmniContext benchmark.

  • Wan: Open and Advanced Large-Scale Video Generative Models cs.CV · 2025-03-26 · unverdicted · none · ref 35

    Wan releases open 1.3B and 14B video diffusion models claiming superior performance over open-source and commercial baselines across multiple tasks with consumer-grade efficiency.

  • Step1X-Edit: A Practical Framework for General Image Editing cs.CV · 2025-04-24 · unverdicted · none · ref 34

    Step1X-Edit integrates a multimodal LLM with a diffusion decoder, trained on a custom high-quality dataset, to deliver image editing performance that surpasses open-source baselines and approaches proprietary models on the new GEdit-Bench.