{"paper":{"title":"HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"HiDream-I1 deploys a 17B-parameter sparse Diffusion Transformer that delivers state-of-the-art images in seconds.","cross_cats":["cs.MM"],"primary_cat":"cs.CV","authors_text":"Bo Zhao, Fengbin Gao, Fuchen Long, Jianzhuang Pan, Jingwen Chen, Kai Yu, Peihan Xu, Qi Cai, Rui Tian, Siyu Wang, Tao Mei, Ting Yao, Wenxuan Chen, Yang Chen, Yehao Li, Yiheng Zhang, Yimeng Wang, Yingwei Pan, Yi Peng, Zhaofan Qiu, Zijian Gong, Ziwei Feng","submitted_at":"2025-05-28T17:59:15Z","abstract_excerpt":"Recent advancements in image generative foundation models have prioritized quality improvements but often at the cost of increased computational complexity and inference latency. To address this critical trade-off, we introduce HiDream-I1, a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds. HiDream-I1 is constructed with a new sparse Diffusion Transformer (DiT) structure. Specifically, it starts with a dual-stream decoupled design of sparse DiT with dynamic Mixture-of-Experts (MoE) architecture, in whic"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"HiDream-I1 ... achieves state-of-the-art image generation quality within seconds.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The dual-stream decoupled sparse DiT with dynamic MoE architecture delivers the claimed quality and speed without hidden trade-offs in training cost or generalization that are not visible in the abstract.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"HiDream-I1 introduces a sparse DiT architecture with dual-stream processing and MoE for efficient state-of-the-art text-to-image generation, plus extensions to editing and an interactive agent.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"HiDream-I1 deploys a 17B-parameter sparse Diffusion Transformer that delivers state-of-the-art images in seconds.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"309e47e6ce805622a5ce7e8da766d1b120d9eceeed8cdc9c96d4d2e483a64fde"},"source":{"id":"2505.22705","kind":"arxiv","version":1},"verdict":{"id":"cfd22377-816a-4c29-8923-7a85dc9fb564","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T17:09:22.780761Z","strongest_claim":"HiDream-I1 ... achieves state-of-the-art image generation quality within seconds.","one_line_summary":"HiDream-I1 introduces a sparse DiT architecture with dual-stream processing and MoE for efficient state-of-the-art text-to-image generation, plus extensions to editing and an interactive agent.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The dual-stream decoupled sparse DiT with dynamic MoE architecture delivers the claimed quality and speed without hidden trade-offs in training cost or generalization that are not visible in the abstract.","pith_extraction_headline":"HiDream-I1 deploys a 17B-parameter sparse Diffusion Transformer that delivers state-of-the-art images in seconds."},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"45d4632ed6860caff02a1ed7d21c204cdeed86ada1f3f4e2aced70ad40f52e45"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}