Seeing the Poem: Image-Semantic Detection of AI-Generated Modern Chinese Poetry with MLLMs
Pith reviewed 2026-05-22 05:38 UTC · model grok-4.3
The pith
Adding images that reflect poem content improves LLM detection of AI-generated modern Chinese poetry
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By incorporating images that reflect the content of the poetry, the image-semantic guided method allows LLMs to integrate complementary information on meaning, imagery, and feeling, leading to more accurate detection of AI-generated modern Chinese poetry compared to text-only methods.
What carries the argument
Image-semantic guided poetry detection method that forms complementary judgments from poem text and matching images
If this is right
- LLM detectors that use the image-semantic method outperform the same detectors run on plain text, across data generated by multiple LLMs
- The Gemini-based detector reaches a state-of-the-art Macro-F1 of 85.65%
- The method surpasses the best-performing traditional detector, RoBERTa
- Performance gains are observed across different detector LLMs
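The headline figure above is a Macro-F1 score, the unweighted mean of the per-class F1 scores. As a minimal sketch of how that metric is computed, assuming two string labels ("human" and "ai") and toy predictions invented purely for illustration (none of this data comes from the paper):

```python
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the unweighted mean of per-class F1 scores."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels, invented for illustration only.
y_true = ["human", "human", "human", "ai", "ai", "ai"]
y_pred = ["human", "human", "ai", "ai", "ai", "ai"]
score = macro_f1(y_true, y_pred)
```

Because every class counts equally, Macro-F1 penalizes a detector that does well on one class only, which is why it is a common choice when human-written and AI-generated poems may not be balanced.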
Where Pith is reading between the lines
- This technique could extend to detecting AI content in other image-rich creative fields such as visual art descriptions or song lyrics.
- Poetry detection might benefit from always generating an illustrative image as a first step before analysis.
- Future detectors may need to account for how image generation models themselves introduce patterns that could be exploited or masked.
Load-bearing premise
Images can be generated or selected to match the poetry's meaning, imagery, and feeling closely enough to aid detection without introducing their own biases or artifacts.
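One way to probe this premise is a mismatched-image control: if the detector scores about the same when each poem is shown another poem's image, the gain is unlikely to come from semantic fit. The sketch below only builds such a pairing; it is an assumption-laden illustration, not the authors' procedure, and the function name `mismatched_pairs` and the poem identifiers are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def mismatched_pairs(poem_ids: Sequence[str], seed: int = 0) -> List[Tuple[str, str]]:
    """Pair each poem with the image that belongs to a different poem.

    Returns (poem_id, image_source_id) tuples. Shuffling the ids and then
    shifting the shuffled order by one position guarantees that no poem
    keeps its own image, while every image is still used exactly once.
    """
    if len(poem_ids) < 2:
        raise ValueError("need at least two poems to mismatch images")
    if len(set(poem_ids)) != len(poem_ids):
        raise ValueError("poem ids must be unique")
    order = list(poem_ids)
    random.Random(seed).shuffle(order)
    shifted = order[1:] + order[:1]
    return list(zip(order, shifted))


# Hypothetical identifiers, for illustration only.
pairs = mismatched_pairs(["p1", "p2", "p3", "p4"])
```

Swapping existing images keeps the image generator and its low-level artifacts fixed, so any drop in performance relative to matched pairs points toward the semantic content rather than the rendering style.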
What would settle it
Running the image-semantic detector and a plain-text detector on a new collection of human and AI-written modern Chinese poems and finding no accuracy advantage for the image version.
Original abstract
Previous detection studies have shown that LLMs cannot be effectively used as detectors, but these studies have not addressed modern Chinese poetry. Moreover, no relevant research has explored the performance of LLMs in detecting modern Chinese poetry. This paper evaluates and enhances the performance of LLMs as detectors for modern Chinese poetry, and proposes an image-semantic guided poetry detection method. Compared with traditional detection approaches, our method innovatively incorporates images that reflect the content of the poetry. Through example-driven approaches, our method effectively integrates information such as meaning, imagery, and feeling from the image, then forms a complementary judgment with the poem text. Experimental results demonstrate that the LLM detectors based on our method outperform baseline detectors based on plain text, and even surpass the best-performing traditional detector, RoBERTa. The Gemini detector using our method achieves a Macro-F1 score of 85.65%, reaching the state-of-the-art level. The performance improvements of different LLM detectors on multiple LLMs-generated data prove the effectiveness of our method.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an image-semantic guided detection method for AI-generated modern Chinese poetry that augments LLM/MLLM prompts with images reflecting poem content (meaning, imagery, feeling). It reports that this multimodal approach yields higher detection performance than text-only LLM baselines and surpasses the RoBERTa baseline, with the Gemini-based detector reaching 85.65% Macro-F1.
Significance. If the performance gain is shown to arise from genuine semantic complementarity rather than low-level image artifacts, the work would provide a concrete multimodal technique for detecting generated creative text and could inform future detectors that exploit imagery in poetry and similar domains.
major comments (2)
- [Method / Experimental Setup] The central experimental claim (abstract and results) that images supply complementary semantic information rests on an unspecified image generation or selection process. No details are given on the text-to-image model, prompt construction, or any control condition that would isolate semantic content from correlated artifacts (texture inconsistencies, prompt leakage). This directly affects interpretability of the 85.65% Macro-F1 and the outperformance over RoBERTa.
- [Experiments] Dataset construction is described only at a high level; the manuscript does not report how human-written vs. LLM-generated poems were collected, balanced, or split, nor any statistical significance tests or error analysis that would substantiate the cross-model performance gains.
minor comments (2)
- [Method] Notation for the image-semantic fusion step could be clarified with a short pseudocode or diagram to show exactly how image features are combined with text in the MLLM prompt.
- [Abstract / Method] The abstract states that the method 'forms a complementary judgment'; a concrete example of a prompt template and the resulting MLLM output would help readers reproduce the integration step.
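To make the two minor comments concrete, the following is a hypothetical sketch of how a poem and its image might be packed into one multimodal prompt. The instruction wording, the `build_request` helper, and the part schema are all assumptions made for illustration; the page does not reproduce the paper's actual template, and no specific provider API is implied.

```python
import base64
from typing import Dict, List

# Hypothetical instruction text; the paper's real template is not shown here.
INSTRUCTION = (
    "You are given a modern Chinese poem and an image meant to reflect it. "
    "Describe the meaning, imagery, and feeling of the image, compare them "
    "with the poem, then answer 'human' or 'ai'."
)


def build_request(poem: str, image_bytes: bytes,
                  mime_type: str = "image/png") -> List[Dict[str, str]]:
    """Assemble a provider-neutral list of prompt parts (assumed schema)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {"type": "text", "content": INSTRUCTION},
        {"type": "text", "content": "Poem:\n" + poem},
        {"type": "image", "mime_type": mime_type, "content": encoded},
    ]


# Placeholder inputs; real use would pass the poem text and image file bytes.
parts = build_request("(poem text here)", b"placeholder image bytes")
```

The abstract describes the method as example-driven, so a fuller version would presumably prepend worked examples in the same part format before the poem under test; that step is left out here because its details are not given.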
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which has helped us identify areas where the manuscript requires greater clarity and detail. We address each major comment below and have prepared revisions to strengthen the experimental description and reproducibility.
- Referee: [Method / Experimental Setup] The central experimental claim (abstract and results) that images supply complementary semantic information rests on an unspecified image generation or selection process. No details are given on the text-to-image model, prompt construction, or any control condition that would isolate semantic content from correlated artifacts (texture inconsistencies, prompt leakage). This directly affects interpretability of the 85.65% Macro-F1 and the outperformance over RoBERTa.
  Authors: We agree that the original manuscript described the image-augmentation process at too high a level, limiting readers' ability to assess whether gains derive from semantic content or from low-level artifacts. In the revised version we have added a dedicated subsection that specifies the text-to-image model, the exact prompt templates used to translate poem semantics into images, and a control experiment that compares performance with semantically faithful images against images generated from shuffled or artifact-only prompts. These additions directly address the interpretability concern raised.
  Revision: yes
- Referee: [Experiments] Dataset construction is described only at a high level; the manuscript does not report how human-written vs. LLM-generated poems were collected, balanced, or split, nor any statistical significance tests or error analysis that would substantiate the cross-model performance gains.
  Authors: We acknowledge that the dataset section was insufficiently detailed. The revised manuscript now contains an expanded data section that reports the sources of human-written poems, the specific LLMs and generation settings used to create the AI poems, the balancing and splitting procedures, and the final dataset sizes. We have also added statistical significance testing (paired t-tests and McNemar's test) for all reported improvements and a concise error-analysis subsection that categorizes the remaining misclassifications.
  Revision: yes
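The rebuttal names McNemar's test for comparing detectors on the same poems. As a rough sketch of what that involves, the code below implements the exact (binomial) form of the test on paired predictions. The function name `mcnemar_exact` and the toy labels are assumptions for illustration; they are not taken from the paper, and the authors may have used a different variant of the test.

```python
from math import comb
from typing import Sequence


def mcnemar_exact(y_true: Sequence[str], pred_a: Sequence[str],
                  pred_b: Sequence[str]) -> float:
    """Two-sided exact McNemar p-value for two detectors on the same items."""
    if not (len(y_true) == len(pred_a) == len(pred_b)):
        raise ValueError("all inputs must have the same length")
    rows = list(zip(y_true, pred_a, pred_b))
    # Only discordant pairs (exactly one detector correct) carry information.
    a_only = sum(1 for t, a, b in rows if a == t and b != t)
    b_only = sum(1 for t, a, b in rows if a != t and b == t)
    n = a_only + b_only
    if n == 0:
        return 1.0
    k = min(a_only, b_only)
    # Under the null, discordant pairs split 50/50: Binomial(n, 0.5) tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy predictions, invented for illustration only.
y_true = ["ai"] * 5 + ["human"] * 3
pred_text = ["human"] * 5 + ["human"] * 3   # text-only detector misses the AI poems
pred_image = ["ai"] * 5 + ["human"] * 3     # image-guided detector gets all eight
p_value = mcnemar_exact(y_true, pred_text, pred_image)
```

The test ignores poems that both detectors get right or both get wrong, so it asks directly whether the disagreements favor one detector more often than chance would allow.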
Circularity Check
Empirical evaluation with external baselines is self-contained
The paper describes an empirical method that augments poem text with images reflecting its content and evaluates LLM-based detectors (including Gemini) against plain-text baselines and the external RoBERTa model, reporting a Macro-F1 of 85.65%. No equations, fitted parameters renamed as predictions, self-definitional constructs, or load-bearing self-citations appear in the abstract or method framing. The performance claims rest on direct experimental comparisons to independent detectors rather than reducing to the method's own inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- Image generation or selection process
axioms (1)
- domain assumption: Images can capture and convey meaning, imagery, and feeling from poetry text in a way useful for AI detection
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "proposes an image-semantic guided poetry detection method... integrates information such as meaning, imagery, and feeling from the image"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Gemini detector using our method achieves a Macro-F1 score of 85.65%"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.