LithoDreamer: A Physics-Informed World Model for Multi-Stage Computational Lithography
Pith reviewed 2026-06-26 04:48 UTC · model grok-4.3
The pith
LithoDreamer is the first physics-informed world model for the multi-stage lithography process from layout to after-development image.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LithoDreamer formulates the four-stage pipeline from layout to mask to resist image to after-development image (ADI) as a decision-driven multi-step evolution system. It captures feature changes between adjacent states to build stage-specific physics-informed latent spaces, which it uses to control process interventions and drive state transitions. Its contrastive variational optimization paradigm contrasts latent differences between intervention paths under variational evolution constraints, with the aim of keeping generated evolutions consistent with real lithography physics without continuous supervision.
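The multi-step framing can be pictured as a rollout in which each stage maps the current state and an intervention latent to the next state. The sketch below is only an illustration of that framing, not the paper's architecture: the stage names follow the pipeline above, while `make_toy_transition`, `rollout`, and the linear update rule are hypothetical placeholders standing in for learned, stage-specific transition models.

```python
from typing import Callable, List, Sequence

State = List[float]   # toy stand-in for a flattened image
Latent = List[float]  # toy stand-in for a stage-specific intervention latent

# Stage names taken from the pipeline described above.
STAGES = ["layout", "mask", "resist_image", "adi"]


def make_toy_transition(gain: float) -> Callable[[State, Latent], State]:
    """Return a placeholder transition: scale the state, then add the latent.

    A real model would use a learned network here; this is an assumption
    made purely so the rollout is runnable.
    """
    def step(state: State, latent: Latent) -> State:
        return [gain * s + z for s, z in zip(state, latent)]
    return step


def rollout(
    layout: State,
    transitions: Sequence[Callable[[State, Latent], State]],
    latents: Sequence[Latent],
) -> List[State]:
    """Apply one transition per stage boundary and keep every intermediate state."""
    trajectory = [layout]
    for step, latent in zip(transitions, latents):
        trajectory.append(step(trajectory[-1], latent))
    return trajectory


if __name__ == "__main__":
    steps = [make_toy_transition(g) for g in (1.0, 0.5, 2.0)]
    zs = [[0.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
    for name, state in zip(STAGES, rollout([1.0, 0.0], steps, zs)):
        print(name, state)
```

Keeping every intermediate state in the trajectory is what allows adjacent states to be compared stage by stage, which is the quantity the core claim says the latent spaces are built from.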
What carries the argument
The contrastive variational optimization paradigm that guides generation of physically consistent evolutions by contrasting latent differences between intervention paths under variational constraints.
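The summary does not give the objective in closed form, so the following is a minimal sketch of one plausible reading, not the paper's loss. It assumes the contrast is an InfoNCE term over latent differences between intervention paths and the variational constraint is a KL penalty toward a standard normal prior; the function names, the cosine similarity, the temperature, and the `beta` weight are all assumptions introduced for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def kl_to_standard_normal(mu: Vector, log_var: Vector) -> float:
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def latent_difference(z_a: Vector, z_b: Vector) -> List[float]:
    """Difference between the latents of two intervention paths."""
    return [a - b for a, b in zip(z_a, z_b)]


def info_nce(anchor: Vector, positive: Vector, negatives: Sequence[Vector],
             temperature: float = 0.1) -> float:
    """InfoNCE: negative log-softmax score of the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum - logits[0]


def contrastive_variational_loss(mu: Vector, log_var: Vector,
                                 anchor_delta: Vector, positive_delta: Vector,
                                 negative_deltas: Sequence[Vector],
                                 beta: float = 1.0) -> float:
    """Hypothetical combined objective: contrast term plus weighted KL term."""
    contrast = info_nce(anchor_delta, positive_delta, negative_deltas)
    return contrast + beta * kl_to_standard_normal(mu, log_var)
```

Under this reading, the contrast term needs only pairs of intervention paths labelled as similar or dissimilar, which is one way a model could be trained without supervision at every intermediate stage. Whether the paper's objective takes this form is exactly what the referee's comment on §3.2 asks the authors to state.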
Load-bearing premise
The contrastive variational optimization paradigm can guide the model to generate evolutions consistent with real lithography physics without continuous supervision.
What would settle it
Measuring the error between LithoDreamer predictions and ground-truth physical simulations or real experimental data for after-development images on a held-out set of process parameters and masks.
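A held-out check of this kind could be sketched as follows. This is an illustrative harness rather than the paper's evaluation protocol: images are plain nested lists, the critical dimension (CD) is approximated as the longest contiguous run of printed pixels along one cutline, and the pixel size and print threshold are assumed values.

```python
from typing import Sequence

Image = Sequence[Sequence[float]]


def mean_squared_error(pred: Image, truth: Image) -> float:
    """Pixel-wise mean squared error between two equally sized images."""
    total = 0.0
    count = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += (p - t) ** 2
            count += 1
    return total / count


def critical_dimension_nm(cutline: Sequence[float], pixel_nm: float,
                          threshold: float = 0.5) -> float:
    """Approximate CD: longest contiguous run of pixels at or above threshold.

    pixel_nm and threshold are assumptions; a real flow would take them
    from the simulator or metrology setup.
    """
    longest = 0
    current = 0
    for value in cutline:
        current = current + 1 if value >= threshold else 0
        longest = max(longest, current)
    return longest * pixel_nm


def cd_error_nm(pred_cutline: Sequence[float], truth_cutline: Sequence[float],
                pixel_nm: float) -> float:
    """Absolute CD difference between prediction and ground truth on one cutline."""
    return abs(critical_dimension_nm(pred_cutline, pixel_nm)
               - critical_dimension_nm(truth_cutline, pixel_nm))
```

Running both metrics on masks and process parameters excluded from training, against either a rigorous simulator or measured wafers, is the comparison described above.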
Original abstract
As semiconductor technology nodes scale, computational lithography is essential for ensuring yield and performance. However, lithography is a continuous physical process involving mask optimization, optical imaging, resist exposure, and development, which existing models fail to capture. To overcome this limitation, we present LithoDreamer, the first physics-informed World Model (WM) framework for computational lithography, which formulates the "Layout-Mask-Resist Image-After Development Image (ADI)" pipeline as a decision-driven multi-step evolution system. LithoDreamer captures feature changes between adjacent states to model stage-specific physics-informed latent spaces, in which it controls process intervention exploration and drives subsequent state transitions. To achieve interpretable intervention optimization without continuous supervision, we propose a contrastive variational optimization paradigm that contrasts the latent differences between intervention paths with variational evolution constraints, guiding the model to generate evolutions consistent with real lithography physics. Experiments show LithoDreamer achieves state-of-the-art performance in forward evolution and inverse planning. Our lithography dataset is publicly available at GitHub (https://github.com/7jiangyq/lithodreamer.git).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces LithoDreamer, which it presents as the first physics-informed world model for computational lithography. It formulates the Layout-Mask-Resist Image-After Development Image (ADI) pipeline as a decision-driven multi-step evolution system, builds stage-specific physics-informed latent spaces from feature changes between adjacent states, and introduces a contrastive variational optimization paradigm to enable interpretable intervention optimization without continuous supervision. The authors report state-of-the-art performance on forward evolution and inverse planning tasks and release their dataset publicly.
Significance. If the reported metrics hold under the stated conditions, the work provides a unified multi-stage framework that integrates optical imaging, resist exposure, and development physics within a single latent-space world model, addressing fragmentation in existing stage-specific approaches. The public dataset release is a clear strength that enables external verification and extension.
minor comments (3)
- §3.2: the description of the contrastive variational objective would benefit from an explicit statement of how the variational evolution constraints are enforced in the loss (e.g., via KL term or reconstruction) to clarify the 'without continuous supervision' claim.
- Table 2 and Table 3: axis labels and units for the reported metrics (e.g., CD error, process window) are not fully specified; adding these would improve reproducibility.
- §4.1: the baseline methods are listed but the exact hyper-parameter settings used for each (especially for the non-world-model competitors) are not tabulated; a supplementary table would strengthen the SOTA comparison.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of LithoDreamer and the recommendation for minor revision. The report does not enumerate any specific major comments requiring point-by-point rebuttal.
Circularity Check
No significant circularity detected
The paper formulates lithography as a multi-step evolution system and introduces a contrastive variational optimization paradigm to enforce physics consistency without continuous supervision. No equations, fitted parameters renamed as predictions, or self-citation chains are shown that reduce the central claims (forward evolution, inverse planning, or latent intervention) to inputs by construction. The public dataset link permits external falsification of reported metrics. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The lithography process can be formulated as a decision-driven multi-step evolution system with stage-specific physics-informed latent spaces.
invented entities (1)
- contrastive variational optimization paradigm: no independent evidence