Federated Knowledge Distillation for Multi-Model Architectures Lithography Hotspot Detection

Chuanguang Yang; Jianping Gou; Kai Zhang; Tingwen Huang; Xingyou Lin; Yanli Li; Yingli Tian; Yuqi Li; Zhongliang Guo

arxiv: 2501.04066 · v2 · submitted 2025-01-07 · 💻 cs.LG · cs.AR

Yuqi Li, Xingyou Lin, Yanli Li, Kai Zhang, Chuanguang Yang, Zhongliang Guo, Jianping Gou, Tingwen Huang, Yingli Tian

Pith reviewed 2026-05-23 05:38 UTC · model grok-4.3

classification 💻 cs.LG cs.AR

keywords: federated learning, knowledge distillation, lithography hotspot detection, privacy preservation, multi-model architectures, semiconductor manufacturing

The pith

A hybrid federated knowledge distillation framework improves lithography hotspot detection while preserving data privacy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FedKD-hybrid to address privacy constraints in training models for lithography hotspot detection, a semiconductor manufacturing task whose training data requires strong protection. Existing federated methods rely on either parameter averaging alone or knowledge distillation alone; this approach combines the two by having clients share both the parameters of agreed-upon layers and the logits they produce on a shared public dataset. According to the paper, aggregating this hybrid information refines each client's local model more effectively than single-paradigm methods, and experiments on the ICCAD-2012 benchmark and real-world FAB data show consistent gains in effectiveness and robustness. This matters because it would enable collaborative improvement across organizations without exposing sensitive manufacturing data.

Core claim

FedKD-hybrid utilizes a public dataset to facilitate consensus, where clients exchange both parameters of agreed-upon layers and logits. This hybrid information is aggregated to refine local models, enhancing knowledge transfer and outperforming state-of-the-art methods in effectiveness and robustness on ICCAD-2012 and real-world FAB datasets.

What carries the argument

The FedKD-hybrid framework that aggregates both selected model parameters and output logits exchanged over a public dataset to update local models.
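The exchange named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the unweighted averaging, the KL-based distillation term, the `temperature` parameter, the flattened-list parameter layout, and every function name are hypothetical, because the summary here does not specify the actual aggregation rule.

```python
import math
from typing import Dict, List

Params = Dict[str, List[float]]  # layer name -> flattened weights (assumed layout)
Logits = List[List[float]]       # one row of class logits per public sample


def average_shared_layers(client_params: List[Params], shared: List[str]) -> Params:
    """Element-wise mean over clients, restricted to the agreed-upon layers.

    Layers outside `shared` are never read, so heterogeneous clients can keep
    architecture-specific layers local (assumption: shared layers match in shape).
    """
    n = len(client_params)
    return {
        name: [sum(vals) / n for vals in zip(*(p[name] for p in client_params))]
        for name in shared
    }


def average_logits(client_logits: List[Logits]) -> Logits:
    """Element-wise mean of client logits computed on the same public samples."""
    n = len(client_logits)
    return [[sum(vals) / n for vals in zip(*rows)] for rows in zip(*client_logits)]


def log_softmax(row: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [v / temperature for v in row]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(v - m) for v in scaled))
    return [v - lse for v in scaled]


def distillation_loss(local: Logits, consensus: Logits, temperature: float = 1.0) -> float:
    """Mean KL(consensus || local) over public samples (assumed loss form)."""
    total = 0.0
    for l_row, c_row in zip(local, consensus):
        log_p = log_softmax(c_row, temperature)
        log_q = log_softmax(l_row, temperature)
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return total / len(local)
```

In a round of this sketch, each client would load the averaged shared layers and then add the distillation term, computed on the public samples, to its local training loss. How the two signals are weighted against each other is left open here, as it is in the summary.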

If this is right

Local models trained on private data achieve higher hotspot detection performance through combined knowledge sources.
Privacy is maintained as private datasets remain local while still benefiting from collaboration.
The method supports multi-model architectures by allowing exchange only on agreed layers.
Performance gains hold on both public benchmarks and real manufacturing datasets.
Robustness to variations in data distributions increases compared to pure parameter or distillation approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar hybrid exchange could apply to other domains needing privacy like medical imaging analysis.
If the public dataset is well-chosen, it might reduce the number of communication rounds needed.
Testing the sensitivity to public dataset choice would clarify how critical that component is.
The approach might generalize to other computer vision tasks in industrial settings.

Load-bearing premise

There exists a public dataset that enables useful consensus across clients without introducing bias or privacy risks to the private training data.

What would settle it

Demonstrating that models trained with the hybrid method perform no better than those using only parameters or only distillation on the same datasets would challenge the central claim.
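One way to run that comparison is a paired test over matched runs. The sketch below is an assumption about how such a check could be set up, not a protocol from the paper: it takes per-run scores for the hybrid method and for one single-paradigm variant (for example, one score per seed or per client, paired by position) and returns an exact one-sided sign-flip p-value. The function name and the choice of test are hypothetical.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_pvalue(hybrid: Sequence[float], baseline: Sequence[float]) -> float:
    """Exact one-sided sign-flip test on paired score differences.

    Under the null hypothesis that the hybrid method is no better, each paired
    difference is equally likely to have either sign. The p-value is the share
    of sign assignments whose summed difference is at least the observed one.
    Enumeration is exponential in the number of pairs, so keep it small.
    """
    diffs = [h - b for h, b in zip(hybrid, baseline)]
    observed = sum(diffs)
    at_least = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = sum(s * d for s, d in zip(signs, diffs))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            at_least += 1
    return at_least / total
```

A large p-value against both the parameter-only and the distillation-only variants would be the outcome that challenges the central claim; with n pairs the smallest attainable p-value is 1/2^n, so a handful of runs cannot give strong evidence either way.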

Figures

Figures reproduced from arXiv: 2501.04066 by Chuanguang Yang, Jianping Gou, Kai Zhang, Tingwen Huang, Xingyou Lin, Yanli Li, Yingli Tian, Yuqi Li, Zhongliang Guo.

**Figure 2.** The overview of FedKD-hybrid algorithm. … transfer across multiple clients based on different scenarios: parameter-based and non-parameter-based, respectively. In lithography hotspot detection (LHD) scenarios, clients may use different model architectures depending on their available computing resources. Additionally, due to bandwidth heterogeneity, clients are assumed to participate in the learning task a…

**Figure 3.** Test results on ICCAD and FAB set using synchronous updates.

**Figure 4.** Test results on ICCAD and FAB set using 80% asynchronous updates.

Original abstract

As a special type of multimedia data, Lithography Hotspot Detection (LHD) training often requires stronger privacy protection than conventional multimedia data, and federated learning provides a promising potential solution to this challenge. However, existing approaches rely solely on either parameter aggregation or Knowledge Distillation (KD), failing to fully exploit the potential of collaborative learning. To address this, we propose FedKD-hybrid, a novel framework that synergizes the strengths of both paradigms. Specifically, FedKD-hybrid utilizes a public dataset to facilitate consensus, where clients exchange both parameters of agreed-upon layers and logits. This hybrid information is aggregated to refine local models, enhancing knowledge transfer. Extensive experiments on ICCAD-2012 and real-world FAB datasets demonstrate that FedKD-hybrid consistently outperforms state-of-the-art methods in both effectiveness and robustness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

FedKD-hybrid combines parameter aggregation and knowledge distillation with a public dataset for lithography hotspot detection, but the claims cannot be checked without details on that dataset or the experiments.

The paper's core idea is to use a public dataset so that clients can share both model parameters from agreed layers and logits, then aggregate that hybrid information to improve local models for lithography hotspot detection. This is framed as a way to get better knowledge transfer than using either federated averaging or distillation alone. That hybrid is the main new element. It applies to a setting where data privacy matters a lot because the training data comes from actual fabrication processes. The authors test on the standard ICCAD-2012 benchmark and some real-world FAB data, and they say it beats existing methods in effectiveness and robustness.

The approach makes sense on paper for keeping private data local while still allowing some consensus through the public set. If the public data is chosen well, the logit exchange could help with knowledge transfer across different model architectures, which the title highlights. But the description stops at a high level. There is no information on what the public dataset actually is, where it comes from, or how its distribution compares to the private client data. That leaves the central mechanism open to the exact problem the stress-test note raises: the public set might reinforce benchmark quirks instead of helping with real distributions, or it could add bias that the local models then have to deal with.

The experimental claims are also hard to evaluate. The abstract mentions consistent outperformance but gives no list of baselines, no numbers, no mention of statistical tests, and no ablation on the hybrid components. Without those, it's impossible to tell if the gains come from the hybrid design or from something else in the setup. The title refers to multi-model architectures, yet the abstract does not explain how the agreed-upon layers are chosen when clients have different models or how the aggregation handles that heterogeneity. That detail would matter for the claim to hold.

This work is aimed at people doing applied federated learning in semiconductor manufacturing or similar privacy-sensitive industrial settings. A reader already working on hotspot detection might find the specific combination useful if the experiments hold up, but the current write-up does not provide enough to judge that. I would not recommend sending this to peer review in its present form. The authors would need to add a clear description of the public dataset, full experimental protocols, and results with proper controls before it would be ready for referees.

Referee Report

2 major / 2 minor

Summary. The paper proposes FedKD-hybrid, a federated learning framework for lithography hotspot detection (LHD) that combines parameter aggregation of agreed-upon layers with logit exchange via knowledge distillation, using a public dataset to enable client consensus and model refinement. It claims this hybrid approach enhances knowledge transfer and consistently outperforms state-of-the-art methods in effectiveness and robustness on the ICCAD-2012 benchmark and real-world FAB datasets.

Significance. If the experimental results hold under proper validation, the hybrid parameter-plus-logit aggregation in a federated setting with a public dataset could offer a practical advance for privacy-sensitive collaborative training in semiconductor manufacturing, where LHD data distributions are proprietary. The approach addresses a gap between pure parameter-based FL and pure KD methods, but its significance depends on demonstrating that the public dataset enables unbiased transfer without introducing distribution shift.

major comments (2)

[Abstract and experimental evaluation section] The central experimental claim (abstract) that FedKD-hybrid 'consistently outperforms state-of-the-art methods' on ICCAD-2012 and real-world FAB datasets cannot be evaluated because the manuscript provides no description of the public dataset (source, size, label distribution, or statistical distance to private client data), no baselines, no statistical significance tests, and no ablation studies isolating the hybrid aggregation benefit. This directly undermines the assertion that the public dataset 'facilitates consensus' and 'enhances knowledge transfer' without bias or leakage.
[Method description of FedKD-hybrid] The method relies on the assumption that a public dataset exists which matches private LHD distributions closely enough for hybrid aggregation to transfer useful knowledge (abstract and method description). No analysis of distribution shift, privacy leakage risk, or sensitivity to public-set choice is provided; if the public set is drawn from ICCAD-2012 itself, the reported gains may reflect benchmark artifacts rather than generalization to proprietary FAB data.
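The "statistical distance to private client data" requested in the first comment could start with something as simple as comparing label balance. The sketch below is a hypothetical first check, not an analysis from the paper: it uses Jensen-Shannon divergence, which is this note's choice of metric, and it only compares class proportions, so it says nothing about differences in layout patterns. The function names and the class labels in the usage note are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Iterable, List, Sequence


def label_distribution(labels: Iterable[Hashable], classes: Sequence[Hashable]) -> List[float]:
    """Empirical class proportions, in the order given by `classes`."""
    counts = Counter(labels)
    n = sum(counts[c] for c in classes)
    return [counts[c] / n for c in classes]


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint support."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x: Sequence[float], y: Sequence[float]) -> float:
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

A client could compute `label_distribution` on its private labels locally, for instance with classes `["hotspot", "non-hotspot"]`, and compare the result against the public set's proportions with `js_divergence`. Whether sharing even these proportions is acceptable under the privacy requirements is a separate question that the second comment's leakage concern covers.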

minor comments (2)

[Method] Notation for 'agreed-upon layers' and the aggregation procedure for hybrid information should be formalized with equations or pseudocode for reproducibility.
[Abstract and introduction] The title mentions 'multi-model architectures', but the abstract provides no details on how the framework handles heterogeneous client models beyond layer agreement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We agree that additional details on the public dataset, baselines, statistical tests, ablations, distribution shift, and privacy analysis are needed to strengthen the claims. The revised manuscript will incorporate these elements.

Referee: [Abstract and experimental evaluation section] The central experimental claim (abstract) that FedKD-hybrid 'consistently outperforms state-of-the-art methods' on ICCAD-2012 and real-world FAB datasets cannot be evaluated because the manuscript provides no description of the public dataset (source, size, label distribution, or statistical distance to private client data), no baselines, no statistical significance tests, and no ablation studies isolating the hybrid aggregation benefit. This directly undermines the assertion that the public dataset 'facilitates consensus' and 'enhances knowledge transfer' without bias or leakage.

Authors: We agree that the current version lacks sufficient experimental transparency. In the revision we will add: a full description of the public dataset (source, size, label distribution, and statistical distance metrics to private data); explicit listing of all baselines; statistical significance tests; and ablation studies isolating the hybrid aggregation components. These additions will allow proper evaluation of the claims regarding consensus and knowledge transfer.
Revision: yes
Referee: [Method description of FedKD-hybrid] The method relies on the assumption that a public dataset exists which matches private LHD distributions closely enough for hybrid aggregation to transfer useful knowledge (abstract and method description). No analysis of distribution shift, privacy leakage risk, or sensitivity to public-set choice is provided; if the public set is drawn from ICCAD-2012 itself, the reported gains may reflect benchmark artifacts rather than generalization to proprietary FAB data.

Authors: We agree that the manuscript would be strengthened by explicit analysis of these factors. The revision will include quantitative assessment of distribution shift, privacy leakage evaluation, and sensitivity experiments across different public-set choices. We will also clarify the relationship of the public dataset to the ICCAD-2012 benchmark to address concerns about potential artifacts.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on experimental comparisons

The paper introduces FedKD-hybrid as a hybrid federated KD framework that aggregates layer parameters and logits over a public dataset. All performance claims are grounded in reported experiments on ICCAD-2012 and real-world FAB data versus prior methods. No equations, fitted parameters renamed as predictions, self-definitional constructions, or load-bearing self-citations appear in the provided text. The derivation chain consists of a proposed architecture plus empirical validation and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no free parameters, axioms, or invented entities are specified.


[1] [1]

Reducing dfm to practice: the lithography manufacturability assessor,

L. Liebmann, S. Mansfield, G. Han, J. Culp, J. Hibbeler, and R. Tsai, “Reducing dfm to practice: the lithography manufacturability assessor,” in Design and Process Integration for Microelectronic Manufacturing IV, vol. 6156. SPIE, 2006, pp. 178–189

work page 2006

[2] [2]

Cramming more components onto integrated circuits,

G. E. Moore, “Cramming more components onto integrated circuits,” Proceedings of the IEEE , vol. 86, no. 1, pp. 82–85, 1998

work page 1998

[3] [3]

Lithography hotspot de- tection based on heterogeneous federated learning with local adaptation and feature selection,

J. Pan, X. Lin, J. Xu, Y . Chen, and C. Zhuo, “Lithography hotspot de- tection based on heterogeneous federated learning with local adaptation and feature selection,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems , 2023

work page 2023

[4] [4]

C. Finn, P. Abbeel, and S. Levine, 2017

work page 2017

[5] [5]

Hotspot prediction: Sem image generation with potential lithography hotspots,

J. Kim, J. Lim, J. Lee, T.-Y . Kim, Y . Nam, K. Kim, and D.-N. Kim, “Hotspot prediction: Sem image generation with potential lithography hotspots,” IEEE Transactions on Semiconductor Manufacturing , 2023

work page 2023

[6] [6]

Hotspot detection on post-opc layout using full chip simulation based verification tool: A case study with aerial image simulation,

J. Kim and M. Fan, “Hotspot detection on post-opc layout using full chip simulation based verification tool: A case study with aerial image simulation,” Proc. SPIE, vol. 5256, 2003

work page 2003

[7] [7]

Automated full-chip hotspot detection and removal flow for interconnect layers of cell-based designs,

E. Roseboom, M. Rossman, F. C. Chang, and P. Hurat, “Automated full-chip hotspot detection and removal flow for interconnect layers of cell-based designs,” Proceedings of Spie the International Society for Optical Engineering, 2007

work page 2007

[8] [8]

Accurate process-hotspot detection using critical design rule extraction,

Y . T. Yu, Y . C. Chan, S. Sinha, H. R. Jiang, and C. Chiang, “Accurate process-hotspot detection using critical design rule extraction,” in ACM, 2012

work page 2012

[9] [9]

A fuzzy-matching model with grid reduction for lithography hotspot detection,

Chang, S., J., Chen, Lin, Wen, and W., “A fuzzy-matching model with grid reduction for lithography hotspot detection,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems: A publication of the IEEE Circuits and Systems Society , vol. 33, no. 11, pp. 1671–1680, 2014

work page 2014

[10] [10]

Improved tangent space-based distance metric for lithographic hotspot classification,

Fan, Yang, Subarna, Sinha, Charles, C., Chiang, Xuan, Zeng, and Dian, “Improved tangent space-based distance metric for lithographic hotspot classification,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 36, no. 9, pp. 1545–1556, 2017

work page 2017

[11] [11]

Grasp based metaheuristics for layout pattern classification,

M. Woo, S. Kim, and S. Kang, “Grasp based metaheuristics for layout pattern classification,” in 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) , 2017

work page 2017

[12] [12]

Accurate lithography hotspot detection using deep convolutional neural networks,

M. Shin and J. H. Lee, “Accurate lithography hotspot detection using deep convolutional neural networks,” Journal of Micro/nanolithography Mems & Moems , vol. 15, no. 4, p. 043507, 2016

work page 2016

[13] [13]

Imbalance aware lithography hotspot detection: a deep learning approach,

H. Yang, L. Luo, S. Jing, C. Lin, and Y . Bei, “Imbalance aware lithography hotspot detection: a deep learning approach,” Journal of Micro/nanolithography Mems & Moems , vol. 16, no. 3, p. 1, 2017

work page 2017

[14] [14]

Lithography hotspot detection: From shallow to deep learning,

H. Yang, Y . Lin, Y . Bei, and E. Young, “Lithography hotspot detection: From shallow to deep learning,” in 2017 30th IEEE International System-on-Chip Conference (SOCC) , 2017

work page 2017

[15] [15]

Lithography hotspot detection via heterogeneous federated learning with local adaptation,

X. Lin, J. Pan, J. Xu, Y . Chen, and C. Zhuo, “Lithography hotspot detection via heterogeneous federated learning with local adaptation,” 2021

work page 2021

[16] [16]

Communication-efficient learning of deep networks from decentralized data,

H. B. Mcmahan, E. Moore, D. Ramage, S. Hampson, and B. Arcas, “Communication-efficient learning of deep networks from decentralized data,” 2016

work page 2016

[17] V. Smith, C.-K. Chiang, M. Sanjabi, and A. S. Talwalkar, “Federated multi-task learning,” Advances in Neural Information Processing Systems, vol. 30, 2017

[18] M. G. Arivazhagan, V. Aggarwal, A. K. Singh, and S. Choudhary, “Federated learning with personalization layers,” arXiv preprint arXiv:1912.00818, 2019

[19] D. Li and J. Wang, “FedMD: Heterogenous federated learning via model distillation,” arXiv preprint arXiv:1910.03581, 2019

[20] T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V. Smith, “Federated optimization in heterogeneous networks,” Proceedings of Machine Learning and Systems, vol. 2, pp. 429–450, 2020

[21] J. Gou, B. Yu, S. J. Maybank, and D. Tao, “Knowledge distillation: A survey,” International Journal of Computer Vision, vol. 129, no. 6, pp. 1789–1819, 2021

[22] C. Yang, Z. An, L. Cai, and Y. Xu, “Hierarchical self-supervised augmented knowledge distillation,” International Joint Conference on Artificial Intelligence, pp. 1217–1223, 2021

[23] C. Yang, H. Zhou, Z. An, X. Jiang, Y. Xu, and Q. Zhang, “Cross-image relational knowledge distillation for semantic segmentation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 12319–12328

[24] C. Yang, Z. An, H. Zhou, L. Cai, X. Zhi, J. Wu, Y. Xu, and Q. Zhang, “MixSKD: Self-knowledge distillation from mixup for image recognition,” in European Conference on Computer Vision. Springer, 2022, pp. 534–551

[25] L. Li, J. Gou, B. Yu, L. Du, Z. Yi, and D. Tao, “Federated distillation: A survey,” arXiv preprint arXiv:2404.08564, 2024

[26] C. Yang, Z. An, L. Huang, J. Bi, X. Yu, H. Yang, B. Diao, and Y. Xu, “CLIP-KD: An empirical study of CLIP model distillation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 15952–15962

[27] J. A. Torres, “ICCAD-2012 CAD contest in fuzzy pattern matching for physical verification and benchmark suite,” IEEE, 2012

[28] T. Lin, L. Kong, S. U. Stich, and M. Jaggi, “Ensemble distillation for robust model fusion in federated learning,” Advances in Neural Information Processing Systems, vol. 33, pp. 2351–2363, 2020

[29] A. Mora, I. Tenison, P. Bellavista, and I. Rish, “Knowledge distillation for federated learning: a practical guide,” arXiv preprint arXiv:2211.04742, 2022

[30] Z. Zhu, J. Hong, and J. Zhou, “Data-free knowledge distillation for heterogeneous federated learning,” in International Conference on Machine Learning. PMLR, 2021, pp. 12878–12889

[31] C. Wu, F. Wu, L. Lyu, Y. Huang, and X. Xie, “Communication-efficient federated learning via knowledge distillation,” Nature Communications, vol. 13, no. 1, p. 2032, 2022

[32] X. Pang, J. Hu, P. Sun, J. Ren, and Z. Wang, “When federated learning meets knowledge distillation,” IEEE Wireless Communications, vol. 31, no. 5, pp. 208–214, 2024

[33] V. Kulkarni, M. Kulkarni, and A. Pant, “Survey of personalization techniques for federated learning,” in 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4). IEEE, 2020, pp. 794–797

[34] V. Borisov and J. Scheible, “Lithography hotspots detection using deep learning,” in 2018 15th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD), 2018, pp. 145–148

[35] M. Lin, F. Zeng, Y. Shen, and Y. Wei, “Lithography hotspot detection based on residual network,” in DTCO and Computational Patterning II, vol. 12495. SPIE, 2023, pp. 354–361

[36] S. Wang, T. Gai, T. Qu, B. Ma, X. Su, L. Dong, L. Zhang, P. Xu, Y. Su, and Y. Wei, “Data augmentation in hotspot detection based on generative adversarial network,” Journal of Micro/Nanopatterning, Materials, and Metrology, vol. 20, no. 3, p. 034201, 2021

[37] X. Lin, J. Pan, J. Xu, Y. Chen, and C. Zhuo, “Lithography hotspot detection via heterogeneous federated learning with local adaptation,” in 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC), 2022, pp. 166–171

[38] J. Pan, X. Lin, J. Xu, Y. Chen, and C. Zhuo, “Lithography hotspot detection based on heterogeneous federated learning with local adaptation and feature selection,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 5, pp. 1484–1496, 2024

[39] L. Liao, S. Li, Y. Che, W. Shi, and X. Wang, “Lithography hotspot detection method based on transfer learning using pre-trained deep convolutional neural network,” Applied Sciences, vol. 12, no. 4, p. 2192, 2022

[40] D. Yao, W. Pan, Y. Dai, Y. Wan, X. Ding, C. Yu, H. Jin, Z. Xu, and L. Sun, “: Toward heterogeneous federated learning via global knowledge distillation,” IEEE Transactions on Computers, vol. 73, no. 1, pp. 3–17, 2024

[41] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” Computer Science, 2014

[42] K. Pillutla, K. Malik, A.-R. Mohamed, M. Rabbat, M. Sanjabi, and L. Xiao, “Federated learning with partial model personalization,” in International Conference on Machine Learning. PMLR, 2022, pp. 17716–17758

[43] C. Yang, Z. An, H. Zhou, F. Zhuang, Y. Xu, and Q. Zhang, “Online knowledge distillation via mutual contrastive learning for visual recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 8, pp. 10212–10227, 2023