Improving conditional generative adversarial networks for inverse design of plasmonic structures

Nicolò Maccaferri; Nils Henriksson; Petter Persson

arxiv: 2511.11279 · v1 · pith:H33UML7Rnew · submitted 2025-11-14 · ⚛️ physics.optics · physics.comp-ph

Pith reviewed 2026-05-21 18:25 UTC · model grok-4.3

keywords: conditional generative adversarial networks, inverse design, plasmonic nanostructures, extinction cross section, nanophotonics, deep learning, label projection, embedding network

The pith

Adding label projection and a novel embedding network to conditional GANs reduces mean absolute error by an order of magnitude and triples training speed for inverse design of plasmonic nanostructures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how to use conditional generative adversarial networks to inversely design plasmonic nanostructures from their target extinction cross section spectra. It establishes that incorporating label projection along with a novel embedding network into the model yields substantially better results than a standard conditional GAN baseline. These additions lower the mean absolute error in the generated designs by up to an order of magnitude and allow the training process to converge more than three times faster on average. The gains appear for both a basic fully connected architecture and a more complex convolutional one. A separately pre-trained convolutional neural network acts as a surrogate evaluator to confirm that the improved models produce designs whose optical responses match the targets at least as well as those from the baseline.

Core claim

The central claim is that augmenting a conditional generative adversarial network with label projection and a novel embedding network improves performance on inverse design tasks for plasmonic nanostructures. When trained to generate designs from extinction cross section spectra, the modified model achieves lower error estimates and faster convergence than the unmodified baseline, with the mean absolute error dropping by an order of magnitude in the best case and average training convergence improving by a factor greater than three. These benefits hold across both fully connected and convolutional network architectures, and the resulting designs are validated as equally good or better using a pre-trained convolutional neural network surrogate.

What carries the argument

The modified conditional generative adversarial network that integrates label projection and a novel embedding network to condition the generator on target extinction spectra.
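
The page does not spell out the paper's layers, so the following is only a minimal sketch of one common reading of label projection: the critic score is an unconditional term plus an inner product between the design features and an embedding of the target spectrum. All names (`ProjectionCritic`, `embed`, the layer sizes, the random initialisation) are hypothetical, and the two-layer embedding network is an assumption, not the authors' architecture.

```python
import random


def linear(weights, bias, x):
    """Affine map: `weights` is a list of rows, `bias` and `x` are lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, vi) for vi in v]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class ProjectionCritic:
    """Toy critic: score(x, y) = psi(phi(x)) + <embed(y), phi(x)>."""

    def __init__(self, design_dim, spectrum_dim, feature_dim, seed=0):
        rng = random.Random(seed)
        self.phi = init_layer(feature_dim, design_dim, rng)    # design features
        self.psi = init_layer(1, feature_dim, rng)             # unconditional head
        # Assumed embedding network: spectrum -> hidden -> feature space.
        self.embed_hidden = init_layer(feature_dim, spectrum_dim, rng)
        self.embed_out = init_layer(feature_dim, feature_dim, rng)

    def features(self, design):
        return relu(linear(*self.phi, design))

    def embed(self, spectrum):
        hidden = relu(linear(*self.embed_hidden, spectrum))
        return linear(*self.embed_out, hidden)

    def score(self, design, spectrum):
        h = self.features(design)
        unconditional = linear(*self.psi, h)[0]
        # Label projection: the condition enters only through this inner product.
        projection = sum(e * hi for e, hi in zip(self.embed(spectrum), h))
        return unconditional + projection


critic = ProjectionCritic(design_dim=4, spectrum_dim=3, feature_dim=5)
print(critic.score([0.2, -0.4, 0.9, 0.1], [0.3, 0.6, 0.1]))
```

In this form the spectrum never gets concatenated to the design; it only reweights the design features, which is the usual motivation for projection-style conditioning.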

If this is right

Designs generated by the modified model match target extinction spectra more closely than those from the baseline conditional GAN.
Training runs for plasmonic inverse design finish in roughly one-third the time while reaching better accuracy.
The same label-projection and embedding changes improve results for both simple fully connected and convolutional generator architectures.
The overall pipeline offers a concrete route to faster and more accurate inverse design of optical elements.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same conditioning improvements might transfer to inverse design of other nanophotonic or metamaterial systems if a suitable surrogate evaluator can be pre-trained.
Because convergence accelerates, the method could support real-time or iterative design loops where many spectra are targeted in sequence.
If the surrogate error remains low on experimental data, the approach could move from simulation-only validation toward laboratory fabrication targets.

Load-bearing premise

The pre-trained convolutional neural network surrogate accurately evaluates the extinction cross sections of the generated designs with low enough error to reliably compare inverse-design performance against the baseline.

What would settle it

Running full electromagnetic simulations on the designs produced by the modified model and finding that their extinction spectra deviate substantially from the surrogate predictions would show that the reported error reductions do not hold.
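
A minimal sketch of that check, assuming spectra are available as equal-length lists of extinction values: it measures the surrogate's own mean absolute error against solver results and compares it with the MAE gap between the baseline and modified models. The function names and the decision rule (improvement must exceed surrogate error) are illustrative assumptions, not the paper's protocol.

```python
def mean_absolute_error(predicted, target):
    """Mean absolute difference between two equal-length spectra."""
    if len(predicted) != len(target) or not target:
        raise ValueError("spectra must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def surrogate_gap_check(solver_spectra, surrogate_spectra,
                        baseline_mae, modified_mae):
    """Compare the surrogate's error with the claimed inter-model improvement.

    solver_spectra / surrogate_spectra: lists of spectra for the same
    held-out designs, from a full-wave solver and from the surrogate.
    """
    per_design = [mean_absolute_error(s, t)
                  for s, t in zip(surrogate_spectra, solver_spectra)]
    surrogate_mae = sum(per_design) / len(per_design)
    improvement = baseline_mae - modified_mae
    return {
        "surrogate_mae": surrogate_mae,
        "improvement": improvement,
        # Assumed rule of thumb: the ranking is only trustworthy if the
        # improvement is larger than the surrogate's own error.
        "improvement_exceeds_surrogate_error": improvement > surrogate_mae,
    }
```

If the final flag comes out false on held-out geometries, the reported ranking of baseline versus modified model could be an artefact of surrogate error.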

Figures

Figures reproduced from arXiv: 2511.11279 by Nicolò Maccaferri, Nils Henriksson, Petter Persson.

**Figure 2.** The general architecture of the conditional GAN model consists of critic network and generator network. The critic ...

**Figure 3.** The results in this figure are obtained from the FCGAN-model with and without using label projection and an ...

**Figure 4.** The models were trained on a dataset containing cylindrical dimer structures and they were evaluated using two ...

**Figure 5.** This figure shows example images from training the FCGAN-models on a larger dataset containing differently shaped ...

**Figure 6.** The figure shows images generated from the FCGAN + LP + Embedding network model and the corresponding ...

Original abstract

Deep learning has emerged as a key tool for designing nanophotonic structures that manipulate light at sub-wavelength scales. We investigate how to inversely design plasmonic nanostructures using conditional generative adversarial networks. Although a conventional approach of measuring the optical properties of a given nanostructure is conceptually straightforward, inverse design remains difficult because the existence and uniqueness of an acceptable design cannot be guaranteed. Furthermore, the dimensionality of the design space is often large, and simulation-based methods become quickly intractable. Deep learning methods are well-suited to tackle this problem because they can handle effectively high-dimensional input data. We train a conditional generative adversarial network model and use it for inverse design of plasmonic nanostructures based on their extinction cross section spectra. Our main result shows that adding label projection and a novel embedding network to the conditional generative adversarial network model, improves performance in terms of error estimates and convergence speed for the training algorithm. The mean absolute error is reduced by an order of magnitude in the best case, and the training algorithm converges more than three times faster on average. This is shown for two network architectures, a simpler one using a fully connected neural network architecture, and a more complex one using convolutional layers. We pre-train a convolutional neural network and use it as surrogate model to evaluate the performance of our inverse design model. The surrogate model evaluates the extinction cross sections of the design predictions, and we show that our modifications lead to equally good or better predictions of the original design compared to a baseline model. This provides an important step towards more efficient and precise inverse design methods for optical elements.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The cGAN tweaks deliver practical gains on plasmonic inverse design but the surrogate CNN's accuracy is not shown to be tight enough to support the claimed MAE differences.

Colleague, the main point on this paper is that label projection plus a new embedding network inside a conditional GAN produces lower error and faster training when generating plasmonic structures from target spectra, yet the evaluation rests on a surrogate model whose own error is not quantified against the reported improvements.

They start with standard cGANs and add these conditioning changes, then test both a simple fully connected network and a convolutional version. The headline numbers are an order-of-magnitude drop in mean absolute error in the best case and more than three times faster convergence on average. They score the generated designs by running them through a pre-trained CNN surrogate that predicts extinction cross sections instead of calling full-wave solvers each time. This setup is the sort of targeted engineering that helps in nanophotonics where simulation budgets are tight, and showing gains on two architectures adds a bit of robustness.

The soft spot is the surrogate. The performance edge is measured by how closely the surrogate-evaluated spectra match the targets, but the abstract and claims give no test-set error for the surrogate itself relative to FDTD or Maxwell solvers on unseen geometries. If that surrogate error is comparable to the differences between baseline and modified models, the ranking could change. The paper does not appear to include those direct validation numbers.

On the rest of the work the methods look standard for the field and the citation choices are appropriate. This is useful reading for people already running ML inverse design in optics who want concrete tweaks to try. It is not a new framework, just an empirical extension. I would send it to peer review because the claims are specific enough to check with code and data, provided the surrogate validation gets added or clarified. The central argument can hold if that piece is solid.

Referee Report

2 major / 2 minor

Summary. The manuscript investigates the use of conditional generative adversarial networks (cGANs) for inverse design of plasmonic nanostructures, targeting specified extinction cross-section spectra. The central contribution is the addition of label projection and a novel embedding network to the cGAN, which the authors report improves performance over a baseline: mean absolute error is reduced by up to an order of magnitude in the best case, and training converges more than three times faster on average. These gains are shown for both a fully connected architecture and a convolutional architecture. A pre-trained convolutional neural network is used as a surrogate to evaluate the optical response of generated designs, allowing comparison of inverse-design quality without repeated full-wave simulations.

Significance. If the quantitative claims are substantiated, the work offers a practical advance in data-driven inverse design for nanophotonics, where high-dimensional parameter spaces render direct optimization intractable. The reported speed-up and error reduction, together with the surrogate-based evaluation protocol, could lower the computational barrier to exploring plasmonic geometries. Demonstrating the modifications across two distinct network families adds modest evidence of robustness.

major comments (2)

[Abstract] Abstract and surrogate-model description: the headline MAE reductions and convergence claims are obtained by comparing target spectra against spectra predicted by a pre-trained CNN surrogate. No quantitative surrogate validation (test-set MAE, R², or direct comparison against FDTD/Maxwell solvers on held-out geometries) is supplied. Because the surrogate error could be comparable to or larger than the reported inter-model differences, the ranking of baseline versus modified cGAN cannot yet be considered reliable.
[Methods/Results] Experimental protocol: the abstract states quantitative gains but omits dataset cardinality, training/validation/test splits, exact baseline cGAN hyperparameters, number of independent runs, and error bars on MAE and convergence time. These omissions make it impossible to judge whether the order-of-magnitude improvement is statistically robust or sensitive to particular data partitions or random seeds.
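
The kind of reporting asked for here can be sketched as follows, assuming each independent run yields a final MAE and a per-epoch MAE history. The threshold-based convergence criterion and all function names are hypothetical, since the page does not state how the paper defines convergence.

```python
import statistics


def summarize_runs(values):
    """Mean and sample standard deviation across independent runs."""
    mean = statistics.mean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std


def epochs_to_converge(mae_history, threshold):
    """First epoch (1-based) at which MAE reaches the threshold, else None."""
    for epoch, mae in enumerate(mae_history, start=1):
        if mae <= threshold:
            return epoch
    return None


def speedup(baseline_epochs, modified_epochs):
    """Ratio of mean convergence epochs, baseline over modified."""
    return statistics.mean(baseline_epochs) / statistics.mean(modified_epochs)
```

Reporting `summarize_runs` over several random seeds for both models, together with `speedup` computed from per-run convergence epochs, would let a reader judge whether the order-of-magnitude and three-fold figures survive run-to-run variation.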

minor comments (2)

The description of the novel embedding network would benefit from an explicit architectural diagram or layer-by-layer specification to allow reproduction.
Clarify whether the label-projection mechanism is applied only at the discriminator or also at the generator; the current wording leaves this ambiguous.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each major point below and will incorporate the requested clarifications and validations into the revised version to strengthen the reliability of our claims.

Referee: [Abstract] Abstract and surrogate-model description: the headline MAE reductions and convergence claims are obtained by comparing target spectra against spectra predicted by a pre-trained CNN surrogate. No quantitative surrogate validation (test-set MAE, R², or direct comparison against FDTD/Maxwell solvers on held-out geometries) is supplied. Because the surrogate error could be comparable to or larger than the reported inter-model differences, the ranking of baseline versus modified cGAN cannot yet be considered reliable.

Authors: We agree that explicit validation of the surrogate is necessary to confirm that its error does not confound the reported improvements. In the revised manuscript we will add a dedicated subsection reporting the surrogate CNN's test-set MAE, R², and direct comparisons against FDTD simulations on a held-out set of geometries. These metrics will demonstrate that the surrogate error is substantially smaller than the observed differences between baseline and modified cGANs, thereby supporting the validity of the ranking.
Revision: yes

Referee: [Methods/Results] Experimental protocol: the abstract states quantitative gains but omits dataset cardinality, training/validation/test splits, exact baseline cGAN hyperparameters, number of independent runs, and error bars on MAE and convergence time. These omissions make it impossible to judge whether the order-of-magnitude improvement is statistically robust or sensitive to particular data partitions or random seeds.

Authors: We acknowledge that the current manuscript lacks sufficient detail on the experimental setup. In the revision we will explicitly state the total dataset size, the precise training/validation/test split ratios, the full hyperparameter configuration of the baseline cGAN, the number of independent training runs performed, and error bars (standard deviation across runs) for all reported MAE and convergence-time values. These additions will allow readers to assess statistical robustness directly.
Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical ML performance claims rest on held-out training runs and external surrogate evaluation

The paper reports empirical improvements from training modified cGANs (with label projection and embedding network) versus baseline, measured via MAE on extinction spectra and convergence speed. These metrics derive from actual optimization runs on data splits and a separately pre-trained CNN surrogate, not from any equation or parameter that is defined in terms of the target result. No self-citation chain, uniqueness theorem, ansatz smuggling, or renaming of known results is invoked to support the central claims. The derivation is therefore self-contained against external benchmarks (FDTD simulations via the surrogate) and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claims rest on standard neural-network training assumptions plus the domain assumption that the surrogate CNN faithfully approximates full-wave simulations.

free parameters (1)

GAN and embedding network hyperparameters
Learning rates, layer dimensions, and regularization strengths chosen or tuned during training.

axioms (1)

domain assumption: The surrogate convolutional neural network provides sufficiently accurate extinction cross-section predictions for design evaluation.
Invoked when the surrogate is used to score inverse-design outputs.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We train a conditional generative adversarial network model and use it for inverse design of plasmonic nanostructures based on their extinction cross section spectra. Our main result shows that adding label projection and a novel embedding network..."

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We pre-train a convolutional neural network and use it as surrogate model to evaluate the performance of our inverse design model."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
