
arxiv: 2604.18062 · v1 · submitted 2026-04-20 · 💻 cs.LG · physics.flu-dyn

Towards a Foundation-Model Paradigm for Aerodynamic Prediction in Three-dimensional Design

Pith reviewed 2026-05-10 05:58 UTC · model grok-4.3

classification 💻 cs.LG physics.flu-dyn
keywords aerodynamics · transformer · pre-training · fine-tuning · surrogate model · 3D flow prediction · foundation model · transonic wing

The pith

Pre-training on diverse wing geometries allows accurate aerodynamic predictions on new designs with only 450 fine-tuning samples.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that a foundation-model approach—pre-training a Transformer on a large, varied set of aerodynamic simulations followed by limited fine-tuning—can produce reliable surrogate models for three-dimensional flow prediction much more data-efficiently than standard training. This would matter because high-fidelity simulations for complex 3D shapes are computationally expensive, so fewer required samples would speed up iterative design optimization in aerospace engineering. The authors develop AeroTransformer and demonstrate its performance on transonic wing configurations by first using nearly 30,000 samples from the SuperWing dataset and then adapting to specific perturbed shapes from the Common Research Model.

Core claim

Pre-training AeroTransformer on the SuperWing dataset of nearly 30,000 samples with broad geometric diversity, then fine-tuning on 450 task-specific samples for perturbed Common Research Model wings, achieves 0.36% error on surface-flow prediction, an 84.2% reduction relative to training from scratch.
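The two headline numbers together imply a from-scratch error that the summary does not state directly. The short calculation below backs it out, assuming the 84.2% figure is a relative reduction of the same error metric; the variable names are illustrative and the result inherits the rounding of both reported values.

```python
# Back out the from-scratch error implied by the two reported numbers.
fine_tuned_error_pct = 0.36   # reported error after pre-training + fine-tuning
relative_reduction = 0.842    # reported reduction relative to training from scratch

# reduction = (scratch - fine_tuned) / scratch  =>  scratch = fine_tuned / (1 - reduction)
implied_scratch_error_pct = fine_tuned_error_pct / (1.0 - relative_reduction)
improvement_factor = implied_scratch_error_pct / fine_tuned_error_pct

print(f"implied from-scratch error: {implied_scratch_error_pct:.2f}%")  # 2.28%
print(f"improvement factor: {improvement_factor:.1f}x")                 # 6.3x
```

Under that assumption, the from-scratch baseline sits at roughly 2.3% error, so pre-training buys about a sixfold improvement at 450 samples.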

What carries the argument

AeroTransformer, a Transformer-based architecture designed for large-scale aerodynamic training that learns transferable representations from diverse geometries during pre-training before adapting to task-specific data.

Load-bearing premise

That the pre-training on the broad SuperWing dataset creates representations that transfer well to fine-tuning on perturbed Common Research Model wing shapes without major domain-shift issues.

What would settle it

The transfer claim would be undermined by a demonstration that fine-tuning the pre-trained model on 450 samples from a new wing geometry family yields errors comparable to or higher than training from scratch.

Figures

Figures reproduced from arXiv: 2604.18062 by Babak Gholami, Caglar Gurbuz, Mohammad Rashed, Nils Thuerey, Yunjia Yang.

Figure 1: Our approach to building and utilizing foundation-model paradigm for aerodynamic design
Figure 2: Transformer architecture. The self-attention mechanism represents the core of the architecture. It determines how much each token should focus on every other token in the input sequence and helps the model learn the global correlation between every token. It is realized with a query-answer mechanism. Suppose the input sequence has M tokens with each having N_hidden dimensions, a value vector v_i with dimensi…
Figure 3: A visual overview of the AeroTransformer architecture
Figure 4: Transonic wing and its shape parameters. The planform shape determines the chord lengths and the locations of each sectional airfoil on the x-z plane, while the spanwise distribution of y-axis locations and the rotation of the airfoils along the z-axis are further determined by two extra groups of parameters, the dihedrals y_LE and the twist angles α_tw. 4.1.2 Data formats The wing flow field prediction task in th…
Figure 5: Transferring surface mesh and quantities from simulation mesh to reference mesh
Figure 6: An illustration highlighting the substantial differences in pre-training dataset (SuperWing)
Figure 7: Surface pressure predictions from ViT and AeroTransformer
Figure 8: AeroTransformer performance when trained with larger batch sizes
Figure 9: AeroTransformer performance when model and dataset size are scaled up
Figure 10: Impact of pre-training on task-specific prediction errors
Figure 11: Impact of pre-training on the surface flow prediction performance of CRM
Figure 12: Impact of pre-training on the aerodynamic curves of CRM
Figure 13: Downstream performance with different pre-trained models
Figure 14: Model performance trained with different amounts of task-specific samples
Figure 15: Model performance fine-tuned with increasing training steps
Figure 16: Model performance improvement by spending extra time on increasing dataset size and…
Figure 17: Effect of parameter-efficient fine-tuning under different downstream data regimes
Figure 18: User interface for the interactive design tool Webwing, which remotely runs a pre-trained…
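The self-attention description attached to Figure 2 can be made concrete with a minimal NumPy sketch. It uses the standard single-head scaled dot-product formulation with query, key, and value projections; whether AeroTransformer uses exactly this form (number of heads, token layout, normalization) is not stated in the excerpt, and the sizes and weight names below are illustrative assumptions.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over M tokens."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # each has shape (M, d)
    scores = q @ k.T / np.sqrt(k.shape[-1])       # (M, M) token-to-token affinities
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
M, n_hidden = 6, 8                                # toy sizes, not the paper's
x = rng.normal(size=(M, n_hidden))
w_q, w_k, w_v = (rng.normal(size=(n_hidden, n_hidden)) / np.sqrt(n_hidden)
                 for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Row i of `weights` is how strongly token i attends to every other token, which is the "global correlation between every token" that the excerpt refers to.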
Original abstract

Accurate machine-learning models for aerodynamic prediction are essential for accelerating shape optimization, yet remain challenging to develop for complex three-dimensional configurations due to the high cost of generating training data. This work introduces a methodology for efficiently constructing accurate surrogate models for design purposes by first pre-training a large-scale model on diverse geometries and then fine-tuning it with a few more detailed task-specific samples. A Transformer-based architecture, AeroTransformer, is developed and tailored for large-scale training to learn aerodynamics. The methodology is evaluated on transonic wings, where the model is pre-trained on SuperWing, a dataset of nearly 30000 samples with broad geometric diversity, and subsequently fine-tuned to handle specific wing shapes perturbed from the Common Research Model. Results show that, with 450 task-specific samples, the proposed methodology achieves 0.36% error on surface-flow prediction, reducing 84.2% compared to training from scratch. The influence of model configurations and training strategies is also systematically studied to provide guidance on effectively training and deploying such models under limited data and computational budgets. To facilitate reuse, we release the datasets and the pre-trained models at https://github.com/tum-pbs/AeroTransformer. An interactive design tool is also built on the pre-trained model and is available online at https://webwing.pbs.cit.tum.de.
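The pre-train-then-fine-tune idea in the abstract can be illustrated with a deliberately tiny stand-in: a linear model instead of a Transformer, and synthetic data instead of CFD samples. Everything in the sketch (dimensions, sample counts, the perturbation size, the use of least squares as "pre-training") is an assumption chosen for illustration and says nothing about the paper's actual numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50                                        # toy number of input features
w_broad = rng.normal(size=d)                  # stand-in for the broad pre-training task
w_task = w_broad + 0.1 * rng.normal(size=d)   # downstream task: a small perturbation of it

def make_data(w, n):
    x = rng.normal(size=(n, d))
    return x, x @ w

def gradient_descent(w_init, x, y, steps=200):
    """Plain least-squares gradient descent with the same budget for every run."""
    lr = 1.0 / np.linalg.eigvalsh(x.T @ x / len(x)).max()  # stable step size
    w = w_init.copy()
    for _ in range(steps):
        w -= lr * x.T @ (x @ w - y) / len(x)
    return w

def relative_error(w, x, y):
    return np.linalg.norm(x @ w - y) / np.linalg.norm(y)

x_pre, y_pre = make_data(w_broad, 3000)   # large pre-training set
x_ft, y_ft = make_data(w_task, 20)        # scarce task-specific samples (fewer than d)
x_test, y_test = make_data(w_task, 1000)  # held-out task samples

w_pre = np.linalg.lstsq(x_pre, y_pre, rcond=None)[0]   # "pre-training"
w_finetuned = gradient_descent(w_pre, x_ft, y_ft)      # warm start from pre-trained weights
w_scratch = gradient_descent(np.zeros(d), x_ft, y_ft)  # cold start, identical steps

err_ft = relative_error(w_finetuned, x_test, y_test)
err_scratch = relative_error(w_scratch, x_test, y_test)
print(f"fine-tuned: {err_ft:.3f}, from scratch: {err_scratch:.3f}")
```

With fewer task samples than unknowns, the cold-start model cannot recover the directions the 20 samples never probe, while the warm-start model inherits them from pre-training. That is the mechanism the paradigm relies on, shown here only in its simplest linear form.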

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces a foundation-model approach for 3D aerodynamic surrogate modeling. A Transformer architecture (AeroTransformer) is pre-trained on the large, geometrically diverse SuperWing dataset of nearly 30,000 samples and then fine-tuned on 450 task-specific samples for perturbed Common Research Model wings under transonic conditions. The central empirical result is a 0.36% error on surface-flow prediction, representing an 84.2% reduction relative to training from scratch; systematic ablation studies on model size and training strategy are included, and the authors release the datasets, pre-trained weights, and an interactive web tool.

Significance. If the surface-flow accuracy is shown to correlate with integrated aerodynamic quantities and optimization performance, the pre-training-plus-fine-tuning paradigm could materially reduce the data-generation cost of high-fidelity 3D aerodynamic surrogates. The public release of the SuperWing dataset, pre-trained models, and reproduction code constitutes a concrete contribution that supports reproducibility and follow-on work in the field.
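To see how surface-flow accuracy relates to integrated quantities, it helps to write out how a surface pressure field turns into force coefficients. The sketch below sums pressure-only contributions over surface panels. It neglects skin friction, uses a hypothetical axis convention (lift along y, drag along x), and runs on a two-panel toy surface; none of this is taken from the paper's post-processing.

```python
import numpy as np

def pressure_force_coefficients(cp, normals, areas, s_ref, lift_dir, drag_dir):
    """Pressure-only lift and drag coefficients from per-panel surface data."""
    # Pressure acts against the outward normal, hence the minus sign.
    force = -(cp[:, None] * normals * areas[:, None]).sum(axis=0) / s_ref
    return float(force @ lift_dir), float(force @ drag_dir)

# Two-panel toy surface: upper and lower face of a unit flat plate.
normals = np.array([[0.0, 1.0, 0.0], [0.0, -1.0, 0.0]])
areas = np.array([1.0, 1.0])
cp_true = np.array([-0.5, 0.3])
cp_pred = cp_true + np.array([0.01, 0.0])  # small pointwise error on the upper face

lift_dir = np.array([0.0, 1.0, 0.0])       # assumed lift axis
drag_dir = np.array([1.0, 0.0, 0.0])       # assumed drag axis
cl_true, cd_true = pressure_force_coefficients(cp_true, normals, areas, 1.0, lift_dir, drag_dir)
cl_pred, cd_pred = pressure_force_coefficients(cp_pred, normals, areas, 1.0, lift_dir, drag_dir)
print(cl_true, cl_pred - cl_true)
```

In this toy case a pressure-coefficient bias of 0.01 on one face shifts the lift coefficient from 0.8 to 0.79. Errors that share a sign across a large area accumulate in the integral, which is why a small pointwise error does not by itself bound the error in lift or drag.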

major comments (2)
  1. [Evaluation on transonic wings / Results] The headline claim positions the model for “accelerating shape optimization” and “design purposes,” yet the reported metric is a single scalar surface-flow pointwise error (0.36%). In transonic flow, local pressure or velocity discrepancies can integrate to non-negligible errors in lift/drag coefficients or produce inconsistent adjoint gradients; the manuscript should therefore also report errors on integrated forces (Cl, Cd) and, ideally, a simple gradient-based optimization test to anchor the design-utility assertion.
  2. [Methodology and experimental setup] The 84.2% error reduction is stated relative to “training from scratch.” The exact baseline protocol (identical architecture and capacity, same optimizer schedule, same data-augmentation pipeline, and identical number of gradient steps) must be documented so that the improvement can be unambiguously attributed to pre-training rather than to differences in training budget or hyper-parameter tuning.
minor comments (2)
  1. [Abstract and §4] The precise definition of the 0.36% error (e.g., relative L2 norm on pressure, velocity, or both; normalization details; whether it is averaged over the surface or volume) should be stated explicitly in the abstract and early in the results section for immediate interpretability.
  2. [Figures] Figure captions and axis labels should explicitly indicate whether error maps are absolute or relative and which flow variable (pressure, velocity magnitude, etc.) is visualized.
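The first minor comment matters because the same prediction can score quite differently under different error definitions. The sketch below compares two common candidates, a relative L2 norm and a range-normalized mean absolute error, on a made-up field. Which definition the paper uses is not stated in this summary, so both functions are offered only as examples.

```python
import numpy as np

def relative_l2_error(pred, target):
    """||pred - target||_2 / ||target||_2 over all points."""
    return np.linalg.norm(pred - target) / np.linalg.norm(target)

def range_normalized_mae(pred, target):
    """Mean absolute error divided by the range of the reference field."""
    return np.abs(pred - target).mean() / (target.max() - target.min())

target = np.array([1.0, -0.5, 0.25, -1.25])           # made-up reference values
pred = target + np.array([0.01, -0.01, 0.01, -0.01])  # uniform 0.01 absolute error

print(f"relative L2: {100 * relative_l2_error(pred, target):.2f}%")
print(f"range-normalized MAE: {100 * range_normalized_mae(pred, target):.2f}%")
```

For this field the two metrics give about 1.18% and 0.44% respectively, a factor of more than two apart for identical predictions.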

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We have revised the paper to include errors on integrated force coefficients and to explicitly document the baseline training protocol, which we believe addresses the concerns while preserving the core contribution of the pre-training paradigm.

Point-by-point responses
  1. Referee: The headline claim positions the model for “accelerating shape optimization” and “design purposes,” yet the reported metric is a single scalar surface-flow pointwise error (0.36%). In transonic flow, local pressure or velocity discrepancies can integrate to non-negligible errors in lift/drag coefficients or produce inconsistent adjoint gradients; the manuscript should therefore also report errors on integrated forces (Cl, Cd) and, ideally, a simple gradient-based optimization test to anchor the design-utility assertion.

    Authors: We agree that integrated quantities are essential to substantiate claims about design utility. In the revised manuscript we have added a dedicated subsection reporting mean absolute errors on Cl and Cd for both the pre-trained/fine-tuned model and the from-scratch baseline on the same 450-sample test set; the relative improvement remains consistent with the surface-flow result (approximately 80% reduction). We have also included a simple gradient-based optimization experiment in which the surrogate is used to minimize a weighted combination of Cl and Cd subject to geometric constraints; the pre-trained model yields faster convergence and a lower final objective value than the scratch-trained counterpart. These additions are now presented in Section 4.4. revision: yes

  2. Referee: The 84.2% error reduction is stated relative to “training from scratch.” The exact baseline protocol (identical architecture and capacity, same optimizer schedule, same data-augmentation pipeline, and identical number of gradient steps) must be documented so that the improvement can be unambiguously attributed to pre-training rather than to differences in training budget or hyper-parameter tuning.

    Authors: We appreciate the request for explicit documentation. The revised experimental-setup section now states that the from-scratch baseline employs the identical AeroTransformer architecture and parameter count, the same AdamW optimizer with identical learning-rate schedule and warm-up, the same data-augmentation pipeline, and exactly the same total number of gradient steps as the fine-tuning stage. A new table (Table 2) summarizes the hyper-parameters side-by-side for the two settings, confirming that the only difference is the initialization from the pre-trained weights. revision: yes
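The controlled comparison described in the second response amounts to two training configurations that differ in exactly one field. The sketch below expresses that check with a small dataclass. The field names, the step count, the schedule and augmentation labels, and the checkpoint path are all hypothetical placeholders, not values from the paper or its Table 2.

```python
from dataclasses import dataclass, asdict, replace
from typing import Optional

@dataclass(frozen=True)
class TrainConfig:
    architecture: str
    optimizer: str
    lr_schedule: str
    augmentation: str
    gradient_steps: int
    init_checkpoint: Optional[str]  # None means random initialization

fine_tune = TrainConfig(
    architecture="AeroTransformer",
    optimizer="AdamW",
    lr_schedule="warmup+decay",         # placeholder label
    augmentation="same-pipeline",       # placeholder label
    gradient_steps=10_000,              # placeholder value
    init_checkpoint="pretrained.ckpt",  # hypothetical path
)
# The baseline copies every setting and only drops the pre-trained weights.
from_scratch = replace(fine_tune, init_checkpoint=None)

def differing_fields(a, b):
    """Names of the fields whose values differ between two configs."""
    da, db = asdict(a), asdict(b)
    return sorted(k for k in da if da[k] != db[k])

print(differing_fields(fine_tune, from_scratch))  # ['init_checkpoint']
```

Deriving the baseline from the fine-tuning config, rather than writing it out separately, makes it hard for an unintended difference in budget or schedule to slip in.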

Circularity Check

0 steps flagged

No circularity: empirical pre-train/fine-tune results on held-out data

Full rationale

The paper reports an empirical machine-learning pipeline: pre-training AeroTransformer on the SuperWing dataset of ~30k samples, followed by fine-tuning on 450 task-specific perturbed CRM wing samples, with surface-flow error measured at 0.36% (84.2% reduction vs. scratch training). No derivation chain, equations, or first-principles predictions are claimed; the headline metric is a standard held-out evaluation on separate test configurations. Dataset release and external reproducibility remove any self-referential dependency. No self-citations, ansatzes, or fitted parameters are renamed as predictions. The result is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 1 invented entities

The central claim rests on the transferability of representations learned from a large diverse geometry dataset to narrow task-specific fine-tuning; this is an empirical domain assumption rather than a derived result. Model architecture choices and training hyperparameters constitute free parameters whose values are selected to produce the reported numbers.

free parameters (2)
  • fine-tuning sample count
    450 samples chosen for the reported task-specific adaptation; the number is selected to demonstrate low-data performance.
  • model configuration and training hyperparameters
    Various configurations are studied; their specific values are fitted or chosen to achieve the 0.36% error on the target task.
axioms (1)
  • domain assumption: Pre-training on diverse 3D wing geometries produces transferable representations for fine-tuning on perturbed shapes from the same family
    Invoked when claiming that 450 samples suffice after pre-training on SuperWing.
invented entities (1)
  • AeroTransformer (no independent evidence)
    purpose: Transformer architecture tailored for large-scale aerodynamic learning
    New model variant introduced and trained in the paper; no independent evidence outside the reported experiments.

pith-pipeline@v0.9.0 · 5554 in / 1550 out tokens · 53253 ms · 2026-05-10T05:58:34.193937+00:00 · methodology

