Transolver-3: Scaling Up Transformer Solvers to Industrial-Scale Geometries

Haixu Wu; Hang Zhou; Haonan Shangguan; Huikun Weng; Jianmin Wang; Mingsheng Long; Yuezhou Ma

arxiv: 2602.04940 · v2 · pith:DZIJMFJTnew · submitted 2026-02-04 · 💻 cs.LG

Hang Zhou, Haixu Wu, Haonan Shangguan, Yuezhou Ma, Huikun Weng, Jianmin Wang, Mingsheng Long

classification 💻 cs.LG

keywords transolver-3, meshes, industrial-scale, solvers, cells, geometries, high-fidelity, high-resolution

Deep learning has emerged as a transformative tool for the neural surrogate modeling of partial differential equations (PDEs), an approach known as neural PDE solvers. However, scaling these solvers to industrial-scale geometries with over $10^8$ cells remains a fundamental challenge due to the prohibitive memory complexity of processing high-resolution meshes. We present Transolver-3, a new member of the Transolver family and a highly scalable framework designed for high-fidelity physics simulations. To bridge the gap between limited GPU capacity and the resolution requirements of complex engineering tasks, we introduce two key architectural optimizations: faster slice and deslice operations that exploit the associative property of matrix multiplication, and geometry slice tiling, which partitions the computation of physical states. Combined with an amortized training strategy that learns on random subsets of the original high-resolution meshes and a physical state caching technique used during inference, Transolver-3 enables high-fidelity field prediction on industrial-scale meshes. Extensive experiments demonstrate that Transolver-3 can handle meshes with over 160 million cells, achieving impressive performance across three challenging simulation benchmarks, including aircraft and automotive design tasks. Code is available at https://github.com/thuml/Transolver-3.
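The abstract names two ideas that are easy to illustrate in isolation: reordering a chain of matrix products, and accumulating a reduction over tiles of mesh points. The sketch below is a generic illustration only, not the authors' implementation. It assumes a slice step of the form `w^T x` followed by a linear projection `p`, where `w` holds per-point slice weights; the function names, shapes, and the omission of weight normalization are all assumptions made for the example. Plain Python lists are used to keep it dependency-free.

```python
import random

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def slice_naive(w, x, p):
    """Project all N points first, then aggregate: w^T (x p)."""
    return matmul(transpose(w), matmul(x, p))

def slice_reordered(w, x, p):
    """Aggregate N points into M slices first, then project: (w^T x) p.

    By associativity the result is the same, but the C x D projection
    is applied to M rows instead of N rows (M is much smaller than N).
    """
    return matmul(matmul(transpose(w), x), p)

def slice_tiled(w, x, p, tile):
    """Same as slice_reordered, but w^T x is accumulated tile by tile,
    so only `tile` rows of w and x are touched per step."""
    m, c = len(w[0]), len(x[0])
    acc = [[0.0] * c for _ in range(m)]
    for start in range(0, len(x), tile):
        part = matmul(transpose(w[start:start + tile]), x[start:start + tile])
        acc = [[s + t for s, t in zip(ra, rb)] for ra, rb in zip(acc, part)]
    return matmul(acc, p)

def random_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix of values in [0, 1)."""
    return [[rng.random() for _ in range(cols)] for _ in range(rows)]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))

# Toy sizes (assumed): N mesh points, M slices, C input and D output channels.
rng = random.Random(0)
N, M, C, D = 12, 3, 4, 5
w = random_matrix(N, M, rng)   # slice weights, normalization omitted
x = random_matrix(N, C, rng)   # per-point features
p = random_matrix(C, D, rng)   # linear projection

reference = slice_naive(w, x, p)
print(max_abs_diff(reference, slice_reordered(w, x, p)))
print(max_abs_diff(reference, slice_tiled(w, x, p, tile=5)))
```

Both printed differences should be at floating-point rounding level. In this toy setting the reordering moves the projection from N rows to M rows, and the tiled variant shows that the slice aggregation does not require all N points to be processed at once. Whether Transolver-3 applies the ideas in exactly this form is not stated in the abstract.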

Forward citations

Cited by 8 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Neural Statistical Functions
cs.LG 2026-05 unverdicted novelty 7.0

Neural statistical functions use prefix statistics to unify and directly predict statistical quantities over continuous ranges from pre-trained single-sample models without repeated sampling.

M$^3$: Reframing Training Measures for Discretized Physical Simulations
cs.AI 2026-05 unverdicted novelty 7.0

M³ partitions space by physical variation using multi-scale Morton ordering to balance training measures, yielding up to 4.7× lower error on industrial volumetric datasets and outperforming higher-resolution training ...

Learning Neural Operator Surrogates for the Black Hole Accretion Code
astro-ph.HE 2026-04 unverdicted novelty 7.0

Physics-informed Fourier neural operators recover plasmoid formation in sparse SRRMHD vortex data where data-only models fail, and transformer operators approximate AMR jet evolution, marking first reported uses in th...

Attention mechanism for scalable mesh-based neural surrogates of free-surface fluids
cs.CE 2026-06 unverdicted novelty 6.0

Self-attention mechanisms are used to build mesh-preserving neural surrogates that approximate PFEM dynamics for free-surface flows, delivering accurate transient predictions and improved scalability on 2D and 3D benchmarks.

Courant: a State-Adaptive Perceiver-Based Neural Surrogate with Local Support and Interpretable Field Decomposition
cs.LG 2026-05 unverdicted novelty 6.0

Courant is a state-adaptive Perceiver encoder-processor-decoder surrogate trained with L2 loss that yields interpretable, multiscale, locally supported latent features acting as time-evolving spatial basis functions.

ShardTensor: Domain Parallelism for Scientific Machine Learning
cs.DC 2026-05 unverdicted novelty 6.0

ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.

Adapting Automotive Aerodynamics Surrogates to New Vehicle Families via Transfer Learning
cs.CE 2026-05 unverdicted novelty 5.0

LoRA adapters enable a 61.47M-parameter aerodynamics Transformer pretrained on four vehicle families to adapt to a held-out fifth family with 20 samples, reaching R²=0.85 and outperforming full fine-tuning and from-sc...

RETO: A Rotary-Enhanced Transformer Operator for High-Fidelity Prediction of Automotive Aerodynamics
eess.IV 2026-04 unverdicted novelty 4.0

RETO achieves relative L2 errors of 0.063 on ShapeNet and 0.089/0.097 on DrivAerML surface pressure/velocity, outperforming Transolver and other baselines.