arxiv: 2606.12994 · v1 · pith:LCSRNELAnew · submitted 2026-06-11 · 💻 cs.LG · cs.CE

DeepJEB++: Foundation Model-Driven Large-Scale 3D Engineering Dataset via 2D Latent Space Augmentation

Pith reviewed 2026-06-27 07:45 UTC · model grok-4.3

classification 💻 cs.LG cs.CE
keywords 3D dataset augmentation · jet engine brackets · latent diffusion models · foundation models · finite element labeling · engineering design data · simulation labels · data scarcity

The pith

A pipeline augments fewer than 400 seed jet-engine brackets into 15,360 simulation-labeled 3D meshes by operating in 2D latent space before lifting to 3D.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to overcome the scarcity of large 3D engineering datasets that include both geometry and physics labels. It does so by first expanding designs inside a data-rich 2D latent space with a fine-tuned diffusion model, then using a vision-language filter to keep only manufacturable outputs, lifting the survivors to 3D meshes with a domain-adapted generative model, and finally running an automated pipeline that detects load and bolt interfaces to attach finite-element labels. A sympathetic reader would care because data-driven design methods cannot scale without thousands of geometry-plus-performance pairs, and manual creation of such pairs is prohibitively expensive. The work claims this staged process achieves a 40-fold increase while staying within single-GPU budgets per stage and preserving the geometric and label properties needed for downstream use.

Core claim

Starting from fewer than 400 seed designs, the three-stage process produces 15,360 simulation-labeled 3D brackets. Stage 1 fine-tunes a pretrained 2D latent diffusion model on multi-view renders and synthesizes new views by latent interpolation, then applies a vision-language-model filter to retain only manufacturable results. Stage 2 lifts the validated 2D images to 3D meshes using a domain-adapted generative foundation model. Stage 3 automatically identifies load and bolt interfaces on each mesh and computes finite-element labels for mass, stress, and displacement. Quality is assessed on manufacturability, fidelity to SimJEB ground truth, and distributional consistency.
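
The latent-interpolation step in Stage 1 can be illustrated with a small sketch. The summary does not say which interpolation scheme the paper uses; spherical linear interpolation (slerp) is assumed here because it is a common choice for diffusion latents. The function name `slerp` and the toy 2D vectors are illustrative only and are not taken from the paper.

```python
import math

def slerp(z0, z1, t):
    """Spherical linear interpolation between two latent vectors (lists of floats)."""
    norm0 = math.sqrt(sum(a * a for a in z0))
    norm1 = math.sqrt(sum(b * b for b in z1))
    # Cosine of the angle between the two vectors, clamped against rounding error.
    dot = sum(a * b for a, b in zip(z0, z1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if sin_theta < 1e-6:
        # Nearly parallel or antiparallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(z0, z1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]

# Toy stand-ins for the encoded latents of two seed renders (assumed, not real data).
z_a = [1.0, 0.0]
z_b = [0.0, 1.0]
z_mid = slerp(z_a, z_b, 0.5)  # halfway point on the arc between z_a and z_b
```

In a real pipeline the interpolated latent would be passed to the fine-tuned diffusion model's denoising and decoding steps; those steps are omitted here because the summary gives no detail on them.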

What carries the argument

2D latent-space augmentation with a fine-tuned diffusion model followed by vision-language filtering and 3D lifting with a domain-adapted generative model.

If this is right

  • A 40x increase in labeled 3D engineering data becomes feasible from a few hundred seeds using only one GPU per stage.
  • Automated interface detection removes the need for manual boundary-condition assignment before simulation.
  • The resulting dataset supports reproducible training of AI models for structural design tasks.
  • Quality checks along manufacturability, label fidelity, and distributional axes provide a repeatable evaluation template for other augmentation pipelines.
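
The 40x figure in the first bullet can be sanity-checked with simple arithmetic. The sketch below assumes the reported factor is exact rather than rounded, which the abstract does not state; the variable names are illustrative.

```python
final_count = 15_360      # simulation-labeled meshes reported in the abstract
expansion_factor = 40     # reported expansion, assumed exact for this check

implied_seeds = final_count / expansion_factor
print(implied_seeds)      # 384.0, consistent with "fewer than 400" seeds
```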

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same 2D-to-3D lifting pattern could be tested on other load-bearing components such as brackets in different industries.
  • If the filter and lifting steps generalize, the approach reduces the data bottleneck that currently limits physics-informed machine learning in mechanical design.
  • Public release of the 15,360-mesh set creates a concrete benchmark that later papers can use to measure further augmentation gains.

Load-bearing premise

The vision-language filter and the 3D generative model together keep manufacturability and simulation-label accuracy high enough that no systematic geometric or physics errors invalidate later engineering use.

What would settle it

A random sample of 200 generated meshes is run through an independent finite-element solver; if more than 10 percent produce stress or displacement values that deviate by more than 15 percent from the SimJEB reference under identical boundary conditions, the claim of label fidelity fails.
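
The pass/fail rule above can be written down as a short check. This is a minimal sketch: the function name, the toy values, and the assumption that reference values are nonzero are all illustrative, and only the 15 percent and 10 percent thresholds come from the criterion stated above.

```python
def label_fidelity_fails(generated, reference, rel_tol=0.15, max_bad_fraction=0.10):
    """Return (fails, bad_fraction) for paired label values such as peak stress."""
    if len(generated) != len(reference) or not generated:
        raise ValueError("need two non-empty sequences of equal length")
    bad = 0
    for g, r in zip(generated, reference):
        # Relative deviation from the reference; reference values are assumed nonzero.
        if abs(g - r) / abs(r) > rel_tol:
            bad += 1
    bad_fraction = bad / len(generated)
    # The claim fails when more than max_bad_fraction of samples exceed rel_tol.
    return bad_fraction > max_bad_fraction, bad_fraction

# Toy data: 10 samples instead of 200, with two deviating by more than 15 percent.
ref = [100.0] * 10
gen = [100.0, 104.0, 90.0, 120.0, 101.0, 99.0, 110.0, 130.0, 100.0, 95.0]
fails, frac = label_fidelity_fails(gen, ref)
```

With the toy data, two of ten samples exceed the tolerance, so the bad fraction is 0.2 and the check reports a failure.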

Original abstract

Data-driven engineering design is constrained by the lack of large-scale 3D datasets that pair geometry with physics-based performance labels. In particular, existing 3D data augmentation techniques have limitations in preserving subtle and diverse geometric variations, and it remains difficult to automate the subsequent simulation-labeling process, where boundary conditions vary depending on the generated geometry. We present DeepJEB++, a foundation-model-driven data-augmentation framework that expands a small seed set of jet engine brackets into a large, simulation-labeled 3D dataset under constrained resources. Our key idea is to augment in the data-rich 2D latent space, then transfer to 3D. In Stage 1, we fine-tune a pretrained 2D latent diffusion model on multi-view renders and synthesize novel views by latent interpolation, retaining manufacturable designs through a vision-language-model (VLM) quality filter. In Stage 2, the validated images are lifted to 3D meshes by a domain-adapted generative foundation model. In Stage 3, an automated pipeline recognizes the load and bolt interfaces on each mesh and assigns finite-element labels -- mass, stress, and displacement -- without manual intervention. We assess augmentation quality along three intrinsic axes: manufacturability, label fidelity against the SimJEB ground truth, and distributional consistency. Starting from fewer than 400 seed designs, DeepJEB++ yields 15,360 simulation-labeled 3D brackets -- a 40x expansion -- using a single GPU per stage. The dataset will be made publicly available to support reproducible engineering-AI research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces DeepJEB++, a foundation-model-driven framework that augments fewer than 400 seed 3D jet engine bracket designs into 15,360 simulation-labeled meshes (40x expansion) via 2D latent-space augmentation. Stage 1 fine-tunes a pretrained 2D latent diffusion model on multi-view renders, synthesizes novel views by latent interpolation, and applies a VLM quality filter for manufacturability. Stage 2 lifts validated images to 3D meshes using a domain-adapted generative foundation model. Stage 3 automates recognition of load and bolt interfaces and assignment of finite-element labels (mass, stress, displacement). Quality is assessed along manufacturability, label fidelity to SimJEB ground truth, and distributional consistency; the dataset is to be released publicly, and each stage uses a single GPU.

Significance. If the quality assessments hold with supporting quantitative evidence, the work would deliver a valuable public resource for data-driven engineering design, directly addressing the scarcity of large-scale 3D geometry-physics paired datasets. The single-GPU pipeline and reliance on external pretrained models demonstrate practical resource efficiency. Public release of the dataset would support reproducible research in engineering AI.

major comments (2)
  1. [Abstract] Abstract: The claim that quality was assessed along manufacturability, label fidelity, and distributional consistency supplies no quantitative metrics, error bars, failure rates, confusion matrices for interface detection, or ablation results on out-of-distribution shapes. This is load-bearing for the central 40x expansion claim, because even modest rates of invalid boundary-condition assignments would render a non-negligible fraction of the 15,360 samples physically unusable for downstream tasks.
  2. [Abstract] Abstract (Stage 3): The automated interface-recognition and labeling pipeline is described as operating without manual intervention and preserving label fidelity against SimJEB, yet no validation statistics or comparison to manual labeling are reported to bound the error rate on generated geometries.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for highlighting the need for explicit quantitative validation to support the dataset's claimed quality and usability. We address each major comment below and commit to revisions that strengthen the evidence without altering the core claims.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The claim that quality was assessed along manufacturability, label fidelity, and distributional consistency supplies no quantitative metrics, error bars, failure rates, confusion matrices for interface detection, or ablation results on out-of-distribution shapes. This is load-bearing for the central 40x expansion claim, because even modest rates of invalid boundary-condition assignments would render a non-negligible fraction of the 15,360 samples physically unusable for downstream tasks.

    Authors: We agree that the abstract's quality-assessment claim requires supporting quantitative evidence to substantiate the 40x expansion. The manuscript evaluates manufacturability via VLM filtering rates, label fidelity via comparison to SimJEB ground truth, and distributional consistency via statistical tests, but these are not quantified with the specific metrics, error bars, failure rates, confusion matrices, or OOD ablations noted. We will revise the abstract to report key numbers (e.g., pass rates, correlation coefficients, MMD scores) and expand the results section with error bars, interface-detection confusion matrices, failure rates, and OOD ablations. This directly addresses the concern about physically unusable samples.
    Revision: yes

  2. Referee: [Abstract] Abstract (Stage 3): The automated interface-recognition and labeling pipeline is described as operating without manual intervention and preserving label fidelity against SimJEB, yet no validation statistics or comparison to manual labeling are reported to bound the error rate on generated geometries.

    Authors: We concur that bounding the error rate on generated geometries is essential for the automated Stage 3 pipeline. The current text reports fidelity preservation against SimJEB for seed designs and states the pipeline runs without manual intervention, but does not include validation statistics or manual-labeling comparisons for the augmented set. We will add these in revision: agreement rates and error bounds on a manually labeled subset of generated samples, plus any relevant confusion matrices, updating both the abstract and Stage 3 description.
    Revision: yes
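
The agreement rates and confusion matrices promised in these responses could be computed along the following lines. This is a sketch under stated assumptions: the label set ("bolt", "load", "none"), the per-region labeling scheme, and the function name are hypothetical, since the summary does not describe how the paper represents interface detections.

```python
from collections import Counter

def interface_confusion(manual, predicted):
    """Count (manual_label, predicted_label) pairs and the overall agreement rate."""
    if len(manual) != len(predicted) or not manual:
        raise ValueError("need two non-empty sequences of equal length")
    counts = Counter(zip(manual, predicted))
    # Agreement rate: share of regions where the automated label matches the manual one.
    agreement = sum(n for (m, p), n in counts.items() if m == p) / len(manual)
    return counts, agreement

# Toy labels for six candidate regions on generated meshes (illustrative only).
manual    = ["bolt", "bolt", "load", "none", "bolt", "load"]
predicted = ["bolt", "none", "load", "none", "bolt", "bolt"]
counts, agreement = interface_confusion(manual, predicted)
```

Each key of `counts` is one cell of the confusion matrix, so off-diagonal entries such as `("load", "bolt")` show which interface types the automated step confuses.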

Circularity Check

0 steps flagged

No circularity: expansion factor is direct output count from external-model pipeline

Full rationale

The paper presents a three-stage pipeline (2D latent diffusion fine-tuning + VLM filter, 3D lift via domain-adapted foundation model, automated interface recognition and FE labeling) that starts from <400 seeds and produces 15,360 labeled meshes. The reported 40× expansion is simply the enumerated size of the final dataset; no equations, fitted parameters, or self-citations are invoked to derive or predict this number. Label fidelity is assessed against the external SimJEB ground truth rather than being defined in terms of the pipeline's own outputs. All core components rely on pretrained external models whose training is independent of the present dataset. No self-definitional, fitted-input, or self-citation-load-bearing reductions are present.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only review; no free parameters or invented entities are named. The two axioms listed below are implicit rather than stated by the authors: the method depends on the reliability of off-the-shelf pretrained models and the VLM filter.

axioms (2)
  • domain assumption: Pretrained 2D latent diffusion models and vision-language models can be fine-tuned and applied to retain manufacturable jet-engine-bracket geometries.
    Invoked in Stage 1 to generate and filter novel views.
  • domain assumption: A domain-adapted generative foundation model can lift validated 2D images to 3D meshes while preserving the geometric features needed for finite-element analysis.
    Invoked in Stage 2.

pith-pipeline@v0.9.1-grok · 5845 in / 1465 out tokens · 22012 ms · 2026-06-27T07:45:39.592468+00:00 · methodology

