SE3D: Building a radiative transfer emulator to fit panchromatic resolved galaxy observations with 3D models of dust and stars

Cheng Li; Junkai Zhang; Steven Ramnichal; Stijn Wuyts

arxiv: 2511.19623 · v2 · pith:KVYG6DU5new · submitted 2025-11-24 · 🌌 astro-ph.GA · astro-ph.IM

Steven Ramnichal, Junkai Zhang, Stijn Wuyts, Cheng Li

Pith reviewed 2026-05-21 19:09 UTC · model grok-4.3

keywords: radiative transfer emulator · panchromatic galaxy observations · 3D dust models · Bayesian neural network · spectral energy distribution · wavelength-dependent sizes · galaxy structural parameters · dust attenuation

The pith

A Bayesian neural network emulator trained on 3D radiative transfer toy models reproduces galaxy spectral distributions and structural parameters at 0.05 dex accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SE3D, a framework that fits both the total light output of galaxies across many wavelengths and how their apparent size, light profile and axis ratio change with wavelength. It builds this by training a machine learning emulator on thousands of simplified 3D galaxy models that vary in stellar and dust arrangements, include radial population gradients, and are processed through full radiative transfer calculations under different viewing angles. The emulator uses a Bayesian neural network and matches the detailed outputs of those calculations to within about 0.05 dex error over a wide range of input values and across the UVJ colours of real galaxies. This computational shortcut makes it feasible to fit actual panchromatic resolved observations without running expensive radiative transfer for every trial model. A sensitivity analysis confirms the network has learned the physical links between galaxy properties and the measured fluxes, colours and sizes.
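
The computational shortcut described above can be illustrated with a minimal sketch. It assumes a hypothetical stand-in function `toy_emulator` (not the paper's network) that maps a single dust parameter to log fluxes, and it uses a plain grid search with a chi-square statistic. The parameter name, wavelength grid and 0.05 dex uncertainty are illustrative assumptions, not values taken from the paper's fitting setup.

```python
def toy_emulator(dust, wavelengths):
    """Hypothetical stand-in for a trained emulator: log10 flux per wavelength."""
    # Purely illustrative: more dust dims short wavelengths more strongly.
    return [-dust / wl for wl in wavelengths]


def chi_square(model, observed, sigma):
    """Sum of squared, uncertainty-scaled differences between model and data."""
    return sum(((m - o) / sigma) ** 2 for m, o in zip(model, observed))


wavelengths = [0.3, 0.5, 1.0, 2.0]  # microns, illustrative only
observed = toy_emulator(0.4, wavelengths)  # mock data generated with dust = 0.4
sigma = 0.05  # assumed uncertainty in dex

# Each trial model costs one cheap emulator call instead of a full
# radiative transfer run, so a dense grid (or a sampler) becomes affordable.
grid = [i / 100 for i in range(101)]
best = min(grid, key=lambda d: chi_square(toy_emulator(d, wavelengths), observed, sigma))
print(best)
```

In practice a posterior sampler would likely replace the grid, but the structure is the same: the emulator sits inside the likelihood evaluation.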

Core claim

The central claim is that a Bayesian neural network emulator, trained on a library of toy model galaxies varying in stellar and dust geometries with radial stellar population gradients and processed with 3D dust radiative transfer under a range of viewing angles, reproduces the spectral energy distributions and the wavelength-dependent global structural parameters (size, light profile, projected axis ratio) at an accuracy of ~0.05 dex or less across the dynamic range of input parameters and the rest-frame UVJ colour space spanned by observed galaxies.
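
For readers less used to the unit, a residual in dex is the base-10 logarithm of the ratio between prediction and truth. The short sketch below converts the quoted ~0.05 dex into a fractional error. The function names are illustrative and not taken from the paper.

```python
import math


def dex_residual(predicted, truth):
    """Residual in dex: log10 of the ratio of predicted to true value."""
    return math.log10(predicted / truth)


def dex_to_fraction(dex):
    """Fractional over-prediction corresponding to a positive dex offset."""
    return 10 ** dex - 1


frac = dex_to_fraction(0.05)
print(f"0.05 dex corresponds to about {100 * frac:.1f}% in linear units")
```

An offset of 0.05 dex therefore corresponds to roughly a 12% difference in the linear quantity (flux, size, and so on).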

What carries the argument

The Bayesian neural network emulator that maps galaxy physical properties (stellar and dust geometries, population gradients) to direct observables (fluxes, colours, sizes, size ratios across wavebands).
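
As a hedged illustration of the observables named above, the sketch below derives colours and a size ratio from band fluxes and sizes. It assumes flux densities in a common unit system (so a colour is a magnitude difference) and expresses the size ratio in dex. The function names and inputs are hypothetical and not part of SE3D.

```python
import math


def colour(flux_blue, flux_red):
    """Colour in magnitudes between two bands; positive means redder."""
    return -2.5 * math.log10(flux_blue / flux_red)


def uvj_colours(flux_u, flux_v, flux_j):
    """Rest-frame (U-V, V-J) colours from three band fluxes."""
    return colour(flux_u, flux_v), colour(flux_v, flux_j)


def size_ratio_dex(size_a, size_b):
    """Logarithmic ratio of sizes measured in two wavebands."""
    return math.log10(size_a / size_b)


u_v, v_j = uvj_colours(1.0, 10.0, 10.0)
print(u_v, v_j, size_ratio_dex(2.0, 1.0))
```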

If this is right

The framework supports simultaneous self-consistent fitting of spectral energy distributions and wavelength-dependent structural parameters to panchromatic resolved observations.
The emulator enables efficient exploration of physical conditions that produce different total-to-selective attenuation ratios Rv, especially those tied to projected dust surface mass density.
Sensitivity analysis demonstrates that the network has learned the intricate mappings between galaxy physical properties and observables such as fluxes, colours and sizes.
The method can be used to analyse how stellar and dust arrangements affect observed light profiles and axis ratios at different wavelengths.
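
The total-to-selective attenuation ratio mentioned in the list above follows the standard definition R_V = A_V / (A_B - A_V), where A is the attenuation in magnitudes in a given band. The sketch below computes it from intrinsic and attenuated fluxes. The input numbers are made up for illustration and the function names are not from the paper.

```python
import math


def attenuation_mag(flux_attenuated, flux_intrinsic):
    """Attenuation in magnitudes: how much dust dims the band."""
    return -2.5 * math.log10(flux_attenuated / flux_intrinsic)


def r_v(a_b, a_v):
    """Total-to-selective attenuation ratio, A_V / E(B-V)."""
    return a_v / (a_b - a_v)


# Illustrative fluxes chosen so that A_V = 1.0 mag and A_B = 1.25 mag.
a_v = attenuation_mag(10 ** -0.4, 1.0)
a_b = attenuation_mag(10 ** -0.5, 1.0)
print(r_v(a_b, a_v))
```

A flatter (greyer) attenuation curve gives a smaller A_B - A_V for the same A_V and hence a larger R_V.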

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the current toy model library already captures the dominant variations, the emulator could be applied to large survey samples to derive statistical constraints on dust geometries across galaxy populations.
Adding more varied features such as clumps or spiral structure to the training library would test whether the accuracy holds for more complex real systems.
Comparing dust properties derived from emulator fits against those obtained from independent infrared or submillimetre data would provide a direct external validation.
The wavelength-dependent size information recovered by the emulator could help interpret observations of high-redshift galaxies where dust effects are strong but spatial resolution is limited.

Load-bearing premise

The library of toy models with their chosen stellar and dust geometries and radial gradients is representative enough of real galaxies for the emulator to produce reliable fits to actual panchromatic resolved observations.

What would settle it

Fitting real panchromatic resolved galaxy observations with the emulator and finding inferred dust masses or geometries that systematically disagree with independent estimates from far-infrared emission or spectroscopy would show the toy models are not representative.

Figures

Figures reproduced from arXiv: 2511.19623 by Cheng Li, Junkai Zhang, Steven Ramnichal, Stijn Wuyts.

**Figure 1.** A visual schematic displaying the workflow of SE3D modelling. Top: radiative transfer is applied to toy model galaxies to produce 3D mock data cubes, which are distilled into 4 spectral distributions (flux, size, Sérsic index and axis ratio as a function of wavelength). Middle: a Bayesian Neural Network (BNN) emulator is trained to improve computational efficiency. Bottom: the flexible and efficient emulat…

**Figure 2.** Effect of varying a small subset of model parameters on the predicted SEDs and SRDs. Each panel shows the result of modifying one parameter while keeping others fixed to the reference model (shown in black and described in Appendix C). From left to right, we vary the dust content, the scale of the galaxy, the radial gradient of the time of peak star formation, and the inclination. Note: because of how we d…

**Figure 3.** Validation of the emulator. Residuals, in dex, between 75,000 (unseen) SKIRT ground truth spectral distributions and the corresponding emulator predictions are displayed. The blue shaded regions showcase the central 68th percentile with the running median shown as the blue solid line. Integrating over wavelengths, we further mark the 16th (red dashed), 50th (red solid) and 84th (red dashed) percentiles for…

**Figure 4.** Emulator performance (NMAD) as evaluated on our testing set of 75,000 toy models, for varying training library sizes. The coloured polygons display the range of computed NMADs for a given library size, where for sizes smaller than the full library size, multiple iterations are computed by sampling disjoint subsets of the corresponding size from the remaining data. The red dashed line indicates a 0.05 dex e…
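
The NMAD quoted in the caption above is commonly defined as 1.4826 times the median absolute deviation from the median, which matches the standard deviation for Gaussian residuals while staying robust to outliers. The sketch below assumes that common definition; the paper's exact normalisation is not given here, and the residual values are invented.

```python
import statistics


def nmad(residuals):
    """Normalised median absolute deviation (assumed standard definition)."""
    centre = statistics.median(residuals)
    abs_dev = [abs(r - centre) for r in residuals]
    return 1.4826 * statistics.median(abs_dev)


# Made-up residuals in dex, with one strong outlier.
residuals = [-0.02, -0.01, 0.0, 0.01, 0.5]
print(nmad(residuals), statistics.stdev(residuals))
```

The single outlier inflates the standard deviation far more than the NMAD, which is why a robust scatter estimate suits emulator validation.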

**Figure 6.** Spearman rank correlation coefficients between the input parameters in our library and the SD residuals (integrated across wavelength) for the respective toy model galaxies. The performance of the emulator remains approximately constant across the dynamic range sampled for each parameter. Residuals in predicted 𝑞 formally increase with 𝜃, but in an absolute sense are negligible (< 0.01 dex).
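
The Spearman rank correlation used in the caption above is the Pearson correlation of the ranks of two variables. The sketch below is a minimal standard-library implementation with average ranks for ties. It illustrates the statistic itself and is not the authors' code; the sample data are invented.

```python
import math


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


parameter = [1.0, 2.0, 3.0, 4.0, 5.0]
residual = [1.0, 4.0, 9.0, 16.0, 25.0]  # monotonic but non-linear
print(spearman(parameter, residual))
```

A coefficient near zero between an input parameter and the residuals indicates that emulator errors do not trend with that parameter.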

Original abstract

We present a framework for analysing panchromatic and spatially resolved galaxy observations, dubbed SE3D. SE3D simultaneously and self-consistently models a galaxy's spectral energy distribution and its spectral distributions of global structural parameters: the wavelength-dependent galaxy size, light profile and projected axis ratio. To this end, it employs a machine learning emulator trained on a large library of toy model galaxies processed with 3D dust radiative transfer and mock-observed under a range of viewing angles. The toy models vary in their stellar and dust geometries, and include radial stellar population gradients. The computationally efficient machine learning emulator uses a Bayesian neural network architecture, and reproduces the spectral distributions at an accuracy of ~ 0.05 dex or less across the dynamic range of input parameters, and across the rest-frame UVJ colour space spanned by observed galaxies. We carry out a sensitivity analysis demonstrating that the emulator has successfully learned the intricate mappings between galaxy physical properties and direct observables (fluxes, colours, sizes, size ratios between different wavebands, ...). We further discuss the physical conditions giving rise to a range of total-to-selective attenuation ratios, Rv, with among them most prominently the projected dust surface mass density.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents SE3D, a framework for simultaneously modeling the panchromatic SED and wavelength-dependent structural parameters (size, light profile, axis ratio) of resolved galaxies. It trains a Bayesian neural network emulator on a large library of toy-model galaxies that incorporate varied stellar/dust geometries and radial population gradients, processed via 3D dust radiative transfer and mock observations at multiple viewing angles. The emulator is reported to achieve ~0.05 dex accuracy on spectral distributions across the input parameter range and UVJ colour space, with a sensitivity analysis showing it has learned mappings from physical parameters to observables (fluxes, colours, sizes, size ratios). The work discusses physical conditions producing a range of total-to-selective attenuation ratios R_V, notably linked to projected dust surface mass density.

Significance. If the central accuracy claim and generalizability hold, SE3D would provide a computationally efficient route to fitting self-consistent 3D dust-star models to panchromatic resolved data, enabling better constraints on dust geometry and attenuation in galaxies. The sensitivity analysis is a clear strength, as it directly tests whether the network has internalized the physical mappings rather than merely memorizing the training set. The approach of using toy models with controlled gradients and geometries is methodologically sound for isolating effects, but its impact depends on demonstrating that the sampled manifold is representative enough for application to real observations.
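
One simple way to probe whether an emulator has internalised a physical mapping is a finite-difference sensitivity: perturb one input at a time and record the change in an output. The sketch below does this for a hypothetical stand-in function. The paper's own sensitivity analysis may use a different technique, and the parameter names and coefficients here are invented.

```python
def toy_emulator(params):
    """Hypothetical stand-in returning one log observable from two inputs."""
    return -0.8 * params["dust"] + 0.1 * params["size"] + 0.05 * params["dust"] * params["size"]


def sensitivities(model, params, step=1e-3):
    """Central-difference derivative of the output with respect to each input."""
    result = {}
    for name, value in params.items():
        up = dict(params)
        up[name] = value + step
        down = dict(params)
        down[name] = value - step
        result[name] = (model(up) - model(down)) / (2 * step)
    return result


sens = sensitivities(toy_emulator, {"dust": 1.0, "size": 2.0})
print(sens)
```

The signs and magnitudes of such derivatives can then be compared against physical expectation, for example that more dust lowers short-wavelength flux.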

major comments (2)

[Abstract] Abstract and sensitivity analysis section: the claim that the emulator 'reproduces the spectral distributions at an accuracy of ~0.05 dex or less ... across the rest-frame UVJ colour space spanned by observed galaxies' is evaluated exclusively on held-out toy models. Because the central use case is fitting actual panchromatic resolved observations, a direct test against independent radiative-transfer calculations on hydrodynamical simulations or against multi-wavelength data for well-studied resolved galaxies (with independently constrained parameters) is required to establish that the reported accuracy translates outside the training distribution.
[Sensitivity analysis] The sensitivity analysis demonstrates that the network learned mappings between physical parameters and observables (fluxes, colours, sizes, size ratios). However, the manuscript does not report quantitative metrics (e.g., residual trends or bias as a function of dust surface density or viewing angle) for the derived R_V values or for the wavelength-dependent size ratios; without these, it is difficult to assess whether the emulator recovers the physical trends discussed in the final paragraph at the precision needed for scientific application.
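
The quantitative metrics requested in the second comment could take the form of bias and scatter per bin of a physical parameter. The sketch below is one hedged way to tabulate them, using the median as bias and the sample standard deviation as scatter. The bin edges, the variable names and the numbers are invented for illustration.

```python
import statistics


def binned_bias_scatter(x, residuals, edges):
    """Return (low, high, median residual, standard deviation) per bin of x."""
    rows = []
    for low, high in zip(edges[:-1], edges[1:]):
        in_bin = [r for xi, r in zip(x, residuals) if low <= xi < high]
        if len(in_bin) < 2:
            continue  # too few points for a scatter estimate
        rows.append((low, high, statistics.median(in_bin), statistics.stdev(in_bin)))
    return rows


# Made-up example: residuals in R_V against a normalised dust surface density.
surface_density = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]
rv_residual = [0.00, 0.01, 0.02, 0.05, 0.06, 0.07]
for row in binned_bias_scatter(surface_density, rv_residual, [0.0, 0.5, 1.0]):
    print(row)
```

A bias that grows from bin to bin, as in this invented example, is the kind of residual trend the comment asks the authors to report.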

minor comments (2)

[Methods] The description of the Bayesian neural network architecture (number of layers, hidden units, prior choices, training schedule) should be expanded with a dedicated methods subsection or table so that the reproducibility of the ~0.05 dex figure can be verified.
[Figures] Figure captions and axis labels for the sensitivity-analysis plots should explicitly state the wavelength bands or colour combinations shown, and include error bars or residual panels to make the 0.05 dex claim visually quantifiable.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the scope and limitations of our emulator validation. We address each major point below, indicating where revisions will be made to the manuscript.

Referee: [Abstract] Abstract and sensitivity analysis section: the claim that the emulator 'reproduces the spectral distributions at an accuracy of ~0.05 dex or less ... across the rest-frame UVJ colour space spanned by observed galaxies' is evaluated exclusively on held-out toy models. Because the central use case is fitting actual panchromatic resolved observations, a direct test against independent radiative-transfer calculations on hydrodynamical simulations or against multi-wavelength data for well-studied resolved galaxies (with independently constrained parameters) is required to establish that the reported accuracy translates outside the training distribution.

Authors: We agree that the reported accuracy is demonstrated on held-out toy models rather than on hydrodynamical simulations or real galaxy data with independent constraints. Our toy-model library was constructed specifically to span the range of stellar/dust geometries, radial gradients, and viewing angles needed to reproduce the UVJ colour space and structural trends of observed galaxies, allowing controlled isolation of physical effects. A full end-to-end test on hydrodynamical outputs would require new, computationally expensive 3D radiative-transfer post-processing of large simulation suites and is outside the present scope; such validation is planned for follow-up work. We will revise the abstract and add a dedicated paragraph in the discussion section clarifying that the quoted accuracy applies within the toy-model manifold and discussing expected applicability (and possible biases) when fitting real observations. revision: partial
Referee: [Sensitivity analysis] The sensitivity analysis demonstrates that the network learned mappings between physical parameters and observables (fluxes, colours, sizes, size ratios). However, the manuscript does not report quantitative metrics (e.g., residual trends or bias as a function of dust surface density or viewing angle) for the derived R_V values or for the wavelength-dependent size ratios; without these, it is difficult to assess whether the emulator recovers the physical trends discussed in the final paragraph at the precision needed for scientific application.

Authors: We will add quantitative residual analysis to the sensitivity section. Specifically, we will include plots and summary statistics of the residuals in recovered R_V and wavelength-dependent size ratios as functions of projected dust surface mass density and viewing angle, together with bias and scatter metrics. These additions will directly quantify how well the emulator reproduces the physical trends highlighted in the discussion. revision: yes

Circularity Check

0 steps flagged

No circularity: emulator accuracy is empirical performance on radiative-transfer library, not a definitional or fitted-input reduction

full rationale

The paper builds an independent library of toy models with varied stellar/dust geometries and gradients, runs external 3D dust radiative transfer to generate mock observations, then trains a Bayesian neural network emulator on that data. The ~0.05 dex accuracy is reported as a measured reproduction across the input parameter range and UVJ space, while the sensitivity analysis checks learned mappings between physical parameters and observables. No equation or claim reduces by construction to a fitted parameter renamed as prediction, no self-citation is load-bearing for the central result, and no ansatz or uniqueness theorem is smuggled in. The chain is a standard supervised ML pipeline evaluated on the generated distribution and remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the representativeness of the toy model library and the ability of the neural network to generalize the learned mappings beyond the training set. No explicit free parameters or invented entities are named in the abstract.

axioms (1)

domain assumption: Toy models varying in stellar and dust geometries with radial stellar population gradients capture the essential physics needed to emulate real galaxy observations.
The emulator is trained exclusively on this library and applied to fitting observed galaxies.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: The computationally efficient machine learning emulator uses a Bayesian neural network architecture, and reproduces the spectral distributions at an accuracy of ~ 0.05 dex or less across the dynamic range of input parameters, and across the rest-frame UVJ colour space spanned by observed galaxies.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Paper passage: We carry out a sensitivity analysis demonstrating that the emulator has successfully learned the intricate mappings between galaxy physical properties and direct observables (fluxes, colours, sizes, size ratios between different wavebands, ...).

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

5 extracted references · 5 canonical work pages

[1] Optimizer: algorithms used to update the neural network weights. Adam, AdamW, NAdam, RMSProp

[2] Momentum: applies smoothing to gradient descent. [0.85, 0.99]

[3] Weight decay: regularization applied to the optimizer to prevent overfitting. [1×10⁻⁵, 1×10⁻²]
9. β₁, β₂: parameters controlling the decay rate of the first and second moments of the gradients. [0.9, 0.999]

[4] lr: learning rate. [1×10⁻⁶, 1×10⁻²]

[5] Figure A1. Validation of the model predicting whether a toy model galaxy is physical or unphysical, as evaluated on an unseen testing set.
Batch size: number of data rows processed per training epoch. [32, 264]
12. grad clip: value at which gradients are clipped every training epoch. [0.5, 10]
13. KL weight: weight coefficient for the Kullback-Leibler divergence describing model uncertainty. [0.0, 1.0]
14. ZP weight: weight coefficient for the difference between predicted and true zero points. [0.0, 1....
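
The entries above read like rows of a hyperparameter search space rather than bibliographic references. As a hedged sketch, the code below draws one random configuration from the listed ranges. The sampling scheme (log-uniform for learning rate and weight decay, uniform elsewhere) is an assumption, the dictionary keys are illustrative, and the ZP weight is omitted because its range is truncated in the source.

```python
import math
import random

# Ranges copied from the listed entries; key names are illustrative.
SEARCH_SPACE = {
    "optimizer": ["Adam", "AdamW", "NAdam", "RMSProp"],
    "momentum": (0.85, 0.99),
    "weight_decay": (1e-5, 1e-2),
    "lr": (1e-6, 1e-2),
    "batch_size": (32, 264),
    "grad_clip": (0.5, 10.0),
    "kl_weight": (0.0, 1.0),
}


def log_uniform(rng, low, high):
    """Draw a value uniformly in log10 space between low and high."""
    return 10 ** rng.uniform(math.log10(low), math.log10(high))


def sample_config(rng):
    """Draw one configuration; the sampling distributions are assumptions."""
    space = SEARCH_SPACE
    return {
        "optimizer": rng.choice(space["optimizer"]),
        "momentum": rng.uniform(*space["momentum"]),
        "weight_decay": log_uniform(rng, *space["weight_decay"]),
        "lr": log_uniform(rng, *space["lr"]),
        "batch_size": rng.randint(*space["batch_size"]),
        "grad_clip": rng.uniform(*space["grad_clip"]),
        "kl_weight": rng.uniform(*space["kl_weight"]),
    }


config = sample_config(random.Random(0))
print(config)
```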
