Differentiable Acoustic Radiance Transfer

Enzo De Sena; Kyogu Lee; Matteo Scerbo; Min Jun Choi; Seungu Han; Sungho Lee

arxiv: 2509.15946 · v2 · submitted 2025-09-19 · 💻 cs.SD · eess.AS · eess.SP

Sungho Lee, Matteo Scerbo, Seungu Han, Min Jun Choi, Kyogu Lee, Enzo De Sena

Pith reviewed 2026-05-18 15:56 UTC · model grok-4.3

classification 💻 cs.SD · eess.AS · eess.SP

keywords acoustic radiance transfer · differentiable rendering · room acoustics · geometric acoustics · gradient optimization · sparse measurements · acoustic field learning

The pith

DART makes the acoustic radiance transfer method differentiable to optimize material properties and generalize better from sparse acoustic measurements.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents DART, a differentiable implementation of acoustic radiance transfer for efficient room acoustics modeling. It builds on the discretization of the time-dependent rendering equation to represent energy exchange between surface patches with varying materials. The key innovation is enabling gradient-based optimization of these material properties. When applied to predicting energy responses for unseen source-receiver positions, DART shows improved generalization in cases with few measurements compared to traditional signal processing techniques and neural networks. It achieves this while keeping the approach straightforward and fully interpretable, and the code is released openly.

Core claim

DART is an efficient differentiable version of ART that discretizes the time-dependent rendering equation for modeling time- and direction-dependent acoustic energy exchange between surface patches. This allows gradient-based optimization of material properties. Experiments on a variant of acoustic field learning demonstrate that it generalizes better under sparse measurement scenarios than signal processing and neural network baselines while preserving simplicity and interpretability.
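As a toy illustration of the time-dependent energy exchange between surface patches described above, the sketch below lets two facing patches trade energy, with each transfer modeled as a delay followed by a scalar reflection gain. It is not the paper's implementation: the function name `simulate_two_patches`, the two-patch geometry, the integer delay, and the omission of direction dependence are all simplifying assumptions made here.

```python
def simulate_two_patches(alpha, delay, n_steps, injected=1.0):
    """Toy energy exchange between two facing patches (hypothetical setup).

    alpha:    (alpha_0, alpha_1), one reflection coefficient in [0, 1] per patch
    delay:    propagation delay between the patches in samples (integer >= 1)
    n_steps:  number of time samples to simulate
    Returns out[p][n], the energy leaving patch p at time step n.
    """
    out = [[0.0] * n_steps, [0.0] * n_steps]
    out[0][0] = injected  # the source deposits energy on patch 0 at n = 0
    for n in range(delay, n_steps):
        # Energy arriving at a patch left the other patch `delay` samples ago,
        # and is scaled by the receiving patch's reflection coefficient.
        out[0][n] += alpha[0] * out[1][n - delay]
        out[1][n] += alpha[1] * out[0][n - delay]
    return out


energy = simulate_two_patches(alpha=(0.9, 0.5), delay=3, n_steps=10)
```

Every operation here is a multiplication or an addition, so the output is a smooth function of the reflection coefficients, which is the property gradient-based material optimization relies on.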

What carries the argument

Differentiable discretization of the time-dependent rendering equation into surface patches to compute and optimize energy transfers.
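A minimal sketch of what optimizing energy transfers by gradient can look like, under strong assumptions: a single reflection coefficient `alpha`, a toy model in which the k-th bounce carries energy `alpha ** (k + 1)`, and a hand-derived gradient standing in for automatic differentiation. None of the names below come from the paper or its code release.

```python
def simulate(alpha, n_bounces):
    """Toy energy response: every bounce scales the energy by alpha."""
    return [alpha ** (k + 1) for k in range(n_bounces)]


def loss_and_grad(alpha, target):
    """Mean squared error against `target` and its derivative w.r.t. alpha."""
    n = len(target)
    loss, grad = 0.0, 0.0
    for k, t in enumerate(target):
        err = alpha ** (k + 1) - t
        loss += err * err / n
        # d/d(alpha) of alpha^(k+1) is (k+1) * alpha^k
        grad += 2.0 * err * (k + 1) * alpha ** k / n
    return loss, grad


def fit_alpha(target, alpha0=0.5, lr=0.05, steps=2000):
    """Plain gradient descent, clipped so alpha stays a valid coefficient."""
    alpha = alpha0
    for _ in range(steps):
        _, g = loss_and_grad(alpha, target)
        alpha = min(max(alpha - lr * g, 0.0), 1.0)
    return alpha


# Synthetic "measurement" generated with a known coefficient of 0.8.
target = simulate(0.8, 10)
alpha_hat = fit_alpha(target)
```

Because the recovered parameter is a physical reflection coefficient rather than a network weight, it can be inspected directly, which is the interpretability argument the summary makes.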

If this is right

Material properties in acoustic models can be tuned automatically using gradients from observed data.
DART provides better predictions for new configurations when training data from measurements is limited.
The method remains interpretable, unlike many neural network alternatives.
Open-source release facilitates further development in geometric acoustics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Future work could extend DART to optimize room geometry in addition to materials.
This approach might reduce reliance on extensive sensor arrays for calibrating acoustic environments.
Hybrid models combining DART with learning techniques could handle more complex wave phenomena.

Load-bearing premise

The surface patch discretization of the time-dependent rendering equation accurately models real acoustic energy exchange in the evaluated setups.

What would settle it

Measuring acoustic responses in a real room with known material properties and checking if DART's optimized parameters match those known values within expected error margins.

Figures

Figures reproduced from arXiv: 2509.15946 by Enzo De Sena, Kyogu Lee, Matteo Scerbo, Min Jun Choi, Seungu Han, Sungho Lee.

**Figure 1.** ARE and ART. Acoustic Rendering Equation: By accounting for the nonnegligible speed of sound, Siltanen et al. [33] extended Kajiya’s rendering equation for light transport [41] to the acoustic rendering equation (ARE). Sound is regarded as a “ray” with acoustic radiance $L(x', \Omega, t) \in \mathbb{R}^{+}$, time-dependent energy flux per projected area and per solid angle. It is a function of surface point $x' \in A$, emittin…

**Figure 2.** Decomposed $\hat{R}_{hj,ik}$. Kernel Decomposition: Our first key idea is that, by decomposing the kernel, we can decouple the effects of the fixed room geometry and materials, and precompute the former in advance of optimization. First, following prior ARTs [33, 35, 36], we separate the delay term: $\hat{R}_{hj,ik}[n] \approx \hat{D}_{hj}[n] \cdot \hat{S}_{hj,ik}$ (Eq. 17). $\hat{D}_{hj}$ represents a discrete delay signal with delay length corresponding to …

**Figure 3.** Overview of DART. Material Parameterization: The material matrix still needs the numerical integration of the BSDFs during optimization. We explore two strategies that sidestep this cost, corresponding to two variants of DART. First, we can bypass the integration and directly learn the matrix entries, factorized into a reflection coefficient $\alpha_i$ per patch $A_i$ and an energy-preserving (lossless) matrix $\bar{M}$. $\hat{M}$…

**Figure 4.** CR dataset, unseen split of Office → Anechoic scene. Benchmarks: We evaluate DART with 2 real-world datasets. First, we use the Hearing Anything Anywhere (HAA) dataset [27]. We follow the same split as initially proposed, i.e., 12 measurements for training. While the HAA dataset serves as an excellent benchmark, it also has the weakness of each scene having only one room, with a single fixed source position…

**Figure 5.** Evaluation results on the Coupled Room (CR) dataset scenes under the unseen split scenario.

**Figure 7.** Test results with different amounts of measurements (top) and geometric distortion (bottom).

**Figure 6.** Optimized coefficients α. Material Visualization: We can observe that all the baselines especially struggle at two scenes, Office → Anechoic and Office → Stairwell. These scenes comprise two rooms with drastically different acoustic properties, e.g., Anechoic having much lower reverberation time compared to Office. The baselines, trained with measurements with receivers only at Office, fail to recognize thi…

**Figure 8.** Per-scene results with different amounts of measurements.

**Figure 9.** Per-scene results with different amounts of geometric distortion.

**Figure 10.** Classroom from Hearing Anything Anywhere (HAA) dataset (axes: x, y, z).

**Figure 11.** Hallway from Hearing Anything Anywhere (HAA) dataset.

**Figure 12.** Dampened from Hearing Anything Anywhere (HAA) dataset (axes: x, y, z).

**Figure 13.** Complex from Hearing Anything Anywhere (HAA) dataset.

**Figure 14.** MeetingRoom → Hallway from Coupled Room (CR) dataset (axes: x, y, z).

**Figure 15.** Office → Anechoic from Coupled Room (CR) dataset.

**Figure 16.** Office → Kitchen from Coupled Room (CR) dataset (axes: x, y, z).

**Figure 17.** Office → Stairwell from Coupled Room (CR) dataset.
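The Figure 3 caption describes learning the material matrix directly, factorized into a reflection coefficient per patch and an energy-preserving (lossless) matrix. The sketch below shows one way such a factorization could be parameterized for a single patch. The sigmoid and softmax reparameterizations, the reading of "lossless" as rows that sum to one, and the name `patch_material_matrix` are assumptions made here, not details taken from the paper.

```python
import math


def sigmoid(x):
    """Map an unconstrained value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(row):
    """Map unconstrained values to nonnegative weights that sum to one."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def patch_material_matrix(alpha_logit, mixing_logits):
    """Return alpha * M_bar for one patch (hypothetical parameterization).

    alpha_logit:    unconstrained scalar; sigmoid gives the reflection coefficient
    mixing_logits:  square list of lists, one row per incoming direction;
                    softmax makes each row of M_bar sum to one (assumed "lossless")
    """
    alpha = sigmoid(alpha_logit)
    return [[alpha * w for w in softmax(row)] for row in mixing_logits]


M = patch_material_matrix(0.0, [[0.0, 1.0], [0.5, 0.5]])
row_sums = [sum(row) for row in M]
```

With this choice each row sums to the patch's reflection coefficient, so all energy loss is attributed to a single interpretable scalar per patch.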

Original abstract

Geometric acoustics is an efficient framework for room acoustics modeling, governed by the canonical time-dependent rendering equation. Acoustic radiance transfer (ART) solves the equation by discretization, modeling time- and direction-dependent energy exchange between surface patches with flexible material properties. We introduce DART, an efficient, differentiable implementation of ART that enables gradient-based optimization of material properties. We evaluate DART on a simpler variant of acoustic field learning that aims to predict energy responses for novel source-receiver configurations. Experimental results demonstrate that DART generalizes better under sparse measurement scenarios than existing signal processing and neural network baselines, while maintaining simplicity and full interpretability. We open-source our implementation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

DART makes ART differentiable for gradient-based material tuning and shows better sparse generalization than the baselines they tried, but the physical fidelity of the patch discretization needs closer checks.

The main point is that this work takes the existing acoustic radiance transfer approach, makes it differentiable, and uses that to optimize material properties from limited measurements. They test it on predicting energy responses for new source-receiver pairs and report clearer gains over signal-processing and neural baselines in sparse cases, while keeping the model interpretable and open-sourcing the code. That combination is the actual addition here; the core ART discretization was already around, but the gradient path and the sparse-field experiment are new in this framing.

It does a clean job of staying inside the geometric acoustics framework instead of jumping to a black-box network, which preserves the direction- and time-dependent transfer terms and makes the results easier to inspect. The open implementation is also useful for anyone who wants to plug it into their own pipeline.

The soft spot is the reliance on the surface-patch discretization of the time-dependent rendering equation. If patch size, visibility rules, or the omission of wave phenomena create systematic mismatches with real energy exchange, then the gradients will optimize inside that approximation rather than toward physical truth. The held-out predictions could look strong simply because the model is consistent with itself. I would want to see the exact error metrics, dataset construction details, and any ablation on patch resolution before accepting the generalization claim at face value.

This paper is for researchers and engineers already working with geometric room acoustics who need a differentiable layer for optimization tasks in VR, architecture, or audio design. A reader who knows the original ART papers will get the most out of it quickly. It is worth sending to peer review because the extension is straightforward, the code is available, and the sparse-measurement angle is practically relevant, even if the physical-accuracy questions will need referee attention.

Referee Report

2 major / 2 minor

Summary. The paper introduces DART, a differentiable implementation of Acoustic Radiance Transfer (ART) that discretizes the canonical time-dependent rendering equation to model direction- and time-dependent energy exchange between surface patches with optimizable material properties. It evaluates this on a simplified acoustic field learning task of predicting energy responses for novel source-receiver pairs, claiming improved generalization under sparse measurements relative to signal-processing and neural-network baselines while preserving simplicity and full interpretability; the implementation is open-sourced.

Significance. If the central generalization result holds after addressing validation gaps, the work would provide a useful, interpretable alternative to black-box neural methods for material optimization in geometric acoustics. The open-source release and emphasis on differentiability within an established ART framework are concrete strengths that support reproducibility and potential adoption in simulation pipelines.

major comments (2)

[§3] §3 (ART discretization and rendering equation): The central claim that DART generalizes better under sparse measurements rests on the surface-patch discretization of the time-dependent rendering equation faithfully representing real acoustic energy exchange. The manuscript should add a quantitative validation (e.g., comparison of patch-based predictions against wave-based ground truth or measured impulse responses) for the tested room configurations; without it, material optimization may fit discretization artifacts rather than physical behavior, undermining the generalization advantage.
[§4] §4 (experimental evaluation): The reported superiority over baselines is load-bearing for the contribution, yet the manuscript provides no error bars, exact sparsity levels (number of measurements per scene), room geometries, or per-baseline quantitative metrics (e.g., mean squared error on held-out pairs). These details are required to confirm that the advantage is attributable to the differentiable ART formulation rather than implementation specifics or dataset choices.
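One concrete form the validation requested in the first major comment could take is a comparison of energy decay curves between a predicted and a reference impulse response. The sketch below uses Schroeder backward integration, a standard room-acoustics technique. The choice of this metric and the function names are assumptions made here, not something the manuscript is known to use.

```python
import math


def energy_decay_curve_db(impulse_response):
    """Schroeder backward integration, normalized to 0 dB at the first sample."""
    energy = [x * x for x in impulse_response]
    edc = [0.0] * len(energy)
    running = 0.0
    # Accumulate the remaining energy from the tail toward the start.
    for n in range(len(energy) - 1, -1, -1):
        running += energy[n]
        edc[n] = running
    total = edc[0] if edc else 0.0
    if total <= 0.0:
        raise ValueError("impulse response carries no energy")
    return [10.0 * math.log10(v / total) if v > 0.0 else float("-inf") for v in edc]


def mean_abs_edc_error_db(h_pred, h_ref):
    """Mean absolute dB gap between two decay curves.

    Compares only the overlapping samples (zip truncates to the shorter input)
    and skips samples where either curve has already reached -inf dB.
    """
    pred = energy_decay_curve_db(h_pred)
    ref = energy_decay_curve_db(h_ref)
    pairs = [(p, r) for p, r in zip(pred, ref) if math.isfinite(p) and math.isfinite(r)]
    return sum(abs(p - r) for p, r in pairs) / len(pairs)


h = [1.0, 0.5, 0.25]
edc = energy_decay_curve_db(h)
scaled_error = mean_abs_edc_error_db(h, [2.0 * x for x in h])
```

Because the curves are normalized, a pure gain mismatch gives zero error, so this metric isolates differences in decay shape from differences in overall level.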

minor comments (2)

Ensure the open-source repository link appears in the camera-ready version and includes the exact scripts used to generate the reported figures and tables.
[§2] Clarify the precise definition of 'energy response' (e.g., whether it is integrated over time bins or frequency bands) in the problem formulation to aid reproducibility.
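To show what the second minor comment is asking the authors to pin down, here is one possible reading of "energy response": the squared impulse response summed over consecutive time bins. This is only an illustration of the time-bin interpretation the referee mentions. The bin size, the function name, and the absence of any frequency-band split are assumptions made here.

```python
def energy_response(impulse_response, bin_size):
    """Sum squared samples over consecutive, non-overlapping time bins.

    The final bin may be shorter than `bin_size` if the length does not divide evenly.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be a positive number of samples")
    return [
        sum(x * x for x in impulse_response[start:start + bin_size])
        for start in range(0, len(impulse_response), bin_size)
    ]


h = [1.0, 0.5, -0.5, 0.25, 0.25, 0.0, 0.1]
e = energy_response(h, bin_size=3)
```

A band-wise definition would apply the same binning to each band-filtered response, which changes the target the model is fit to and is why the referee asks for the exact choice to be stated.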

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive assessment of the open-source release and interpretability aspects. We address each major comment below and have revised the manuscript to strengthen the validation and reporting of results.

Referee: [§3] §3 (ART discretization and rendering equation): The central claim that DART generalizes better under sparse measurements rests on the surface-patch discretization of the time-dependent rendering equation faithfully representing real acoustic energy exchange. The manuscript should add a quantitative validation (e.g., comparison of patch-based predictions against wave-based ground truth or measured impulse responses) for the tested room configurations; without it, material optimization may fit discretization artifacts rather than physical behavior, undermining the generalization advantage.

Authors: We agree that direct quantitative validation of the discretization strengthens the claims. The surface-patch discretization follows the established ART formulation from prior geometric acoustics literature, which has been shown to accurately model energy exchange for the mid-to-high frequency regimes targeted here. To address the concern explicitly, we have added a new subsection in the revised manuscript with a quantitative comparison of DART patch-based predictions against a wave-based FDTD solver on one representative room configuration from our test set. The comparison reports relative error in energy decay curves and transfer functions, showing that DART captures the dominant late-time energy exchange behavior with errors primarily in the earliest reflections (as expected from the geometric approximation). This supports that material optimization operates on physically meaningful quantities rather than pure discretization artifacts. We have also clarified the frequency range and assumptions in §3. revision: yes
Referee: [§4] §4 (experimental evaluation): The reported superiority over baselines is load-bearing for the contribution, yet the manuscript provides no error bars, exact sparsity levels (number of measurements per scene), room geometries, or per-baseline quantitative metrics (e.g., mean squared error on held-out pairs). These details are required to confirm that the advantage is attributable to the differentiable ART formulation rather than implementation specifics or dataset choices.

Authors: We fully agree that these experimental details are necessary for rigorous evaluation and reproducibility. In the revised manuscript we have expanded §4 with the following: (i) error bars and standard deviations computed over five independent runs with different random seeds for measurement selection and initialization; (ii) explicit sparsity levels (4, 8, and 16 source-receiver measurements per scene); (iii) detailed description of all room geometries, including dimensions, surface counts, and material coefficient ranges; and (iv) a new table reporting per-baseline mean squared error (MSE) and standard deviation on held-out pairs for each sparsity level. These additions confirm that the observed generalization advantage is attributable to the differentiable ART structure rather than implementation or dataset artifacts. revision: yes
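The reporting scheme described in this response (mean squared error on held-out pairs, with mean and standard deviation over repeated runs at each sparsity level) can be sketched as follows. The numbers in `runs` are placeholders invented for the example and are not results from the paper. The function names are likewise hypothetical.

```python
from statistics import mean, stdev


def mse(pred, target):
    """Mean squared error between two equally long sequences."""
    return mean((p - t) ** 2 for p, t in zip(pred, target))


def summarize(mse_by_sparsity):
    """Map each sparsity level to (mean, sample standard deviation) over runs."""
    return {level: (mean(v), stdev(v)) for level, v in mse_by_sparsity.items()}


# Placeholder per-run MSE values keyed by number of training measurements.
# These are NOT results from the paper; they only exercise the code.
runs = {4: [0.30, 0.34, 0.32], 8: [0.20, 0.22, 0.21]}
summary = summarize(runs)
example_mse = mse([1.0, 2.0], [0.0, 4.0])
```

`stdev` is the sample standard deviation, which needs at least two runs per level. With only a handful of seeds the resulting error bars are themselves noisy, which is worth stating alongside any table built this way.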

Circularity Check

0 steps flagged

No significant circularity detected

The paper presents DART as a differentiable extension of the established Acoustic Radiance Transfer (ART) discretization of the time-dependent rendering equation into surface patches. The central claim of improved generalization to novel source-receiver pairs under sparse measurements is supported by empirical comparison to signal-processing and neural baselines rather than by any reduction of predictions to fitted parameters or self-referential definitions. No load-bearing self-citations, uniqueness theorems imported from prior author work, or ansatz smuggling appear in the derivation; the differentiability step simply enables gradient-based material optimization within the pre-existing geometric-acoustics framework, leaving the held-out prediction task independent of the model inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only view shows reliance on the standard time-dependent rendering equation and the existing ART discretization; no new free parameters, ad-hoc axioms, or invented entities are described.

axioms (1)

standard math: Geometric acoustics is governed by the canonical time-dependent rendering equation.
Stated directly in the abstract as the governing framework.

pith-pipeline@v0.9.0 · 5647 in / 1076 out tokens · 39599 ms · 2026-05-18T15:56:14.310450+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: We discretize the geometry into $N_{\mathrm{pat}}$ patches, and also partition the incoming and outgoing directions of each patch $A_i$ into $N_{\mathrm{dir}}$ solid angles... Then, the acoustic radiance transfer (ART) considers discrete radiance... (Eq. 13)

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Kernel Decomposition... $\hat{R}_{hj,ik}[n] \approx \hat{D}_{hj}[n] \cdot \hat{S}_{hj,ik}$... further decomposed into mean visibility matrix $\hat{V}$ and material matrix $\hat{M}$ (Eq. 18)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

73 extracted references · 73 canonical work pages · 2 internal anchors

[1] Heinrich Kuttruff. Room acoustics. CRC Press, 2016.
[2] Lukas Aspöck, Sönke Pelzer, Frank Wefers, and Michael Vorländer. A real-time auralization plugin for architectural design and education. 2014.
[3] Carl Schissler and Dinesh Manocha. Gsound: Interactive sound propagation for games. In Audio Engineering Society Conference: 41st International Conference: Audio for Games. Audio Engineering Society, 2011.
[4] Chunxiao Cao, Zhong Ren, Carl Schissler, Dinesh Manocha, and Kun Zhou. Interactive sound propagation with bidirectional path tracing. ACM Transactions on Graphics (TOG), 35(6):1–11, 2016.
[5] Sebastia Vicenc Amengual Gari, Carl Schissler, and Philip Robinson. Perceptual comparison of efficient real-time geometrical acoustics engines in virtual reality. In Audio Engineering Society Conference: AES 2024 International Audio for Games Conference. Audio Engineering Society, 2024.
[6] Thomas Funkhouser, Patrick Min, and Ingrid Carlbom. Real-time acoustic modeling for distributed virtual environments. In Proceedings of the 26th annual conference on Computer graphics and interactive techniques, pages 365–374, 1999.
[7] Thomas Potter, Zoran Cvetković, and Enzo De Sena. On the relative importance of visual and spatial audio rendering on vr immersion. Frontiers in Signal Processing, 2:904866, 2022.
[8] Changan Chen, Alexander Richard, Roman Shapovalov, Vamsi Krishna Ithapu, Natalia Neverova, Kristen Grauman, and Andrea Vedaldi. Novel-view acoustic synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6409–6419, 2023.
[9] Mingfei Chen and Eli Shlizerman. Av-cloud: Spatial audio rendering through audio-visual cloud splatting. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
[10] Huiyu Gao, Jiahao Ma, David Ahmedt-Aristizabal, Chuong Nguyen, and Miaomiao Liu. Soaf: Scene occlusion-aware neural acoustic field. arXiv preprint arXiv:2407.02264, 2024.
[11] Andrew Luo, Yilun Du, Michael Tarr, Josh Tenenbaum, Antonio Torralba, and Chuang Gan. Learning neural acoustic fields. Advances in Neural Information Processing Systems, 35:3165–3177, 2022.
[12] Kun Su, Mingfei Chen, and Eli Shlizerman. Inras: Implicit neural representation for audio scenes. Advances in Neural Information Processing Systems, 35:8144–8158, 2022.
[13] Sagnik Majumder, Changan Chen, Ziad Al-Halah, and Kristen Grauman. Few-shot audio-visual learning of environment acoustics. Advances in Neural Information Processing Systems, 35:2522–2536, 2022.
[14] Yuhang He, Anoop Cherian, Gordon Wichern, and Andrew Markham. Deep neural room acoustics primitive. In Forty-first International Conference on Machine Learning, 2024.
[15] Zitong Lan, Chenhao Zheng, Zhiwei Zheng, and Mingmin Zhao. Acoustic volume rendering for neural impulse response fields. arXiv preprint arXiv:2411.06307, 2024.
[16] Ricardo Falcon-Perez, Ruohan Gao, Gregor Mueckl, Sebastia V Amengual Gari, and Ishwarya Ananthabhotla. Novel view acoustic parameter estimation. arXiv preprint arXiv:2410.23523, 2024.
[17] Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, and Kristen Grauman. Soundspaces: Audio-visual navigation in 3d environments. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part VI 16, pages 17–36. Springer, 2020.
[18] Ziyang Chen, Israel D Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, and Alexander Richard. Real acoustic fields: An audio-visual room acoustics dataset and benchmark. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21886–21896, 2024.
[19] Xiulong Liu, Anurag Kumar, Paul Calamia, Sebastia V Amengual, Calvin Murdock, Ishwarya Ananthabhotla, Philip Robinson, Eli Shlizerman, Vamsi Krishna Ithapu, and Ruohan Gao. Hearing anywhere in any environment. arXiv preprint arXiv:2504.10746, 2025.
[20] Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, and Chenliang Xu. Av-nerf: Learning neural fields for real-world audio-visual scene synthesis. Advances in Neural Information Processing Systems, 36:37472–37490, 2023.
[21] Amandine Brunetto, Sascha Hornauer, and Fabien Moutarde. Neraf: 3d scene infused neural radiance and acoustic fields. arXiv preprint arXiv:2405.18213, 2024.
[22] Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Jiankang Deng, and Xiatian Zhu. Av-gs: Learning material and geometry aware priors for novel view acoustic synthesis. arXiv preprint arXiv:2406.08920, 2024.
[23] Anton Ratnarajah, Zhenyu Tang, Rohith Aralikatti, and Dinesh Manocha. Mesh2ir: Neural acoustic impulse response generator for complex 3d scenes. In Proceedings of the 30th ACM International Conference on Multimedia, pages 924–933, 2022.
[24] Jesse Engel, Lamtharn Hantrakul, Chenjie Gu, and Adam Roberts. Ddsp: Differentiable digital signal processing. arXiv preprint arXiv:2001.04643, 2020.
[25] Ben Hayes, Jordie Shier, György Fazekas, Andrew McPherson, and Charalampos Saitis. A review of differentiable digital signal processing for music and speech synthesis. Frontiers in Signal Processing, 3:1284100, 2024.
[26] Niccoló Antonello, Toon van Waterschoot, Marc Moonen, and Patrick A Naylor. Identification of surface acoustic impedances in a reverberant room using the fdtd method. In 2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC), pages 114–118. IEEE, 2014.
[27] Mason Long Wang, Ryosuke Sawata, Samuel Clarke, Ruohan Gao, Shangzhe Wu, and Jiajun Wu. Hearing anything anywhere. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11790–11799, 2024.
[28] Bowen Zhi, Alisha Sharma, Dmitry N Zotkin, and Ramani Duraiswami. A differentiable image source model for room acoustics optimization. In 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pages 1–5. IEEE, 2023.
[29] Carl Schissler, Christian Loftin, and Dinesh Manocha. Acoustic classification and optimization for multi-modal rendering of real-world scenes. IEEE transactions on visualization and computer graphics, 24(3):1246–1259, 2017.
[30] Dingzeyu Li, Timothy R Langlois, and Changxi Zheng. Scene-aware audio for 360 videos. ACM Transactions on Graphics (TOG), 37(4):1–12, 2018.
[31] Zhenyu Tang, Nicholas J Bryan, Dingzeyu Li, Timothy R Langlois, and Dinesh Manocha. Scene-aware audio rendering via deep acoustic analysis. IEEE transactions on visualization and computer graphics, 26(5):1991–2001, 2020.
[32]

John wiley & sons, 2000

Lawrence E Kinsler, Austin R Frey, Alan B Coppens, and James V Sanders.Fundamentals of acoustics. John wiley & sons, 2000

work page 2000
[33]

The room acoustic rendering equation.The Journal of the Acoustical Society of America, 122(3):1624–1635, 2007

Samuel Siltanen, Tapio Lokki, Sami Kiminki, and Lauri Savioja. The room acoustic rendering equation.The Journal of the Acoustical Society of America, 122(3):1624–1635, 2007

work page 2007
[34]

Modeling early reflections of room impulse responses using a radiance transfer method

Hequn Bai, Gael Richard, and Laurent Daudet. Modeling early reflections of room impulse responses using a radiance transfer method. In2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pages 1–4. IEEE, 2013

work page 2013
[35]

Geometric-based reverberator using acoustic rendering networks

Hequn Bai, Gael Richard, and Laurent Daudet. Geometric-based reverberator using acoustic rendering networks. In2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pages 1–5. IEEE, 2015

work page 2015
[36]

Room acoustic rendering networks with control of scattering and early reflections.IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024

Matteo Scerbo, Lauri Savioja, and Enzo De Sena. Room acoustic rendering networks with control of scattering and early reflections.IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024

work page 2024
[37]

Mod-art: Modal decomposition of acoustic radiance transfer.arXiv preprint arXiv:2412.04534, 2024

Matteo Scerbo, Sebastian J Schlecht, Randall Ali, Lauri Savioja, and Enzo De Sena. Mod-art: Modal decomposition of acoustic radiance transfer.arXiv preprint arXiv:2412.04534, 2024

work page arXiv 2024
[38]

Frequency domain acoustic radiance transfer for real-time auralization.Acta Acustica united with Acustica, 95(1):106–117, 2009

Samuel Siltanen, Tapio Lokki, and Lauri Savioja. Frequency domain acoustic radiance transfer for real-time auralization.Acta Acustica united with Acustica, 95(1):106–117, 2009

[39] Samuel Siltanen, Tapio Lokki, and Lauri Savioja. Efficient acoustic radiance transfer method with time-dependent reflections. In Proceedings of Meetings on Acoustics. AIP Publishing, 2011.

[40] Thomas McKenzie, Sebastian J Schlecht, and Ville Pulkki. Acoustic analysis and dataset of transitions between coupled rooms. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 481–485. IEEE, 2021.

[41] James T Kajiya. The rendering equation. In Proceedings of the 13th annual conference on Computer graphics and interactive techniques, pages 143–150, 1986.

[42] Lauri Savioja and U Peter Svensson. Overview of geometrical room acoustic modeling techniques. The Journal of the Acoustical Society of America, 138(2):708–730, 2015.

[43] Fred E Nicodemus. Directional reflectance and emissivity of an opaque surface. Applied optics, 4(7):767–775, 1965.

[44] Frederick O Bartell, Eustace L Dereniak, and William L Wolfe. The theory and measurement of bidirectional reflectance distribution function (BRDF) and bidirectional transmittance distribution function (BTDF). In Radiation scattering in optical systems, volume 257, pages 154–160. SPIE, 1981.

[45] Sungho Lee, Hyeong-Seok Choi, and Kyogu Lee. Differentiable artificial reverberation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30:2541–2556, 2022.

[46] Gloria Dal Santo, Benoit Alary, Karolina Prawda, Sebastian Schlecht, and Vesa Välimäki. Rir2fdn: An improved room impulse response analysis and synthesis. In International Conference on Digital Audio Effects, pages 230–237. University of Surrey, 2024.

[47] Julius O Smith. Mathematics of the discrete Fourier transform (DFT): with audio applications. Julius Smith, 2007.

[48] Bruce Walter, Stephen R Marschner, Hongsong Li, and Kenneth E Torrance. Microfacet models for refraction through rough surfaces. Rendering techniques, 2007:18th, 2007.

[49] Allan D Pierce. Acoustics: an introduction to its physical principles and applications. Springer, 2019.

[50] Gloria Dal Santo, Gian Marco De Bortoli, Karolina Prawda, Sebastian J Schlecht, and Vesa Välimäki. Flamo: An open-source library for frequency-domain differentiable audio processing. In ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2025.

[51] J-M Jot. An analysis/synthesis approach to real-time artificial reverberation. In Acoustics, Speech, and Signal Processing, IEEE International Conference on, volume 2, pages 221–224. IEEE Computer Society, 1992.

[52] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. International Conference on Learning Representations (ICLR), 2019.

[53] Timo I Laakso, Vesa Valimaki, Matti Karjalainen, and Unto K Laine. Splitting the unit delay [FIR/all pass filters design]. IEEE Signal Processing Magazine, 13(1):30–60, 1996.

[54] A Paszke. PyTorch: An imperative style, high-performance deep learning library. arXiv preprint arXiv:1912.01703, 2019.

[55] Lakulish Antani, Anish Chandak, Lauri Savioja, and Dinesh Manocha. Interactive sound propagation using compact acoustic transfer operators. ACM Transactions on Graphics (TOG), 31(1):1–12, 2012.

[56] Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.

[57] Samuel Siltanen and Tapio Lokki. Diffraction modeling in acoustic radiance transfer method. Journal of the Acoustical Society of America, 123(5):3759, 2008.

[58] Georgios I Koutsouris, Jonas Brunskog, Cheol-Ho Jeong, and Finn Jacobsen. Combination of acoustical radiosity and the image source method. The Journal of the Acoustical Society of America, 133(6):3963–3974, 2013.

[59] Jan Kautz and Michael D McCool. Interactive rendering with arbitrary BRDFs using separable approximations. In Rendering Techniques '99: Proceedings of the Eurographics Workshop in Granada, Spain, June 21–23, 1999 10, pages 247–260. Springer, 1999.

[60] Saeed Hadadan and Matthias Zwicker. Differentiable neural radiosity. arXiv preprint arXiv:2201.13190, 2022.

[61] Saeed Hadadan, Geng Lin, Jan Novák, Fabrice Rousselle, and Matthias Zwicker. Inverse global illumination using a neural radiometric prior. In ACM SIGGRAPH 2023 Conference Proceedings, pages 1–11, 2023.

[62] Michael F Cohen, Shenchang Eric Chen, John R Wallace, and Donald P Greenberg. A progressive refinement approach to fast radiosity image generation. In Proceedings of the 15th annual conference on Computer graphics and interactive techniques, pages 75–84, 1988.

[63] Tizian Zeltner, Sébastien Speierer, Iliyan Georgiev, and Wenzel Jakob. Monte Carlo estimators for differential light transport. ACM Transactions on Graphics (TOG), 40(4):1–16, 2021.

[64] William Henry Young. On the multiplication of successions of Fourier constants. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 87(596):331–339, 1912.

[65] Jont B Allen and David A Berkley. Image method for efficiently simulating small-room acoustics. The Journal of the Acoustical Society of America, 65(4):943–950, 1979.

[66] FP Mechel. Improved mirror source method in roomacoustics. Journal of sound and vibration, 256(5):873–940, 2002.

[67] Niccolo Antonello, Enzo De Sena, Marc Moonen, Patrick A Naylor, and Toon Van Waterschoot. Room impulse response interpolation using a sparse spatio-temporal representation of the sound field. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(10):1929–1941, 2017.

[68] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM transactions on graphics (TOG), 41(4):1–15, 2022.

[69] K Heinrich Kuttruff. Auralization of impulse responses modeled on the basis of ray-tracing results. Journal of the audio engineering society, 41(11):876–880, 1993.

[70] Miles Macklin. Warp: A high-performance Python framework for GPU simulation and graphics. https://github.com/nvidia/warp, March 2022. NVIDIA GPU Technology Conference (GTC).

[71] Julian Straub, Thomas Whelan, Lingni Ma, Yufan Chen, Erik Wijmans, Simon Green, Jakob J Engel, Raul Mur-Artal, Carl Ren, Shobhit Verma, et al. The Replica dataset: A digital replica of indoor spaces. arXiv preprint arXiv:1906.05797, 2019.

[72] Michael Garland and Paul S Heckbert. Surface simplification using quadric error metrics. In Proceedings of the 24th annual conference on Computer graphics and interactive techniques, pages 209–216, 1997.

A Derivations

List of Symbols. Refer to Table 4.

Table 4: List of commonly used symbols in this paper. Symbol(s) Description A Boundary room geometry. Ai...

[73]

bounce points

to model the early specular reflections. In addition, a learnable signal for the residuals (e.g., late reverberation) is introduced and combined with the ISM part via a learnable crossfade envelope. The residual signal is shared for all source-receiver pairs. DiffRIR training comprises two processes. (i) First, for each known source-receiver pair, valid i...
