FXplorer: A Map-Based Interface for Exploratory Audio Effect Design
Pith reviewed 2026-06-27 19:06 UTC · model grok-4.3
The pith
FXplorer places audio effect presets in a perceptually informed 2D space so users can browse transformations as a continuous landscape and interpolate between them.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FXplorer organizes audio effects within a perceptually informed 2D space, allowing sound transformations to be browsed as a continuous landscape rather than as isolated presets. By combining established spatial interaction approaches and interpretable DAW-style controls with recent embedding-based machine learning methods for similarity and semantic search, the system brings exploration and parameter refinement into a single workspace. FXplorer supports composition, production, or performance by allowing users to edit and interpolate between effect presets interactively.
What carries the argument
The perceptually informed 2D space that embeds audio effects via machine learning similarity measures, enabling spatial browsing and interpolation alongside DAW-style controls.
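The paper does not say how positions on the map turn into effect parameters, so the following is only a sketch of one plausible scheme: inverse-distance weighting over nearby presets. The function name `interpolate_presets`, the parameter names `drive` and `mix`, and the coordinates are all hypothetical and are not taken from FXplorer.

```python
import math

def interpolate_presets(cursor, presets, power=2.0, eps=1e-9):
    """Blend preset parameters by inverse-distance weighting around a 2D cursor.

    cursor  : (x, y) position on the map
    presets : list of ((x, y), {param_name: value}) pairs; all presets are
              assumed to share the same parameter names (an assumption of
              this sketch, not something the paper states)
    """
    weights = []
    for (px, py), params in presets:
        d = math.hypot(cursor[0] - px, cursor[1] - py)
        if d < eps:
            # Cursor sits exactly on a preset: return that preset unchanged.
            return dict(params)
        weights.append(1.0 / d ** power)
    total = sum(weights)
    names = presets[0][1].keys()
    # Weighted average of each parameter across all presets.
    return {
        name: sum(w * p[name] for w, (_, p) in zip(weights, presets)) / total
        for name in names
    }

# Hypothetical presets placed on the map.
presets = [
    ((0.0, 0.0), {"drive": 0.0, "mix": 0.2}),
    ((1.0, 0.0), {"drive": 1.0, "mix": 0.8}),
]
blend = interpolate_presets((0.5, 0.0), presets)
```

Halfway between the two presets both weights are equal, so each parameter lands at the midpoint of its two values. A real system would also need to handle discrete parameters (for example, filter type), which do not average meaningfully.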
If this is right
- Sound transformations become continuous rather than discrete, so users can glide between presets instead of switching modules.
- Editing and searching occur in one view, removing the need to alternate between separate search and adjustment panels.
- Real-time interpolation in the map supports live performance or iterative composition without resetting parameters.
- Semantic search integrated into the spatial view lets users locate effects by description while staying in the same workspace.
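The semantic search mentioned in the last point could, under one common assumption, be a nearest-neighbor lookup by cosine similarity in a shared text and audio embedding space. The sketch below uses toy three-dimensional vectors in place of real embeddings; the preset names, the vectors, and the function `search_presets` are hypothetical, and the paper does not specify its model or metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat zero vectors as dissimilar to everything.
    return dot / (norm_a * norm_b)

def search_presets(query_embedding, preset_embeddings, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_embedding, emb))
        for name, emb in preset_embeddings.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Toy embeddings standing in for the output of an unspecified model.
preset_embeddings = {
    "warm_tape": [0.9, 0.1, 0.0],
    "bright_chorus": [0.1, 0.9, 0.2],
    "dark_reverb": [0.7, 0.0, 0.6],
}
query = [1.0, 0.0, 0.1]  # Stand-in for an embedded text description.
results = search_presets(query, preset_embeddings, top_k=2)
```

With these toy vectors the query ranks `warm_tape` first and `dark_reverb` second. In a map-based view, the returned presets could then be highlighted in place rather than listed in a separate panel.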
Where Pith is reading between the lines
- The same 2D embedding approach could be tested on other parameter-rich creative tools such as synthesizer patches or visual filter banks.
- Dynamic updating of the map from user listening data might further reduce the gap between individual taste and the displayed layout.
- If the map proves stable across genres, it could serve as a shared reference for teaching sound design concepts to beginners.
Load-bearing premise
Embedding methods for audio similarity will produce a navigable 2D map that combines effectively with spatial interaction and existing DAW controls without creating new usability problems.
What would settle it
A controlled user study in which participants using FXplorer complete fewer successful sound designs or report higher cognitive load than participants using a conventional list-based DAW interface with the same effects.
Original abstract
Audio effects (FX) shape sound in contemporary music practice. However, most interfaces present them as discrete modules and parameters that favor targeted adjustment over exploratory listening. This separation can make it difficult to build intuition about the broader space of possible transformations or to move fluidly between searching and refinement. We present FXplorer, an interface that organizes audio effects within a perceptually informed 2D space, allowing sound transformations to be browsed as a continuous landscape rather than as isolated presets. By combining established spatial interaction approaches and interpretable DAW-style controls with recent embedding-based machine learning methods for similarity and semantic search, the system brings exploration and parameter refinement into a single workspace. FXplorer supports composition, production, or performance by allowing users to edit and interpolate between effect presets interactively.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents FXplorer, a map-based interface for exploratory audio effect design. It organizes audio effects within a perceptually informed 2D space using embedding-based machine learning methods for similarity and semantic search. The system combines this spatial organization with spatial interaction techniques and DAW-style controls to enable users to browse effects as a continuous landscape, edit presets, and interpolate between them interactively, with the goal of integrating exploration and parameter refinement into a single workspace to support composition, production, or performance.
Significance. If the perceptual embedding produces musically meaningful neighborhoods and the interface proves usable without introducing disorientation or loss of control, the work could meaningfully advance creative audio tools by bridging discrete preset browsing with continuous exploration. The combination of ML embeddings with established spatial and DAW interaction paradigms is a coherent synthesis that targets a documented limitation in current interfaces. However, the manuscript contains no implementation details, perceptual validation, or usage evidence, so any significance remains prospective rather than demonstrated.
Major comments (2)
- [Abstract] The claim that FXplorer 'supports composition, production, or performance by allowing users to edit and interpolate between effect presets interactively' is load-bearing for the paper's contribution yet is unsupported; the manuscript supplies no user studies, perceptual validation of the 2D embedding, quantitative measures of interpolation quality, or comparisons against existing DAW workflows.
- [Abstract] The assertion that the space is 'perceptually informed' via 'embedding-based machine learning methods' is central to the interface rationale but receives no technical elaboration on model choice, training corpus, similarity metric, or any validation that neighborhoods correspond to audible similarity rather than abstract embedding proximity.
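The validation asked for in the second comment could take several forms. One simple sketch, under the assumption that pairwise dissimilarity ratings from listeners are available, is to measure how often an effect's nearest neighbors on the map are also its nearest neighbors according to those ratings. The function names and the toy data below are hypothetical; no such measurement appears in the manuscript.

```python
import math

def knn_from_distances(dist, k):
    """For each item, return the set of its k nearest neighbours (excluding itself)."""
    n = len(dist)
    neighbours = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(set(order[:k]))
    return neighbours

def map_distances(points):
    """Pairwise Euclidean distances between 2D map positions."""
    return [[math.dist(p, q) for q in points] for p in points]

def neighbourhood_agreement(points, perceptual_dist, k=1):
    """Fraction of k-nearest neighbours shared by the map and the ratings."""
    map_nn = knn_from_distances(map_distances(points), k)
    ref_nn = knn_from_distances(perceptual_dist, k)
    overlap = sum(len(a & b) for a, b in zip(map_nn, ref_nn))
    return overlap / (len(points) * k)

# Toy map positions for four effects, in two tight pairs.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
# Toy listener dissimilarity ratings with the same pair structure.
perceptual_dist = [
    [0.0, 1.0, 5.0, 6.0],
    [1.0, 0.0, 5.0, 5.5],
    [5.0, 5.0, 0.0, 1.0],
    [6.0, 5.5, 1.0, 0.0],
]
score = neighbourhood_agreement(points, perceptual_dist, k=1)
```

A score near 1 would indicate that map neighborhoods track the listener ratings; a score near chance would support the referee's concern that proximity on the map reflects only abstract embedding geometry.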
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The manuscript presents the design of FXplorer as a novel interface concept. We address each major comment below, acknowledging where the abstract overstates the current evidence and indicating the revisions we will make.
Point-by-point responses
- Referee: [Abstract] The claim that FXplorer 'supports composition, production, or performance by allowing users to edit and interpolate between effect presets interactively' is load-bearing for the paper's contribution yet is unsupported; the manuscript supplies no user studies, perceptual validation of the 2D embedding, quantitative measures of interpolation quality, or comparisons against existing DAW workflows.
  Authors: We agree that the manuscript contains no user studies, perceptual validations, quantitative measures, or workflow comparisons. The paper's contribution is the interface design that integrates spatial organization with DAW-style controls. We will revise the abstract to state that FXplorer is designed to support composition, production, or performance by enabling interactive editing and interpolation, framing the claim as prospective rather than demonstrated.
  Revision: yes
- Referee: [Abstract] The assertion that the space is 'perceptually informed' via 'embedding-based machine learning methods' is central to the interface rationale but receives no technical elaboration on model choice, training corpus, similarity metric, or any validation that neighborhoods correspond to audible similarity rather than abstract embedding proximity.
  Authors: The current abstract does not elaborate on these technical aspects. We will revise the abstract to briefly specify the embedding model, training corpus, and similarity metric employed. We will also qualify 'perceptually informed' to indicate that it derives from embeddings trained on audio similarity data rather than claiming direct perceptual validation, which is not present in the manuscript.
  Revision: yes
Circularity Check
No significant circularity; system proposal with no derivations or fitted results
The paper is a high-level description of an interface (FXplorer) that combines spatial interaction, DAW controls, and embedding-based ML methods. No equations, parameter fitting, predictions, or derivation chains are present in the provided text. The central claim is a design proposal rather than a mathematical result that reduces to its inputs. No self-citations or uniqueness theorems are invoked in a load-bearing way. This matches the default expectation of no circularity for non-derivational papers.