Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks

Leonardo Ferreira Guilhoto; Paris Perdikaris

arXiv: 2404.03099 · v1 · pith:YBBNUV4Gnew · submitted 2024-04-03 · cs.LG · cs.AI · cs.CE · cs.IT · math.IT · stat.ML

Pith reviewed 2026-05-24 02:18 UTC · model grok-4.3

classification cs.LG · cs.AI · cs.CE · cs.IT · math.IT · stat.ML

keywords neural operators · Bayesian optimization · epistemic uncertainty · function spaces · composite optimization · operator learning · sequential decision making

The pith

NEON uses one operator network to match deep ensembles in composite Bayesian optimization over function spaces while using orders of magnitude fewer parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents NEON as an architecture that produces predictions together with epistemic uncertainty from a single operator network backbone. It applies this to composite Bayesian optimization, where one seeks to maximize an unknown composition f = g ∘ h and h maps inputs to elements of a function space. Experiments on toy problems and real-world tasks show that the resulting acquisition strategy reaches state-of-the-art performance. A sympathetic reader cares because the approach removes the need to train and store large ensembles when the objective involves functional intermediates.

Core claim

NEON is an architecture for generating predictions with uncertainty using a single operator network backbone, which has orders of magnitude fewer trainable parameters than deep ensembles of comparable performance. When applied to composite Bayesian optimization of f = g ∘ h, where h : X → C(𝒴, ℝ^{d_s}) is an unknown map that outputs elements of a function space and g is a known, cheap-to-compute functional, NEON achieves state-of-the-art performance on toy and real-world scenarios.
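
To make the composite setting concrete, the sketch below estimates an expected-improvement acquisition value by Monte Carlo: draw several plausible outputs of h at a candidate input, push each through the known functional g, and average the improvement over the best score found so far. Everything in it is assumed for illustration: `sample_h` is a hypothetical stand-in for a surrogate with epistemic uncertainty (it is not NEON), the function is discretised on a fixed grid, `g` is a made-up functional, and expected improvement is one common acquisition choice, not necessarily the one used in the paper.

```python
import math
import random

GRID = [i / 10 for i in range(11)]  # query points y in [0, 1] (assumed discretisation)

def sample_h(x, rng):
    """Hypothetical surrogate sample: a function of y evaluated on GRID.

    The random amplitude stands in for epistemic uncertainty about h(x).
    """
    amplitude = 1.0 + 0.2 * rng.gauss(0.0, 1.0)
    return [amplitude * math.sin(math.pi * y * x) for y in GRID]

def g(values):
    """Known, cheap functional: here the mean of the function over the grid."""
    return sum(values) / len(values)

def mc_expected_improvement(x, best_so_far, n_samples=256, seed=0):
    """Monte Carlo estimate of E[max(g(h(x)) - best_so_far, 0)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        score = g(sample_h(x, rng))
        total += max(score - best_so_far, 0.0)
    return total / n_samples

# Score a few candidate inputs and pick the one with the largest acquisition value.
candidates = [0.5, 1.0, 2.0]
values = {x: mc_expected_improvement(x, best_so_far=0.3) for x in candidates}
next_x = max(values, key=values.get)
```

The point of the composite structure is that uncertainty is modelled on h, where the data live, while g is applied exactly to every sample instead of being learned.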

What carries the argument

NEON (Neural Epistemic Operator Networks), a single operator network backbone that supplies epistemic uncertainty estimates to guide acquisition in composite Bayesian optimization over function spaces.
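
One way to picture uncertainty from a single backbone is the epistemic-index pattern of epistemic neural networks (Osband et al., 2021): the network receives an extra random input z, and the spread of its predictions across draws of z serves as the epistemic uncertainty. The toy below sketches only that pattern. The weights, sizes, and function names are hypothetical, nothing is trained, and it should not be read as the NEON architecture itself.

```python
import random
import statistics

INDEX_DIM = 4  # dimension of the epistemic index z (assumed)

# Head weights, fixed by hand for illustration (they would be trained in practice).
HEAD_W = [
    [0.3, -0.1, 0.2, 0.0],
    [0.1, 0.2, -0.1, 0.3],
]

def backbone(x):
    """Deterministic backbone: returns a prediction and a feature vector."""
    features = [x, x * x]
    prediction = 0.5 * x + 0.1 * x * x
    return prediction, features

def epistemic_head(features, z):
    """Correction that is bilinear in the features and z, so it changes with the index."""
    return sum(features[i] * HEAD_W[i][j] * z[j]
               for i in range(len(features)) for j in range(INDEX_DIM))

def predict(x, z):
    """Single forward pass: backbone output plus index-dependent correction."""
    prediction, features = backbone(x)
    return prediction + epistemic_head(features, z)

def predictive_mean_std(x, n_index_samples=200, seed=1):
    """Sample z ~ N(0, I) and summarise the spread of the outputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_index_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(INDEX_DIM)]
        outputs.append(predict(x, z))
    return statistics.mean(outputs), statistics.stdev(outputs)

mean_near, std_near = predictive_mean_std(0.1)
mean_far, std_far = predictive_mean_std(3.0)
```

In this toy the spread grows with the feature magnitude purely because of how the head is wired; in a trained model the head would instead be fitted so that the spread shrinks where data are available.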

If this is right

Composite Bayesian optimization over functional outputs becomes feasible with far lower memory and training cost.
Operator learning models can supply the uncertainty needed for sequential decision making without maintaining multiple independent networks.
The same backbone can be reused across multiple composite problems that share the same functional output space.
Real-time or resource-constrained applications of function-space optimization become practical.
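
The memory argument is simple arithmetic. The sketch below counts the weights and biases of fully connected networks and compares an ensemble of independent copies with one backbone plus a small uncertainty head. All layer sizes and the ensemble size are hypothetical, chosen for illustration and not taken from the paper's experiments, so the resulting ratio says nothing about the savings the paper reports.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# Hypothetical sizes, for illustration only.
backbone_layers = [30, 256, 256, 256, 64]  # one backbone-sized MLP
head_layers = [64 + 8, 32, 32, 1]          # small head on features plus an 8-dim index
ensemble_size = 10

backbone_params = mlp_param_count(backbone_layers)
head_params = mlp_param_count(head_layers)

# An ensemble stores every member; the single-backbone design stores one copy plus the head.
ensemble_total = ensemble_size * backbone_params
single_backbone_total = backbone_params + head_params

print(f"ensemble of {ensemble_size}: {ensemble_total:,} parameters")
print(f"backbone + head: {single_backbone_total:,} parameters")
print(f"ratio: {ensemble_total / single_backbone_total:.1f}x")
```

With these made-up sizes the saving is roughly the ensemble size; larger gaps require the single network to be much smaller than each ensemble member, which is a property of the specific architectures compared.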

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

NEON-style single-backbone uncertainty could be tested on other sequential tasks that already use operator networks, such as control of PDE-governed systems.
If the uncertainty quality generalizes, similar single-network designs might replace ensembles in related operator-learning settings that currently rely on them for calibration.
The method invites direct comparison against other cheap uncertainty mechanisms, such as last-layer Laplace approximations, inside the same composite BO loop.

Load-bearing premise

A single operator network backbone can produce epistemic uncertainty estimates of quality comparable to deep ensembles for the purpose of guiding composite Bayesian optimization.

What would settle it

A controlled composite Bayesian optimization benchmark in which NEON-guided search reaches demonstrably worse final values than an ensemble baseline of matched predictive accuracy.

Figures

Figures reproduced from arXiv: 2404.03099 by Leonardo Ferreira Guilhoto, Paris Perdikaris.

**Figure 1.** Example of h(u) ∈ C([0, 221]^2, ℝ^2) for the Cell Towers problem. The input u ∈ ℝ^{30} encodes transmission parameters of 15 cell towers, which are used to produce the function seen above, where signal intensity and interference are plotted, respectively. This information is then used to compute a score f(u) = g(h(u)) ∈ ℝ, which evaluates the quality of cellular service in the region. By using operator co…

**Figure 2.** Diagrams for the architectures used in this paper. The NEON architecture (top) combines the deterministic …

**Figure 3.** Diagrams representing the two decoders used in the experiments considered in this paper. On the left, the …

**Figure 4.** Experimental results for the Environment Model (left) and Brusselator PDE (right) problems. In both cases …

**Figure 5.** Experimental results for the Optical Interferometer (left) and Cell Towers (right) problems. For the …

**Figure 6.** Full experimental results for the Environmental Modeling (left) and Brusselator PDE (right) problems. The …

**Figure 7.** Best result obtained by NEON for the Optical Interferometer problem. Here we plot the 16 components of …

**Figure 8.** Best result obtained by NEON for the Cell Towers problem. Here we plot the signal strength and interference …

**Figure 9.** Full experimental results for the Optical Interferometer (left) and Cell Towers (right) problems. The dashed …

**Figure 10.** Average results among 5 trials comparing parallel acquisition functions using …

Original abstract

Operator learning is a rising field of scientific computing where inputs or outputs of a machine learning model are functions defined in infinite-dimensional spaces. In this paper, we introduce NEON (Neural Epistemic Operator Networks), an architecture for generating predictions with uncertainty using a single operator network backbone, which presents orders of magnitude less trainable parameters than deep ensembles of comparable performance. We showcase the utility of this method for sequential decision-making by examining the problem of composite Bayesian Optimization (BO), where we aim to optimize a function $f=g\circ h$, where $h:X\to C(\mathcal{Y},\mathbb{R}^{d_s})$ is an unknown map which outputs elements of a function space, and $g: C(\mathcal{Y},\mathbb{R}^{d_s})\to \mathbb{R}$ is a known and cheap-to-compute functional. By comparing our approach to other state-of-the-art methods on toy and real world scenarios, we demonstrate that NEON achieves state-of-the-art performance while requiring orders of magnitude less trainable parameters.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

NEON gives a single operator network epistemic uncertainty for composite Bayesian optimization with far fewer parameters than ensembles, and the experiments support the performance on the tested tasks.

NEON is a single-backbone approach to epistemic uncertainty in operator networks that enables composite Bayesian optimization with far fewer parameters than ensembles, and the experiments on toy and real tasks back up the performance.

What is new is the specific construction of NEON for this purpose. It avoids the need for deep ensembles by building uncertainty into one network, which is a practical step for function-space problems. The paper does well to focus on the composite BO setting, where the goal is to optimize a composition involving an unknown map to a function space. By measuring optimization outcomes rather than just model accuracy, the authors provide relevant evidence for the method's utility in sequential decision making.

Soft spots are limited. The comparisons claim state-of-the-art performance, but the strength of that claim depends on how the baselines were implemented and whether the real-world scenarios are representative. The parameter savings are the clear advantage, but confirming that the uncertainty estimates are of high quality across more cases would help. No circular reasoning or internal contradictions stand out.

This work is for people working on scientific machine learning, particularly those combining operator learning with optimization under uncertainty. It offers value to readers looking for efficient ways to handle epistemic uncertainty in high-dimensional function spaces. I would recommend engaging with it through peer review, as the idea is grounded and the results are presented in a way that allows evaluation.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces NEON (Neural Epistemic Operator Networks), an operator-learning architecture that produces epistemic uncertainty estimates from a single network backbone rather than an ensemble. The method is applied to composite Bayesian optimization of the form f = g ∘ h, where h maps to a function space and g is a known, cheap functional; experiments on toy problems and real-world tasks are reported to show state-of-the-art optimization performance while using orders of magnitude fewer trainable parameters than comparable deep ensembles.

Significance. If the experimental results hold, the work supplies a concrete, parameter-efficient mechanism for epistemic uncertainty in infinite-dimensional operator learning that directly supports sequential decision-making. The evaluation measures optimization regret rather than isolated predictive metrics, and the architecture description supplies an explicit route to uncertainty that is shown to be competitive with ensembles; these elements strengthen the central claim.
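
Regret-style evaluation can be sketched as follows. Given the objective values observed during an optimization run, the best-so-far curve tracks the running maximum, and simple regret is the gap between a reference optimum and that curve. The numbers and the reference optimum `f_star` below are invented for illustration, and the paper's own plots may follow a different convention (for example, reporting the best value directly when the true optimum is unknown).

```python
from itertools import accumulate

def best_so_far(observed_values):
    """Running maximum of the objective values seen up to each iteration."""
    return list(accumulate(observed_values, max))

def simple_regret(observed_values, f_star):
    """Gap between a reference optimum and the best value found so far."""
    return [f_star - best for best in best_so_far(observed_values)]

def mean_curve(curves):
    """Average several equally long curves, one per independent trial."""
    return [sum(step) / len(step) for step in zip(*curves)]

# Made-up objective values from two trials of a maximisation run.
trial_a = [0.25, 0.5, 0.375, 0.875, 0.75]
trial_b = [0.5, 0.25, 0.75, 0.75, 1.0]
f_star = 1.0  # assumed known optimum, used only to define regret

regret_a = simple_regret(trial_a, f_star)
regret_b = simple_regret(trial_b, f_star)
average_regret = mean_curve([regret_a, regret_b])
```

Because the curve depends only on the values of f actually observed, two surrogates with similar predictive accuracy can still produce different regret curves, which is why this metric speaks to decision quality and not to model fit.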

minor comments (2)

The notation for the composite objective (h : X → C(𝒴, ℝ^{d_s})) is introduced in the abstract but would benefit from an explicit reminder in the first paragraph of §3 when the BO acquisition functions are defined.
Figure 2 caption states that NEON uses 'a single backbone'; a one-sentence clarification of how the epistemic head is attached without increasing the parameter count relative to a deterministic operator network would improve readability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their thorough review, positive assessment of the significance of NEON for uncertainty-aware operator learning in composite Bayesian optimization, and recommendation to accept the manuscript.

Circularity Check

0 steps flagged

No significant circularity detected

The paper's central claims rest on introducing the NEON architecture for epistemic uncertainty via a single operator-network backbone and then validating its utility for composite Bayesian optimization through direct empirical comparisons against SOTA baselines on toy and real-world tasks. These comparisons measure optimization performance (not merely internal predictive metrics) and are independent of any self-referential definitions, parameter fits renamed as predictions, or load-bearing self-citations. No equations or architectural choices in the provided description reduce by construction to the target results; the derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities.

[1] [1]

Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian processes for machine learning. Adaptive computation and machine learning. MIT Press, 2006

work page 2006

[2] [2]

Bayesian neural networks: An introduction and survey

Ethan Goan and Clinton Fookes. Bayesian neural networks: An introduction and survey. In Case Studies in Applied Bayesian Data Science, pages 45–87. Springer International Publishing, 2020

work page 2020

[3] [3]

Simple and scalable predictive uncertainty estimation using deep ensembles, 2016

Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles, 2016

work page 2016

[4] [4]

Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators

Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, mar 2021. L. Ferreira Guilhoto & P. Pedikaris 10 A Preprint - April 5, 2024 Composite Bayesian Optimization In Function Spaces Using N...

work page 2021

[5] [5]

Fourier neural operator for parametric partial differential equations, 2021

Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations, 2021

work page 2021

[6] [6]

Neural operator: Learning maps between function spaces with applications to pdes

Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Neural operator: Learning maps between function spaces with applications to pdes. Journal of Machine Learning Research, 24(89):1–97, 2023

work page 2023

[7] [7]

Learning operators with coupled attention

Georgios Kissas, Jacob H Seidman, Leonardo Ferreira Guilhoto, Victor M Preciado, George J Pappas, and Paris Perdikaris. Learning operators with coupled attention. The Journal of Machine Learning Research, 23(1):9636– 9698, 2022

work page 2022

[8] [8]

Learning the solution operator of parametric partial differential equations with physics-informed deeponets

Sifan Wang, Hanwen Wang, and Paris Perdikaris. Learning the solution operator of parametric partial differential equations with physics-informed deeponets. Science advances, 7(40):eabi8605, 2021

work page 2021

[9] [9]

Improved architectures and training algorithms for deep operator networks

Sifan Wang, Hanwen Wang, and Paris Perdikaris. Improved architectures and training algorithms for deep operator networks. Journal of Scientific Computing, 92(2):35, 2022

work page 2022

[10] [10]

Scalable uncertainty quantification for deep operator networks using randomized priors

Yibo Yang, Georgios Kissas, and Paris Perdikaris. Scalable uncertainty quantification for deep operator networks using randomized priors. Computer Methods in Applied Mechanics and Engineering, 399:115399, 2022

work page 2022

[11] [11]

Uncertainty quantification in scientific machine learning: Methods, metrics, and comparisons.Journal of Computational Physics, 477:111902, 2023

Apostolos F Psaros, Xuhui Meng, Zongren Zou, Ling Guo, and George Em Karniadakis. Uncertainty quantification in scientific machine learning: Methods, metrics, and comparisons.Journal of Computational Physics, 477:111902, 2023

work page 2023

[12] [12]

Gomez, Tim G

Angelos Filos, Sebastian Farquhar, Aidan N. Gomez, Tim G. J. Rudner, Zachary Kenton, Lewis Smith, Milad Alizadeh, Arnoud de Kroon, and Yarin Gal. A systematic comparison of bayesian deep learning robustness in diabetic retinopathy tasks, 2019

work page 2019

[13] [13]

Novoa, Justin Ko, Susan M

Andre Esteva, Brett Kuprel, Roberto A. Novoa, Justin Ko, Susan M. Swetter, Helen M. Blau, and Sebastian Thrun. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639):115–118, January 2017

work page 2017

[14] [14]

Autonomous driving with deep learning: A survey of state-of-art technologies, 2020

Yu Huang and Yue Chen. Autonomous driving with deep learning: A survey of state-of-art technologies, 2020

work page 2020

[15] [15]

Bayesian active learning for classification and preference learning, 2011

Neil Houlsby, Ferenc Huszár, Zoubin Ghahramani, and Máté Lengyel. Bayesian active learning for classification and preference learning, 2011

work page 2011

[16] [16]

Batchbald: Efficient and diverse batch acquisition for deep bayesian active learning, 2019

Andreas Kirsch, Joost van Amersfoort, and Yarin Gal. Batchbald: Efficient and diverse batch acquisition for deep bayesian active learning, 2019

work page 2019

[17] [17]

Epistemic neural networks

Ian Osband, Zheng Wen, Mohammad Asghari, Morteza Ibrahimi, Xiyuan Lu, and Benjamin Van Roy. Epistemic neural networks. CoRR, abs/2107.08924, 2021

work page arXiv 2021

[18] [18]

Recent advances in bayesian optimization, 2022

Xilu Wang, Yaochu Jin, Sebastian Schmitt, and Markus Olhofer. Recent advances in bayesian optimization, 2022

work page 2022

[19] [19]

Botorch: Programmable bayesian optimization in pytorch

Maximilian Balandat, Brian Karrer, Daniel R. Jiang, Samuel Daulton, Benjamin Letham, Andrew Gordon Wilson, and Eytan Bakshy. Botorch: Programmable bayesian optimization in pytorch. CoRR, abs/1910.06403, 2019

work page arXiv 2019

[20] [20]

Dropout as a bayesian approximation: Representing model uncertainty in deep learning, 2016

Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning, 2016

work page 2016

[21] [21]

Nomad: Nonlinear manifold decoders for operator learning

Jacob H. Seidman, Georgios Kissas, Paris Perdikaris, and George J. Pappas. Nomad: Nonlinear manifold decoders for operator learning, 2022

work page 2022

[22] [22]

Scalable bayesian optimization with randomized prior networks

Mohamed Aziz Bhouri, Michael Joly, Robert Yu, Soumalya Sarkar, and Paris Perdikaris. Scalable bayesian optimization with randomized prior networks. Computer Methods in Applied Mechanics and Engineering, 417:116428, 2023

work page 2023

[23] [23]

Bayesian optimization with high-dimensional outputs

Wesley J Maddox, Maximilian Balandat, Andrew G Wilson, and Eytan Bakshy. Bayesian optimization with high-dimensional outputs. Advances in neural information processing systems, 34:19274–19287, 2021

work page 2021

[24] [24]

Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems

Tianping Chen and Hong Chen. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks, 6(4):911–917, 1995

work page 1995

[25] [25]

Neural operator prediction of linear instability waves in high-speed boundary layers

Patricio Clark Di Leoni, Lu Lu, Charles Meneveau, George Em Karniadakis, and Tamer A Zaki. Neural operator prediction of linear instability waves in high-speed boundary layers. Journal of Computational Physics, 474:111793, 2023

work page 2023

[26] [26]

Mionet: Learning multiple-input operators via tensor product, 2022

Pengzhan Jin, Shuai Meng, and Lu Lu. Mionet: Learning multiple-input operators via tensor product, 2022

work page 2022

[27] [27]

Raul Astudillo and Peter I. Frazier. Bayesian optimization of composite functions, 2019.

work page 2019

[28] [28]

Joint composite latent space bayesian optimization

Natalie Maus, Zhiyuan Jerry Lin, Maximilian Balandat, and Eytan Bakshy. Joint composite latent space bayesian optimization. arXiv preprint arXiv:2311.02213, 2023

work page arXiv 2023

[29] [29]

Optimizing coverage and capacity in cellular networks using machine learning

Ryan M Dreifuerst, Samuel Daulton, Yuchen Qian, Paul Varkey, Maximilian Balandat, Sanjay Kasturia, Anoop Tomar, Ali Yazdan, Vish Ponnampalam, and Robert W Heath. Optimizing coverage and capacity in cellular networks using machine learning. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 8138–814...

work page 2021

[30] [30]

Deep learning for bayesian optimization of scientific problems with high-dimensional structure

Samuel Kim, Peter Y Lu, Charlotte Loh, Jamie Smith, Jasper Snoek, and Marin Soljačić. Deep learning for bayesian optimization of scientific problems with high-dimensional structure. Transactions on Machine Learning Research, 2022

work page 2022

[31] [31]

Fourier features let networks learn high frequency functions in low dimensional domains

Matthew Tancik, Pratul P. Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, and Ren Ng. Fourier features let networks learn high frequency functions in low dimensional domains. NeurIPS, 2020

work page 2020

[32] [32]

Attention beats concatenation for conditioning neural fields

Daniel Rebain, Mark J. Matthews, Kwang Moo Yi, Gopal Sharma, Dmitry Lagun, and Andrea Tagliasacchi. Attention beats concatenation for conditioning neural fields, 2022

work page 2022

[33] [33]

On the difficulty of training Recurrent Neural Networks

Razvan Pascanu, Tomás Mikolov, and Yoshua Bengio. Understanding the exploding gradient problem. CoRR, abs/1211.5063, 2012

work page internal anchor Pith review Pith/arXiv arXiv 2012

[34] [34]

Rectifier nonlinearities improve neural network acoustic models

Andrew L Maas, Awni Y Hannun, Andrew Y Ng, et al. Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30-1, page 3. Atlanta, GA, 2013

work page 2013

[35] [35]

Unexpected improvements to expected improvement for bayesian optimization

Sebastian Ament, Samuel Daulton, David Eriksson, Maximilian Balandat, and Eytan Bakshy. Unexpected improvements to expected improvement for bayesian optimization. Advances in Neural Information Processing Systems, 36, 2024

work page 2024

[36] [36]

Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design

Niranjan Srinivas, Andreas Krause, Sham M. Kakade, and Matthias W. Seeger. Gaussian process bandits without regret: An experimental design approach. CoRR, abs/0912.3995, 2009

work page internal anchor Pith review Pith/arXiv arXiv 2009

[37] [37]

Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J

Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, İlhan Polat, Yu Feng, Eric W. M...

work page 2020

[38] [38]

On the limited memory bfgs method for large scale optimization

Dong C Liu and Jorge Nocedal. On the limited memory bfgs method for large scale optimization. Mathematical programming, 45(1):503–528, 1989

work page 1989

[39] [39]

Kriging is well-suited to parallelize optimization

David Ginsbourger, Rodolphe Le Riche, and Laurent Carraro. Kriging is well-suited to parallelize optimization. In Computational intelligence in expensive optimization problems, pages 131–162. Springer, 2010

work page 2010

[40] [40]

Differentiable expected hypervolume improvement for parallel multi-objective bayesian optimization

Samuel Daulton, Maximilian Balandat, and Eytan Bakshy. Differentiable expected hypervolume improvement for parallel multi-objective bayesian optimization. Advances in Neural Information Processing Systems, 33:9851–9864, 2020

work page 2020

[41] [41]

Parallel Bayesian Global Optimization of Expensive Functions

Jialei Wang, Scott C Clark, Eric Liu, and Peter I Frazier. Parallel bayesian global optimization of expensive functions. arXiv preprint arXiv:1602.05149, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[42] [43]

py-pde: A python package for solving partial differential equations

David Zwicker. py-pde: A python package for solving partial differential equations. Journal of Open Source Software, 5(48):2158, 2020

work page 2020

[43] [44]

Interferobot: aligning an optical interferometer by a reinforcement learning agent, 2021

Dmitry Sorokin, Alexander Ulanov, Ekaterina Sazhina, and Alexander Lvovsky. Interferobot: aligning an optical interferometer by a reinforcement learning agent, 2021

work page 2021

[44] [45]

JAX: composable transformations of Python+NumPy programs, 2018

James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018

work page 2018

[45] [46]

Flax: A neural network library and ecosystem for JAX, 2023

Jonathan Heek, Anselm Levskaya, Avital Oliver, Marvin Ritter, Bertrand Rondepierre, Andreas Steiner, and Marc van Zee. Flax: A neural network library and ecosystem for JAX, 2023

work page 2023

[46] [47]

J. D. Hunter. Matplotlib: A 2d graphics environment. Computing in Science & Engineering, 9(3):90–95, 2007.

work page 2007

[47] [48]

Harris, K

Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin Shepp...

work page 2020

[48] [49]

Bayesian calibration and uncertainty analysis for computationally expensive models using optimization and radial basis function approximation

Nikolay Bliznyuk, David Ruppert, Christine Shoemaker, Rommel Regis, Stefan Wild, and Pradeep Mugunthan. Bayesian calibration and uncertainty analysis for computationally expensive models using optimization and radial basis function approximation. Journal of Computational and Graphical Statistics, 17(2):270–294, 2008

work page 2008

[49] [50]

Adam: A method for stochastic optimization

Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization, 2017

work page 2017

[50] [51]

Maximizing acquisition functions for bayesian optimization

James T. Wilson, Frank Hutter, and Marc Peter Deisenroth. Maximizing acquisition functions for bayesian optimization, 2018.

Author contributions statement

L.F.G. and P.P. conceived the methodology. L.F.G. conducted the experiments and analysed the results. P.P. provided funding and supervised this study. All authors reviewed the manuscript.

Competing Inte...

work page 2018

[51] [52]

We trained this network for 4,000 steps using full-batch training, the Adam [50] optimizer, and exponential learning rate decay

The EpiNet architecture we used consisted of a trainable MLP with two hidden layers of dimension 32 and, for the prior component, an ensemble of 16 MLPs with two hidden layers of width 5 each and a scale parameter of 1. We trained this network for 4,000 steps using full-batch training, the Adam [50] optimizer, and exponential learning rate decay.

Figure 6: Full expe...

work page 2024
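The EpiNet sizes quoted above can be made concrete with a small forward-pass sketch. It is written in plain NumPy for brevity and is not the authors' implementation. It assumes the generic epinet form of [17]: a trainable MLP whose output is contracted with an epistemic index z, plus a frozen prior ensemble weighted by z and multiplied by the scale parameter. Only the hidden widths (32 and 5), the ensemble size (16), the scale (1), and the 4,000-step budget come from the text. The feature dimension, output dimension, tanh activation, weight initialisation, and the learning-rate constants are hypothetical placeholders, and no training loop is shown.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Create (weight, bias) pairs for an MLP with the given layer sizes."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh on hidden layers (an assumption), linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def count_params(params):
    """Total number of scalar weights and biases in one MLP."""
    return sum(w.size + b.size for w, b in params)

# Hypothetical sizes, not given in the source.
FEAT_DIM = 8   # dimension of the features passed to the EpiNet
OUT_DIM = 1    # dimension of the predicted quantity

# Values quoted in the text.
Z_DIM = 16         # one epistemic-index entry per prior ensemble member
PRIOR_SCALE = 1.0  # scale parameter of the prior component

rng = np.random.default_rng(0)
# Trainable part: two hidden layers of dimension 32.
learnable = init_mlp(rng, [FEAT_DIM + Z_DIM, 32, 32, OUT_DIM * Z_DIM])
# Prior part: 16 small MLPs with two hidden layers of width 5, never trained.
prior = [init_mlp(rng, [FEAT_DIM, 5, 5, OUT_DIM]) for _ in range(Z_DIM)]

def epinet(features, z):
    """EpiNet output for one feature vector and one epistemic index z."""
    # Trainable MLP sees (features, z); its output is reshaped and contracted with z.
    learn_out = mlp(learnable, np.concatenate([features, z])).reshape(OUT_DIM, Z_DIM) @ z
    # Frozen ensemble members are combined linearly using z as weights.
    prior_out = np.stack([mlp(p, features) for p in prior], axis=-1) @ z
    return learn_out + PRIOR_SCALE * prior_out

def exponential_decay(step, init_lr=1e-3, decay_rate=0.9, decay_steps=500):
    """Exponential learning-rate decay; all three constants are hypothetical."""
    return init_lr * decay_rate ** (step / decay_steps)

# Epistemic spread: evaluate the same input under many sampled indices z ~ N(0, I).
features = rng.normal(size=FEAT_DIM)
samples = np.array([epinet(features, rng.normal(size=Z_DIM)) for _ in range(64)])
print("mean:", samples.mean(axis=0), "std:", samples.std(axis=0))
print("learning rate at step 4000:", exponential_decay(4000))
```

In this form the spread of `samples` across draws of z plays the role of the epistemic uncertainty estimate, and setting z to zero switches the EpiNet contribution off entirely. `count_params` gives a quick way to compare the size of the trainable and prior parts under whichever dimensions are chosen.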