Mixtures of Neural Operators Reduce Active Complexity in Operator Learning

Anastasis Kratsios; Jose Antonio Lara Benitez; Maarten de Hoop; Matti Lassas; Takashi Furuya

arxiv: 2404.09101 · v3 · pith:Z24JV6B4new · submitted 2024-04-13 · 💻 cs.LG · cs.AI · cs.NA · math.NA · stat.ML

classification 💻 cs.LG · cs.AI · cs.NA · math.NA · stat.ML

keywords expert · active · approximation · neural · operators · theorem · bounded · comparison

Operator-learning systems are not governed solely by total parameter count; for one query, the relevant bottleneck can be the model that must be loaded and evaluated. We study this distinction for classical neural operators on compact Sobolev subsets through a constructive comparison between routed mixtures of neural operators (MoNOs) and a fixed single-neural-operator construction. The comparison concerns expert-active complexity relative to that baseline, with total stored size and routing search accounted separately. A MoNO routes each input function through a tree to one expert. Our main theorem shows that every scalar uniformly continuous nonlinear operator with bounded output Sobolev radius on the approximation set admits a MoNO approximation whose active expert has smaller depth, width, and rank scaling than the analyzed single-neural-operator construction; for Lipschitz targets these expert quantities are bounded by $\mathcal{O}(\varepsilon^{-1})$. The theorem turns localization into an operator-level accounting of active expert size, routing depth, and number of experts. We also prove a quantitative universal approximation theorem for the underlying neural-operator architecture, with explicit dependence on compact-set diameter and modulus of continuity.
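The distinction between expert-active and total stored size can be illustrated with a toy sketch. This is not the paper's construction: the `Expert`, `TreeRouter`, and `MixtureOfOperators` classes are hypothetical names, each expert is a single affine map on grid samples standing in for a neural operator, and the routing tree uses random projections with zero thresholds purely for illustration. It only shows the accounting: one query traverses `depth` routing nodes and evaluates one expert, while all `2 ** depth` experts count toward total size.

```python
import numpy as np


class Expert:
    """Toy stand-in for one neural operator: an affine map on grid samples (assumption)."""

    def __init__(self, n_grid, rng):
        self.W = rng.standard_normal((n_grid, n_grid)) / n_grid
        self.b = np.zeros(n_grid)

    def __call__(self, u):
        return self.W @ u + self.b

    @property
    def n_params(self):
        return self.W.size + self.b.size


class TreeRouter:
    """Complete binary tree; each internal node thresholds one projection of u (assumption)."""

    def __init__(self, depth, n_grid, rng):
        self.depth = depth
        n_internal = 2 ** depth - 1  # heap-ordered internal nodes
        self.directions = rng.standard_normal((n_internal, n_grid))
        self.thresholds = np.zeros(n_internal)

    def route(self, u):
        node = 0
        for _ in range(self.depth):
            go_right = self.directions[node] @ u > self.thresholds[node]
            node = 2 * node + 1 + int(go_right)
        # Convert the heap index of the reached leaf to a leaf index in [0, 2**depth).
        return node - (2 ** self.depth - 1)


class MixtureOfOperators:
    """Routes each sampled input function to exactly one expert."""

    def __init__(self, depth, n_grid, seed=0):
        rng = np.random.default_rng(seed)
        self.router = TreeRouter(depth, n_grid, rng)
        self.experts = [Expert(n_grid, rng) for _ in range(2 ** depth)]

    def __call__(self, u):
        leaf = self.router.route(u)
        return self.experts[leaf](u), leaf

    def active_params(self):
        # Only one expert is evaluated per query; all experts here have equal size.
        return self.experts[0].n_params

    def total_params(self):
        return sum(e.n_params for e in self.experts)


model = MixtureOfOperators(depth=3, n_grid=16)
x = np.linspace(0.0, 1.0, 16)
u = np.sin(2 * np.pi * x)  # input function sampled on a uniform grid
v, leaf = model(u)
print(leaf, model.active_params(), model.total_params())
```

In this toy setting the active count is 272 parameters per query against 2176 stored, with a routing depth of 3 accounted separately, mirroring the three quantities the abstract tracks: active expert size, routing depth, and number of experts.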

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Approximation Theory of Laplacian-Based Neural Operators for Reaction-Diffusion System
cs.LG 2026-05 unverdicted novelty 7.0

Laplacian eigenfunction-based neural operators approximate the solution operator of the generalized Gierer-Meinhardt reaction-diffusion system with error bounds that imply only polynomial growth in parameters as accur...

Neural equilibria for long-term prediction of nonlinear conservation laws
cs.LG 2025-01 unverdicted novelty 6.0

NeurDE learns the equilibrium closure within a kinetic solver to outperform larger neural models on long-term predictions of nonlinear conservation laws including shocks.

Upper Approximation Bounds for Neural Oscillators
cs.LG 2025-11 unverdicted novelty 5.0

Upper bounds are derived showing that neural oscillator approximation errors for causal operators and stable second-order dynamical systems scale polynomially with the reciprocals of the widths of the two MLPs.