Harness In-Context Operator Learning with Chain of Operators

Ling Guo; Liu Yang; Minghui Yang

arXiv: 2606.12318 · v1 · pith:SI3HKI2Enew · submitted 2026-06-10 · cs.LG · cs.AI

Pith reviewed 2026-06-27 10:43 UTC · model grok-4.3

classification: cs.LG, cs.AI

keywords: in-context learning, neural operators, chain of operators, out-of-distribution generalization, operator learning, PDE approximation, ICON

The pith

A chain of explicit transformations around a frozen in-context operator network cuts error on out-of-distribution tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Chain of Operators (CHOP) as a way to extend In-Context Operator Networks (ICON) beyond their usual limits. ICON adapts to new operators from numerical prompts without retraining, yet it can still fail on out-of-distribution (OOD) cases. CHOP inserts a sequence of closed-form elementary transformations before and after the frozen ICON so that the overall mapping handles the new task. Tests on a scalar conservation law and a mean-field control problem report lower relative inference error than direct ICON use, and a chain built on one PDE family transfers to another.

Core claim

CHOP builds a chain of explicit elementary transformations around a frozen ICON, adapting it to OOD operator tasks without updating its parameters. On a scalar conservation law and a mean-field control problem, the chain reduces relative inference error compared with direct ICON evaluation. Each operator in the chain remains interpretable and in closed form. A chain built on one PDE family generalizes to a different family.
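In the notation of the paper's Figure 1 caption, the chain can be written out compactly. The primed symbols and the hat on the prediction are shorthand introduced here for illustration; the paper's own notation may differ.

```latex
\begin{aligned}
F &= F_m \circ \cdots \circ F_1, \qquad G = G_r \circ \cdots \circ G_1, \\
(C', x'^{*}) &= F(C, x^{*}), \qquad C = \{(x_i, y_i)\}_{i=1}^{D}, \\
\hat{y}^{*} &= G\bigl(\mathrm{ICON}(C', x'^{*})\bigr).
\end{aligned}
```

Only $F$ and $G$ change from task to task; the ICON parameters stay fixed throughout. Where a prediction-side operator undoes a prompt-side one (a value rescaling, for example), $G$ would also need the quantities computed by $F$, which this shorthand leaves implicit.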

What carries the argument

The Chain of Operators (CHOP) framework, which composes explicit elementary transformations with a frozen ICON to form an interpretable chain that adapts the model to new operator tasks.
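A minimal sketch of such a chain, under loudly stated assumptions: `toy_frozen_model` is a hypothetical stand-in, not ICON, and the only transformation pair shown is a value rescaling with its inverse (one of the reversible cases the Figure 2 caption mentions). All function names are invented for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Pair = Tuple[Vector, Vector]  # one demonstration: (condition, target)


def toy_frozen_model(context: Sequence[Pair], query: Vector) -> Vector:
    """Hypothetical stand-in for a frozen in-context model (NOT the real ICON).

    It fits a single gain from the demonstrations by least squares and is
    only reliable for query values in [-1, 1]; larger values are clipped
    to mimic failure outside the training regime.
    """
    num = sum(x * y for xs, ys in context for x, y in zip(xs, ys))
    den = sum(x * x for xs, _ in context for x in xs)
    gain = num / den if den else 0.0
    return [gain * max(-1.0, min(1.0, x)) for x in query]


def prompt_side_rescale(context: Sequence[Pair], query: Vector):
    """Prompt-side operator F: divide conditions, targets and query by one scale."""
    scale = max([abs(v) for xs, _ in context for v in xs] + [abs(v) for v in query])
    scale = scale if scale > 0.0 else 1.0
    scaled_context = [([x / scale for x in xs], [y / scale for y in ys])
                      for xs, ys in context]
    scaled_query = [x / scale for x in query]
    return scaled_context, scaled_query, scale


def prediction_side_unscale(prediction: Vector, scale: float) -> Vector:
    """Prediction-side operator G: undo the value rescaling applied by F."""
    return [scale * p for p in prediction]


def chain_predict(context: Sequence[Pair], query: Vector) -> Vector:
    """F -> frozen model -> G; the model itself is never modified."""
    scaled_context, scaled_query, scale = prompt_side_rescale(context, query)
    raw = toy_frozen_model(scaled_context, scaled_query)
    return prediction_side_unscale(raw, scale)


# Toy task: the hidden operator doubles its input, with inputs outside [-1, 1].
context = [([5.0, -2.5], [10.0, -5.0]), ([2.0, 4.0], [4.0, 8.0])]
query = [4.0, -3.0]
direct = toy_frozen_model(context, query)   # clipped, so it misses the target
chained = chain_predict(context, query)     # rescaled into the reliable range
```

The rescaling works here only because the toy operator is linear, so scaling inputs and outputs by the same factor leaves the task unchanged. Whether a given transformation is compatible with a real operator family is exactly the kind of question the paper's experiments probe.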

If this is right

CHOP achieves lower relative inference error than direct ICON evaluation on the tested OOD tasks.
Each operator inside the chain stays in closed form and remains interpretable.
A chain identified for one PDE family transfers directly to operators from a different family.
No parameter updates or task-specific fine-tuning of the underlying ICON are required.
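The comparison in the list above rests on a relative error metric. This page does not reproduce the paper's exact definition, so the sketch below assumes the common discrete relative L2 norm on a shared grid (the Figure 4 caption does refer to a relative L2 error). The function name and sample values are illustrative only.

```python
import math
from typing import Sequence


def relative_l2_error(prediction: Sequence[float], reference: Sequence[float]) -> float:
    """Return ||prediction - reference||_2 / ||reference||_2 on a shared grid."""
    if len(prediction) != len(reference):
        raise ValueError("prediction and reference must have the same length")
    diff_norm = math.sqrt(sum((p - r) ** 2 for p, r in zip(prediction, reference)))
    ref_norm = math.sqrt(sum(r ** 2 for r in reference))
    if ref_norm == 0.0:
        raise ValueError("reference has zero norm; relative error is undefined")
    return diff_norm / ref_norm


# Illustrative values only, not results from the paper.
reference = [3.0, 4.0]
perfect = relative_l2_error([3.0, 4.0], reference)   # exact prediction
all_zero = relative_l2_error([0.0, 0.0], reference)  # predicting zero everywhere
```

Dividing by the reference norm makes errors comparable across tasks whose solutions differ in magnitude, which matters when direct and chained predictions are compared over several operator families.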

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same chaining pattern could be tested on other in-context models that map between function spaces.
If chain construction can be automated, the method might extend to wider classes of scientific operator problems.
Shared mechanisms across PDE families point to possible common structures that future work could exploit.
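One way the automation idea could be prototyped is leave-one-out scoring on the demonstration pairs already in the prompt: hold out each pair in turn, predict it from the others through each candidate chain, and keep the chain with the lowest mean error. This is not the paper's method (per the simulated rebuttal below, its chains were chosen manually), and every name here, including the toy model, is hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Pair = Tuple[Vector, Vector]
Model = Callable[[Sequence[Pair], Vector], Vector]
Chain = Callable[[Model, Sequence[Pair], Vector], Vector]


def toy_frozen_model(context: Sequence[Pair], query: Vector) -> Vector:
    """Hypothetical stand-in for a frozen model, reliable only on [-1, 1]."""
    num = sum(x * y for xs, ys in context for x, y in zip(xs, ys))
    den = sum(x * x for xs, _ in context for x in xs)
    gain = num / den if den else 0.0
    return [gain * max(-1.0, min(1.0, x)) for x in query]


def identity_chain(model: Model, context: Sequence[Pair], query: Vector) -> Vector:
    """Baseline: call the frozen model directly."""
    return model(context, query)


def rescale_chain(model: Model, context: Sequence[Pair], query: Vector) -> Vector:
    """Candidate chain: shared value rescaling before, inverse rescaling after."""
    scale = max([abs(v) for xs, _ in context for v in xs] + [abs(v) for v in query]) or 1.0
    scaled = [([x / scale for x in xs], [y / scale for y in ys]) for xs, ys in context]
    raw = model(scaled, [x / scale for x in query])
    return [scale * r for r in raw]


def rel_l2(pred: Vector, ref: Vector) -> float:
    num = math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)))
    return num / math.sqrt(sum(r * r for r in ref))


def select_chain(model: Model, candidates: Dict[str, Chain], context: Sequence[Pair]):
    """Score each candidate by leave-one-out error on the demonstrations."""
    scores = {}
    for name, chain in candidates.items():
        errors = []
        for i, (x_held, y_held) in enumerate(context):
            rest = [pair for j, pair in enumerate(context) if j != i]
            errors.append(rel_l2(chain(model, rest, x_held), y_held))
        scores[name] = sum(errors) / len(errors)
    best = min(scores, key=scores.get)
    return best, scores


demos = [([5.0, -2.5], [10.0, -5.0]), ([2.0, 4.0], [4.0, 8.0]), ([3.0, -1.0], [6.0, -2.0])]
candidates = {"identity": identity_chain, "rescale": rescale_chain}
best, scores = select_chain(toy_frozen_model, candidates, demos)
```

The appeal of this design is that it needs no ground truth beyond the prompt itself and never touches model parameters. Its obvious weakness is cost: the number of candidate chains grows quickly with the size of the transformation library and the chain length.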

Load-bearing premise

Suitable chains of explicit elementary transformations exist and can be identified for arbitrary out-of-distribution operator tasks such that the composition yields lower error without any parameter change.

What would settle it

An OOD operator task where no chain of explicit transformations composed with the frozen ICON produces lower error than direct ICON evaluation, or where no such chain can be identified.

Figures

Figures reproduced from arXiv: 2606.12318 by Ling Guo, Liu Yang, Minghui Yang.

**Figure 1.** Chain of Operators (CHOP). A frozen in-context operator network (ICON) is wrapped by a discovered prompt-side operator $F = F_m \circ \cdots \circ F_1$ and prediction-side operator $G = G_r \circ \cdots \circ G_1$, forming the chain $F \to \mathrm{ICON} \to G$. Given an in-context prompt of demonstration pairs $C = \{(x_i, y_i)\}_{i=1}^{D}$ and a query $x^*$, $F$ rewrites the prompt toward the regime where the frozen ICON model is reliable, ICON predicts, and $G$ retu…

**Figure 2.** Induced-operator route of CHOP. Raw ICON attempts to infer $T$ directly from the original prompt. CHOP instead maps the prompt to an induced operator $T'$, predicts in the induced space, and returns the result through $G$. This expression shows the full CHOP operator chain. Some prediction-side operators may reverse prompt-side transformations, such as coordinate changes or value rescaling, but they are not re…

**Figure 3.** Ten-step autoregressive rollout error on three out-of-distribution flux functions.

**Figure 4.** Relative $L^2$ error across input length scales $\ell \in \{0.5, 0.3, 0.1\}$ for the five MFC operators. Although the full chain does not improve the ρ-parameter tasks, this does not imply that all operators in the chain fail on this family. This pattern comes from the value normalization. $F_{\mathrm{value}}$ applies the same rescaling to the condition and the target in each context pair. For the g-parameter tasks, this is compat…

**Figure 5.** g-parameter operator (1, 1) across input length scales $\ell \in \{0.5, 0.3, 0.1\}$.

**Figure 6.** MFC spacetime prediction results at input length scale

**Figure 7.** Residual transfer on the ρ-parameter operator (1, 2) across input length scales $\ell \in \{0.5, 0.3, 0.1\}$. The MFC experiment includes two parameter families, five operator configurations, and both one-dimensional and spacetime targets. In contrast, the conservation-law chain in equation (14) contains translation alignment and mass projection, both closely matched to the conservation-law structure. Accordi…

**Figure 8.** Additional spacetime MFC predictions for the

**Figure 9.** Additional spacetime MFC predictions for the

**Figure 10.** Additional residual-transfer predictions for the

**Figure 11.** Additional residual-transfer predictions for the

**Figure 12.** Ten-step autoregressive rollout for the MFC chain transferred to the conservation-law.
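Figures 3 and 12 report ten-step autoregressive rollout errors. As a rough sketch of what such a measurement involves, the code below feeds a one-step predictor its own output repeatedly and records the relative L2 error against a reference trajectory at each step. The predictor and reference are toy stand-ins (a periodic shift and a slightly damped copy of it), chosen only so that the example runs; they are not the paper's models or PDEs.

```python
import math
from typing import Callable, List

Vector = List[float]
Step = Callable[[Vector], Vector]


def rel_l2(pred: Vector, ref: Vector) -> float:
    num = math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)))
    return num / math.sqrt(sum(r * r for r in ref))


def rollout_errors(step: Step, reference_step: Step, u0: Vector, n_steps: int = 10) -> List[float]:
    """Apply `step` autoregressively and record the error after every step."""
    pred, ref = list(u0), list(u0)
    errors = []
    for _ in range(n_steps):
        pred = step(pred)          # the predictor sees its own previous output
        ref = reference_step(ref)  # the reference advances independently
        errors.append(rel_l2(pred, ref))
    return errors


def exact_shift(u: Vector) -> Vector:
    """Toy reference dynamics: translate a periodic profile by one cell."""
    return u[-1:] + u[:-1]


def damped_shift(u: Vector) -> Vector:
    """Toy imperfect predictor: correct translation with 1% amplitude loss."""
    return [0.99 * v for v in exact_shift(u)]


u0 = [math.sin(2.0 * math.pi * i / 16) for i in range(16)]
errors = rollout_errors(damped_shift, exact_shift, u0, n_steps=10)
```

Because each step consumes the previous prediction, small one-step errors compound, which is why rollout error is a stricter test than single-step error. In this toy case the error after k steps is 1 - 0.99^k, so it grows monotonically.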

Original abstract

Neural operators approximate mappings between function spaces, but often generalize poorly to other operators and usually require fine-tuning or retraining. In-Context Operator Networks (ICON) addresses this issue by prompting the model with numerical context so that the model learns specific operators from prompts and adapt to different operators without fine-tuning. However, ICON may still fail to generalize to out-of-distribution (OOD) operator tasks. Inpired by the success of harness engineering of Large Language models (LLMs), we introduce Chain of Operators (CHOP), a framework that harness a frozen ICON to OOD operator tasks without updating its parameters. Specifically, CHOP constructs a chain of operators consisting of explicit elementary transformations and the frozen ICON. Experiments on a scalar conservation law and a mean-field control problem show that CHOP reduces relative inference error over direct ICON evaluation, while each operator in the chain remains interpretable and in closed form. A chain constructed on one PDE family further generalizes to a different family, indicating shared mechanisms across harness systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CHOP shows a workable composition trick for frozen ICON on two OOD cases but gives no general way to build the chains.

The core new thing here is the CHOP idea: take a frozen ICON and wrap it with a short chain of explicit elementary operators so the whole thing handles out-of-distribution operator tasks without any parameter updates. The experiments claim this cuts relative error on a scalar conservation law and a mean-field control problem, and that a chain built for one PDE family still helps on another. That cross-family bit is the modest new observation.

The paper does the basic job of showing the composition can be useful and that the added operators stay closed-form and readable. That keeps the approach interpretable, which matters in scientific machine learning.

The main gap is exactly what the stress-test note flags. The abstract says CHOP “constructs a chain” but never says how: hand selection, some search, domain knowledge, or something else. Without a procedure or even a clear recipe, the no-fine-tuning claim only really applies to the two problems they already solved. If chain building stays manual or problem-specific, the practical payoff shrinks. The abstract also skips any numbers, baselines, or error bars, so it is hard to gauge the size of the improvement.

This is narrow-scope work inside operator learning. A reader already following ICON papers might want to see the full experiments and any details on chain selection that the abstract omits. It is not ready for broad use yet.

I would send it to review if the full manuscript supplies a reproducible way to pick the operators; otherwise it stays a limited demonstration.

Referee Report

2 major / 2 minor

Summary. The paper proposes Chain of Operators (CHOP) as a framework to adapt a frozen In-Context Operator Network (ICON) to out-of-distribution (OOD) operator tasks by composing it with a chain of explicit, closed-form elementary transformations. Experiments on a scalar conservation law and a mean-field control problem are reported to show reduced relative inference error versus direct ICON evaluation; a chain derived on one PDE family is further claimed to generalize to a different family while preserving interpretability.

Significance. If the empirical improvements and cross-family generalization hold under a reproducible chain-construction procedure, the approach would offer a parameter-free, interpretable route to extend neural-operator generalization without retraining or fine-tuning. The explicit closed-form nature of the elementary operators is a methodological strength that distinguishes it from black-box adaptation methods.

major comments (2)

[Abstract, §3] Abstract and §3 (method): the central claim that CHOP 'constructs a chain' enabling OOD generalization 'without any parameter change or task-specific fine-tuning' is load-bearing, yet no algorithm, search procedure, selection criteria, or automated method for identifying the elementary transformations is described. If chain construction is manual or relies on per-task domain knowledge, the no-fine-tuning property does not extend beyond the two demonstrated cases.
[Abstract] Abstract: the statement that 'CHOP reduces relative inference error' is presented without any quantitative values, baselines, error bars, number of trials, or description of how the chains were constructed for the scalar conservation law and mean-field control experiments, preventing assessment of effect size or reproducibility.

minor comments (2)

[Abstract] The abstract uses 'Inpired' (typo for 'Inspired').
[§2, §3] Notation for the elementary operators and the composition rule should be introduced with explicit mathematical definitions early in §2 or §3 to clarify how the chain is applied to the ICON output.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address the major comments point by point below, proposing revisions to improve clarity and completeness where the comments identify gaps in the current manuscript.

Referee: [Abstract, §3] Abstract and §3 (method): the central claim that CHOP 'constructs a chain' enabling OOD generalization 'without any parameter change or task-specific fine-tuning' is load-bearing, yet no algorithm, search procedure, selection criteria, or automated method for identifying the elementary transformations is described. If chain construction is manual or relies on per-task domain knowledge, the no-fine-tuning property does not extend beyond the two demonstrated cases.

Authors: We agree that the manuscript does not describe an automated algorithm or search procedure for chain construction. The chains presented in the experiments were identified manually by leveraging domain knowledge of the underlying scalar conservation laws and mean-field control problems to select explicit elementary operators that compose with the frozen ICON. This is a genuine limitation of the current work: the no-fine-tuning property holds once a suitable chain is provided, but the paper does not claim or demonstrate an automated discovery method. We will revise §3 to explicitly state that chain construction is currently manual and based on interpretability-driven inspection, while clarifying that the framework itself requires no parameter updates. We will also add a discussion of this as a direction for future work on automated harness engineering.

Revision: yes

Referee: [Abstract] Abstract: the statement that 'CHOP reduces relative inference error' is presented without any quantitative values, baselines, error bars, number of trials, or description of how the chains were constructed for the scalar conservation law and mean-field control experiments, preventing assessment of effect size or reproducibility.

Authors: We agree that the abstract lacks the quantitative details needed for immediate assessment. The full experimental results, including relative error reductions, baselines (direct ICON evaluation), and trial counts, are reported in the experiments section, but the abstract does not summarize them. In the revision we will update the abstract to include specific quantitative improvements (e.g., the observed error reductions on the two tasks), mention the number of trials, and briefly note that chains were constructed via manual domain-informed selection. This will make the abstract self-contained while preserving its length constraints.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical framework with explicit operators

The paper introduces CHOP as a composition of explicit, closed-form elementary transformations with a frozen ICON model. Reported error reductions are shown via experiments on scalar conservation laws and mean-field control problems, with generalization demonstrated across PDE families. No equations, fitted parameters, or derivations reduce the claimed improvements to inputs by construction. The chain construction is presented as feasible for the tested cases without any self-referential definition or load-bearing self-citation that collapses the result. The framework remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the prior existence and prompting capability of ICON plus the unstated assumption that useful elementary operator chains can be found for OOD cases; no new free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: ICON can learn specific operators from numerical context prompts and adapt without fine-tuning.
The paper builds directly on this property of ICON as stated in the abstract.

pith-pipeline@v0.9.1-grok · 5700 in / 1292 out tokens · 25036 ms · 2026-06-27T10:43:42.616323+00:00 · methodology

Reference graph

Works this paper leans on

40 extracted references · 2 linked inside Pith

[1] Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Neural operator: Learning maps between function spaces with applications to PDEs. Journal of Machine Learning Research, 24(89):1–97, 2023.

[2] Somdatta Goswami, Aniruddha Bora, Yue Yu, and George Em Karniadakis. Physics-informed deep neural operator networks. In Machine learning in modeling and simulation: methods and applications, pages 219–254. Springer, 2023.

[3] George Em Karniadakis, Ioannis G Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang. Physics-informed machine learning. Nature Reviews Physics, 3(6):422–440, 2021.

[4] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via deeponet based on the universal approximation theorem of operators. Nature machine intelligence, 3(3):218–229, 2021.

[5] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations. In International Conference on Learning Representations, 2021.

[6] Lu Lu, Xuhui Meng, Shengze Cai, Zhiping Mao, Somdatta Goswami, Zhongqiang Zhang, and George Em Karniadakis. A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data. Computer Methods in Applied Mechanics and Engineering, 393:114778, 2022.

[7] Zongyi Li, Hongkai Zheng, Nikola Kovachki, David Jin, Haoxuan Chen, Burigede Liu, Kamyar Azizzadenesheli, and Anima Anandkumar. Physics-informed neural operator for learning partial differential equations. ACM/IMS Journal of Data Science, 1(3):1–27, 2024.

[8] Zongyi Li, Daniel Zhengyu Huang, Burigede Liu, and Anima Anandkumar. Fourier neural operator with learned deformations for pdes on general geometries. Journal of Machine Learning Research, 24(388):1–26, 2023.

[9] Sifan Wang, Hanwen Wang, and Paris Perdikaris. Learning the solution operator of parametric partial differential equations with physics-informed deeponets. Science advances, 7(40):eabi8605, 2021.

[10] Katiana Kontolati, Somdatta Goswami, George Em Karniadakis, and Michael D Shields. Learning nonlinear operators in latent spaces for real-time predictions of complex dynamics in physical systems. Nature Communications, 15(1):5101, 2024.

[11] Bogdan Raonic, Roberto Molinaro, Tobias Rohner, Siddhartha Mishra, and Emmanuel de Bezenac. Convolutional neural operators. In ICLR 2023 workshop on physics for machine learning, 2023.

[12] Tapas Tripura and Souvik Chakraborty. Wavelet neural operator for solving parametric partial differential equations in computational mechanics problems. Computer Methods in Applied Mechanics and Engineering, 404:115783, 2023.

[13] Min Zhu, Handi Zhang, Anran Jiao, George Em Karniadakis, and Lu Lu. Reliable extrapolation of deep neural operators informed by physics or sparse observations. Computer Methods in Applied Mechanics and Engineering, 412:116064, 2023.

[14] Somdatta Goswami, Katiana Kontolati, Michael D Shields, and George Em Karniadakis. Deep transfer operator learning for partial differential equations under conditional shift. Nature Machine Intelligence, 4(12):1155–1164, 2022.

[15] Zecheng Zhang. Modno: Multi-operator learning with distributed neural operators. Computer Methods in Applied Mechanics and Engineering, 431:117229, 2024.

[16] Jingmin Sun, Zecheng Zhang, and Hayden Schaeffer. Lemon: Learning to learn multi-operator networks. arXiv preprint arXiv:2408.16168, 2024.

[17] Wuyang Chen, Jialin Song, Pu Ren, Shashank Subramanian, Dmitriy Morozov, and Michael W Mahoney. Data-efficient operator learning via unsupervised pretraining and in-context learning. Advances in Neural Information Processing Systems, 37:6213–6245, 2024.

[18] Maximilian Herde, Bogdan Raonić, Tobias Rohner, Roger Käppeli, Roberto Molinaro, Emmanuel De Bezenac, and Siddhartha Mishra. Poseidon: Efficient foundation models for pdes. Advances in Neural Information Processing Systems, 37:72525–72624, 2024.

[19] Zecheng Zhang, Christian Moya, Lu Lu, Guang Lin, and Hayden Schaeffer. Deeponet as a multi-operator extrapolation model: Distributed pretraining with physics-informed fine-tuning. Journal of Computational Physics, page 114537, 2026.

[20] Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, et al. Parameter-efficient fine-tuning of large-scale pre-trained language models. Nature machine intelligence, 5(3):220–235, 2023.

[21] Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, and Roy Lee. Llm-adapters: An adapter family for parameter-efficient fine-tuning of large language models. In Proceedings of the 2023 conference on empirical methods in natural language processing, pages 5254–5276, 2023.

[22] Shashank Subramanian, Peter Harrington, Kurt Keutzer, Wahid Bhimji, Dmitriy Morozov, Michael W Mahoney, and Amir Gholami. Towards foundation models for scientific machine learning: Characterizing scaling and transfer behavior. Advances in Neural Information Processing Systems, 36:71242–71262, 2023.

[23] Zhanhong Ye, Xiang Huang, Leheng Chen, Hongsheng Liu, Zidong Wang, and Bin Dong. PDEformer: Towards a foundation model for one-dimensional partial differential equations. arXiv preprint arXiv:2402.12652, 2024.

[24] Zhongkai Hao, Chang Su, Songming Liu, Julius Berner, Chengyang Ying, Hang Su, Anima Anandkumar, Jian Song, and Jun Zhu. DPOT: Auto-regressive denoising operator transformer for large-scale pde pre-training. arXiv preprint arXiv:2403.03542, 2024.

[25] Ashiqur Rahman, Robert J George, Mogab Elleithy, Daniel Leibovici, Zongyi Li, Boris Bonev, Colin White, Julius Berner, Raymond A Yeh, Jean Kossaifi, et al. Pretraining codomain attention neural operators for solving multiphysics pdes. Advances in Neural Information Processing Systems, 37:104035–104064, 2024.

[26] Yuxuan Liu, Jingmin Sun, Xinjie He, Griffin Pinney, Zecheng Zhang, and Hayden Schaeffer. Prose-fd: A multimodal pde foundation model for learning multiple operators for forecasting fluid dynamics. arXiv preprint arXiv:2409.09811, 2024.

[27] Liu Yang, Siting Liu, Tingwei Meng, and Stanley J Osher. In-context operator learning with data prompts for differential equation problems. Proceedings of the National Academy of Sciences, 120(39):e2310142120, 2023.

[28] Yadi Cao, Yuxuan Liu, Liu Yang, Rose Yu, Hayden Schaeffer, and Stanley Osher. VICON: Vision in-context operator networks for multi-physics fluid dynamics prediction. arXiv preprint arXiv:2411.16063, 2024.

[29] Louis Serrano, Armand Kassaï Koupaï, Thomas X Wang, Pierre Erbacher, and Patrick Gallinari. Zebra: In-context generative pretraining for solving parametric pdes. arXiv preprint arXiv:2410.03437, 2024.

[30] Armand Kassaï Koupaï, Lise Le Boudec, Louis Serrano, and Patrick Gallinari. Enma: Tokenwise autoregression for continuous neural pde operators. Advances in Neural Information Processing Systems, 38:127341–127409, 2026.

[31] Benjamin J Zhang, Siting Liu, Stanley J Osher, and Markos A Katsoulakis. Probabilistic operator learning: generative modeling and uncertainty quantification for foundation models of differential equations. arXiv preprint arXiv:2509.05186, 2025.

[32] Chenghan Wu, Zongmin Yu, Boai Sun, and Liu Yang. Graph in-context operator networks for generalizable spatiotemporal prediction. arXiv preprint arXiv:2603.12725, 2026. (Pith/arXiv)

[33] Tingwei Meng, Moritz Voss, Nils Detering, Giulio Farolfi, Stanley Osher, and Georg Menz. Solving optimal execution problems via in-context operator networks. arXiv preprint arXiv:2501.15106, 2025.

[34] Frank Cole, Dixi Wang, Yineng Chen, Yulong Lu, and Rongjie Lai. In-context operator learning on the space of probability measures. arXiv preprint arXiv:2601.09979, 2026.

[35] Jerry Weihong Liu, N Benjamin Erichson, Kush Bhatia, Michael W Mahoney, and Christopher Re. Does in-context operator learning generalize to domain-shifted settings? In The symbiosis of deep learning and differential equations III, 2023.

[36] Frank Cole, Yulong Lu, Wuzhe Xu, and Tianhao Zhang. In-context learning of linear systems: Generalization theory and applications to operator learning. arXiv preprint arXiv:2409.12293, 2024.

[37] Liu Yang and Stanley J Osher. PDE generalization of in-context operator networks: A study on 1d scalar nonlinear conservation laws. Journal of Computational Physics, 519:113379, 2024.

[38] Liu Yang, Siting Liu, and Stanley J Osher. Fine-tune language models as multi-modal differential equation solvers. Neural Networks, 188:107455, 2025.

[39] Zongmin Yu and Liu Yang. Evolutionary ensemble of agents. arXiv preprint arXiv:2605.09018, 2026. (Pith/arXiv)

[40] Guang-Shan Jiang and Chi-Wang Shu. Efficient implementation of weighted eno schemes. Journal of computational physics, 126(1):202–228, 1996.

A Supplementary numerical details. A.1 Additional MFC examples at harder smoothness levels. Figures 8 and 9 provide additional MFC predictions at the harder length scales $\ell \in \{0.3, 0.1\}$. Figures 10 and 11 supplement th...

[1] [1]

Neural operator: Learning maps between function spaces with applications to PDEs.Journal of Machine Learning Research, 24(89):1–97, 2023

Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Neural operator: Learning maps between function spaces with applications to PDEs.Journal of Machine Learning Research, 24(89):1–97, 2023

2023

[2] [2]

Physics-informed deep neural operator networks

Somdatta Goswami, Aniruddha Bora, Yue Yu, and George Em Karniadakis. Physics-informed deep neural operator networks. InMachine learning in modeling and simulation: methods and applications, pages 219–254. Springer, 2023

2023

[3] [3]

Physics-informed machine learning.Nature Reviews Physics, 3(6):422–440, 2021

George Em Karniadakis, Ioannis G Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang. Physics-informed machine learning.Nature Reviews Physics, 3(6):422–440, 2021

2021

[4] [4]

Learning nonlinear operators via deeponet based on the universal approximation theorem of operators

Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via deeponet based on the universal approximation theorem of operators. Nature machine intelligence, 3(3):218–229, 2021

2021

[5] [5]

Fourier neural operator for parametric partial dif- ferential equations

Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial dif- ferential equations. InInternational Conference on Learning Representations, 2021

2021

[6] [6]

Lu Lu, Xuhui Meng, Shengze Cai, Zhiping Mao, Somdatta Goswami, Zhongqiang Zhang, and George Em Karniadakis. A comprehensive and fair comparison of two neural operators 16 (with practical extensions) based on fair data.Computer Methods in Applied Mechanics and Engineering, 393:114778, 2022

2022

[7] Zongyi Li, Hongkai Zheng, Nikola Kovachki, David Jin, Haoxuan Chen, Burigede Liu, Kamyar Azizzadenesheli, and Anima Anandkumar. Physics-informed neural operator for learning partial differential equations. ACM/IMS Journal of Data Science, 1(3):1–27, 2024.

[8] Zongyi Li, Daniel Zhengyu Huang, Burigede Liu, and Anima Anandkumar. Fourier neural operator with learned deformations for pdes on general geometries. Journal of Machine Learning Research, 24(388):1–26, 2023.

[9] Sifan Wang, Hanwen Wang, and Paris Perdikaris. Learning the solution operator of parametric partial differential equations with physics-informed deeponets. Science advances, 7(40):eabi8605, 2021.

[10] Katiana Kontolati, Somdatta Goswami, George Em Karniadakis, and Michael D Shields. Learning nonlinear operators in latent spaces for real-time predictions of complex dynamics in physical systems. Nature Communications, 15(1):5101, 2024.

[11] Bogdan Raonic, Roberto Molinaro, Tobias Rohner, Siddhartha Mishra, and Emmanuel de Bezenac. Convolutional neural operators. In ICLR 2023 workshop on physics for machine learning, 2023.

[12] Tapas Tripura and Souvik Chakraborty. Wavelet neural operator for solving parametric partial differential equations in computational mechanics problems. Computer Methods in Applied Mechanics and Engineering, 404:115783, 2023.

[13] Min Zhu, Handi Zhang, Anran Jiao, George Em Karniadakis, and Lu Lu. Reliable extrapolation of deep neural operators informed by physics or sparse observations. Computer Methods in Applied Mechanics and Engineering, 412:116064, 2023.

[14] Somdatta Goswami, Katiana Kontolati, Michael D Shields, and George Em Karniadakis. Deep transfer operator learning for partial differential equations under conditional shift. Nature Machine Intelligence, 4(12):1155–1164, 2022.

[15] Zecheng Zhang. Modno: Multi-operator learning with distributed neural operators. Computer Methods in Applied Mechanics and Engineering, 431:117229, 2024.

[16] Jingmin Sun, Zecheng Zhang, and Hayden Schaeffer. Lemon: Learning to learn multi-operator networks. arXiv preprint arXiv:2408.16168, 2024.

[17] Wuyang Chen, Jialin Song, Pu Ren, Shashank Subramanian, Dmitriy Morozov, and Michael W Mahoney. Data-efficient operator learning via unsupervised pretraining and in-context learning. Advances in Neural Information Processing Systems, 37:6213–6245, 2024.

[18] Maximilian Herde, Bogdan Raonić, Tobias Rohner, Roger Käppeli, Roberto Molinaro, Emmanuel De Bezenac, and Siddhartha Mishra. Poseidon: Efficient foundation models for pdes. Advances in Neural Information Processing Systems, 37:72525–72624, 2024.

[19] Zecheng Zhang, Christian Moya, Lu Lu, Guang Lin, and Hayden Schaeffer. Deeponet as a multi-operator extrapolation model: Distributed pretraining with physics-informed fine-tuning. Journal of Computational Physics, page 114537, 2026.

[20] Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, et al. Parameter-efficient fine-tuning of large-scale pre-trained language models. Nature machine intelligence, 5(3):220–235, 2023.

[21] Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, and Roy Lee. Llm-adapters: An adapter family for parameter-efficient fine-tuning of large language models. In Proceedings of the 2023 conference on empirical methods in natural language processing, pages 5254–5276, 2023.

[22] Shashank Subramanian, Peter Harrington, Kurt Keutzer, Wahid Bhimji, Dmitriy Morozov, Michael W Mahoney, and Amir Gholami. Towards foundation models for scientific machine learning: Characterizing scaling and transfer behavior. Advances in Neural Information Processing Systems, 36:71242–71262, 2023.

[23] Zhanhong Ye, Xiang Huang, Leheng Chen, Hongsheng Liu, Zidong Wang, and Bin Dong. PDEformer: Towards a foundation model for one-dimensional partial differential equations. arXiv preprint arXiv:2402.12652, 2024.

[24] Zhongkai Hao, Chang Su, Songming Liu, Julius Berner, Chengyang Ying, Hang Su, Anima Anandkumar, Jian Song, and Jun Zhu. DPOT: Auto-regressive denoising operator transformer for large-scale pde pre-training. arXiv preprint arXiv:2403.03542, 2024.

[25] Ashiqur Rahman, Robert J George, Mogab Elleithy, Daniel Leibovici, Zongyi Li, Boris Bonev, Colin White, Julius Berner, Raymond A Yeh, Jean Kossaifi, et al. Pretraining codomain attention neural operators for solving multiphysics pdes. Advances in Neural Information Processing Systems, 37:104035–104064, 2024.

[26] Yuxuan Liu, Jingmin Sun, Xinjie He, Griffin Pinney, Zecheng Zhang, and Hayden Schaeffer. Prose-fd: A multimodal pde foundation model for learning multiple operators for forecasting fluid dynamics. arXiv preprint arXiv:2409.09811, 2024.

[27] Liu Yang, Siting Liu, Tingwei Meng, and Stanley J Osher. In-context operator learning with data prompts for differential equation problems. Proceedings of the National Academy of Sciences, 120(39):e2310142120, 2023.

[28] Yadi Cao, Yuxuan Liu, Liu Yang, Rose Yu, Hayden Schaeffer, and Stanley Osher. VICON: Vision in-context operator networks for multi-physics fluid dynamics prediction. arXiv preprint arXiv:2411.16063, 2024.

[29] Louis Serrano, Armand Kassaï Koupaï, Thomas X Wang, Pierre Erbacher, and Patrick Gallinari. Zebra: In-context generative pretraining for solving parametric pdes. arXiv preprint arXiv:2410.03437, 2024.

[30] Armand Kassaï Koupaï, Lise Le Boudec, Louis Serrano, and Patrick Gallinari. Enma: Tokenwise autoregression for continuous neural pde operators. Advances in Neural Information Processing Systems, 38:127341–127409, 2026.

[31] Benjamin J Zhang, Siting Liu, Stanley J Osher, and Markos A Katsoulakis. Probabilistic operator learning: generative modeling and uncertainty quantification for foundation models of differential equations. arXiv preprint arXiv:2509.05186, 2025.

[32] Chenghan Wu, Zongmin Yu, Boai Sun, and Liu Yang. Graph in-context operator networks for generalizable spatiotemporal prediction. arXiv preprint arXiv:2603.12725, 2026.

[33] Tingwei Meng, Moritz Voss, Nils Detering, Giulio Farolfi, Stanley Osher, and Georg Menz. Solving optimal execution problems via in-context operator networks. arXiv preprint arXiv:2501.15106, 2025.

[34] Frank Cole, Dixi Wang, Yineng Chen, Yulong Lu, and Rongjie Lai. In-context operator learning on the space of probability measures. arXiv preprint arXiv:2601.09979, 2026.

[35] Jerry Weihong Liu, N Benjamin Erichson, Kush Bhatia, Michael W Mahoney, and Christopher Re. Does in-context operator learning generalize to domain-shifted settings? In The symbiosis of deep learning and differential equations III, 2023.

[36] Frank Cole, Yulong Lu, Wuzhe Xu, and Tianhao Zhang. In-context learning of linear systems: Generalization theory and applications to operator learning. arXiv preprint arXiv:2409.12293, 2024.

[37] Liu Yang and Stanley J Osher. PDE generalization of in-context operator networks: A study on 1d scalar nonlinear conservation laws. Journal of Computational Physics, 519:113379, 2024.

[38] Liu Yang, Siting Liu, and Stanley J Osher. Fine-tune language models as multi-modal differential equation solvers. Neural Networks, 188:107455, 2025.

[39] Zongmin Yu and Liu Yang. Evolutionary ensemble of agents. arXiv preprint arXiv:2605.09018, 2026.

[40] Guang-Shan Jiang and Chi-Wang Shu. Efficient implementation of weighted eno schemes. Journal of computational physics, 126(1):202–228, 1996.

A Supplementary numerical details

A.1 Additional MFC examples at harder smoothness levels

Figures 8 and 9 provide additional MFC predictions at the harder length scales ℓ ∈ {0.3, 0.1}. Figures 10 and 11 supplement th...