Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks

Azza Fadhel; Jana Doppa; The Hung Tran; Trong Nghia Hoang

arxiv: 2604.12325 · v3 · pith:F5JPWDPMnew · submitted 2026-04-14 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-22 11:07 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords offline black-box optimization · meta learning · synthetic task generation · Gaussian process · small data regimes · surrogate model · optimization bias · fine-tuning

The pith

OptBias meta-learns optimization bias from Gaussian process synthetic tasks, then fine-tunes the surrogate on small target data to improve offline black-box optimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tackles offline black-box optimization when only small or low-quality experimental datasets are available, as is common in molecule or material design. Existing surrogate models often fail to learn which candidate designs rank higher under these constraints. OptBias first trains on many synthetic tasks drawn from a Gaussian process to acquire a general ranking ability called optimization bias. It then adapts this pretrained surrogate to the specific small dataset of the real target task. Experiments across continuous and discrete benchmarks show consistent gains over prior methods in the small-data regime.

Core claim

The central claim is that a surrogate model can acquire a reusable optimization bias—its ability to correctly rank input designs—by meta-training on synthetic tasks generated from a Gaussian process, and that fine-tuning this biased surrogate on limited real data yields better identification of optimal designs in data-scarce offline black-box optimization settings.

What carries the argument

OptBias, the meta-learning framework that generates synthetic tasks from a Gaussian process to instill optimization bias in the surrogate before fine-tuning on the target task's small dataset.
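
As a rough illustration of what "synthetic tasks generated from a Gaussian process" can mean in practice, the sketch below draws a few random objectives from a zero-mean GP prior on a shared set of candidate designs. It is not the paper's Sim4Opt procedure. The RBF kernel, its lengthscale, the 2-D unit-square design space, the task count, and all function names are assumptions made for the example.

```python
import math
import random


def rbf_kernel(x, y, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel between two points given as lists of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return variance * math.exp(-0.5 * sq_dist / lengthscale ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_gp_task(inputs, rng, jitter=1e-6):
    """Draw one synthetic objective, evaluated at `inputs`, from a zero-mean GP prior."""
    n = len(inputs)
    cov = [[rbf_kernel(inputs[i], inputs[j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += jitter  # small diagonal term for numerical stability
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # f = L z has covariance L L^T, which equals cov.
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
designs = [[rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)] for _ in range(20)]
# Each synthetic task pairs the shared designs with a fresh GP draw (assumed setup).
synthetic_tasks = [(designs, sample_gp_task(designs, rng)) for _ in range(5)]
```

Each `(designs, values)` pair could then play the role of one meta-training task, with the small real dataset reserved for the fine-tuning stage.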

If this is right

The fine-tuned surrogate more accurately ranks candidate designs, leading to higher-quality optima found from small offline datasets.
Performance advantages appear in both continuous and discrete optimization benchmarks when data is limited.
The approach offers a practical route to reuse meta-learned ranking knowledge across multiple scientific design problems that share similar data constraints.
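
Since the argument turns on how well a surrogate ranks designs rather than how closely it predicts their values, it may help to see one simple way ranking quality could be scored. The sketch below computes the fraction of design pairs that a surrogate orders the same way as the oracle. This metric and the function name are assumptions for illustration; the source does not say which ranking measure the paper uses, and the numbers are placeholders, not results.

```python
from itertools import combinations


def pairwise_ranking_accuracy(true_scores, predicted_scores):
    """Fraction of design pairs that the surrogate orders the same way as the oracle.

    Pairs tied under the true scores are skipped; a predicted tie counts as a miss.
    """
    agree = 0
    total = 0
    for i, j in combinations(range(len(true_scores)), 2):
        true_diff = true_scores[i] - true_scores[j]
        if true_diff == 0:
            continue
        pred_diff = predicted_scores[i] - predicted_scores[j]
        total += 1
        if true_diff * pred_diff > 0:
            agree += 1
    return agree / total if total else float("nan")


# Illustrative numbers only, not results from the paper.
oracle = [0.1, 0.4, 0.35, 0.8]
surrogate = [0.0, 0.5, 0.6, 0.9]
accuracy = pairwise_ranking_accuracy(oracle, surrogate)
```

A surrogate whose predictions are a monotone rescaling of the oracle scores gets a perfect score here even if its absolute errors are large, which is the sense in which ranking ability and regression accuracy can come apart.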

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The success implies that broad properties of ranking landscapes can transfer from simple Gaussian process priors to real-world discrete or structured problems.
One could test whether replacing the Gaussian process with other generative models for synthetic tasks further widens the performance gap on non-smooth objectives.
The same pretraining idea might reduce sample needs in sequential settings where new data points are acquired adaptively rather than given all at once.

Load-bearing premise

That synthetic tasks generated from a Gaussian process sufficiently capture the optimization bias (ranking ability) needed for effective fine-tuning on real small experimental datasets in target tasks.
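
For reference, the smoothness and stationarity properties at stake can be stated with the standard zero-mean GP prior. The squared-exponential kernel below is an assumed example; the source does not say which kernel OptBias uses.

```latex
f \sim \mathcal{GP}\bigl(0,\, k\bigr), \qquad
k(x, x') = \sigma^2 \exp\!\left(-\frac{\lVert x - x' \rVert^2}{2\ell^2}\right)
```

This kernel is stationary because it depends on $x$ and $x'$ only through $x - x'$, and its samples are smooth, with the lengthscale $\ell$ setting how quickly function values decorrelate. A target objective with jumps or location-dependent roughness falls outside this family, which is where the premise is most exposed.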

What would settle it

A target benchmark whose objective landscape deviates strongly from Gaussian process assumptions, such as one with sharp discontinuities or extreme multimodality, where OptBias shows no improvement or degrades relative to a surrogate trained only on the small real dataset.

Figures

Figures reproduced from arXiv: 2604.12325 by Azza Fadhel, Jana Doppa, The Hung Tran, Trong Nghia Hoang.

**Figure 2.** Workflow overview of OptBias. First, we use a novel Sim4Opt procedure to generate synthetic functions which are similar to the oracle function. Then, we combine meta-learning with gradient matching (MatchOpt) to distill the common parts across gradient fields of the synthetic functions into a surrogate, thus implicitly aligning it with the (unobserved) gradient field of the oracle function.

**Figure 3.** Empirical distribution of oracle-evaluated …

Original abstract

We consider the problem of offline black-box optimization, where the goal is to discover optimal designs (e.g., molecules or materials) from past experimental data. A key challenge in this setting is data scarcity: in many scientific applications, only small or poor-quality datasets are available, which severely limits the effectiveness of existing algorithms. Prior work has theoretically and empirically shown that performance of offline optimization algorithms depends on how well the surrogate model captures the optimization bias (i.e., ability to rank input designs correctly), which is challenging to accomplish with limited experimental data. This paper proposes Surrogate Learning with Optimization Bias via Synthetic Task Generation (OptBias), a meta-learning framework that directly tackles data scarcity. OptBias learns a reusable optimization bias by training on synthetic tasks generated from a Gaussian process, and then fine-tunes the surrogate model on the small data for the target task. Across diverse continuous and discrete offline optimization benchmarks, OptBias consistently outperforms state-of-the-art baselines in small data regimes. These results highlight OptBias as a robust and practical solution for offline optimization in realistic small data settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

OptBias meta-learns a ranking bias for offline BBO surrogates by pre-training on GP-generated synthetic tasks and then fine-tuning on small real data. It reports gains in low-data regimes, but whether that bias transfers from smooth GP landscapes to structured targets is the main open question.

The core idea here is that OptBias meta-trains a surrogate to pick up optimization bias, specifically the ability to rank designs, by training on tasks sampled from a Gaussian process prior, then fine-tunes that model on the small target dataset. This is meant to help when real experimental data for molecule or material design is too limited for standard offline optimizers to work well. The paper shows consistent outperformance over baselines on a mix of continuous and discrete benchmarks in those small-data settings, which is the practical payoff they emphasize.

Using independent GP samples for the meta phase is a clean way to generate training signal without leaning on the scarce real data, and it directly addresses the ranking issue that earlier theory work flagged as central. That part feels like a useful, targeted extension of meta-learning to this bias-learning problem. The results look promising enough on the reported benchmarks to warrant attention from people who actually run these design loops.

The softer part is whether the GP prior actually teaches a bias that carries over. Real scientific landscapes often have discrete jumps, sharp changes, or non-stationary behavior that a stationary GP does not reproduce. If the learned ranking skill is mostly tuned to GP-like smoothness, the fine-tuning step on the real data could be carrying the gains, and the meta-training might add less than the headline claims suggest. I would want to see ablations that isolate the meta component against plain fine-tuning or alternative task generators, plus checks on how well the surrogate ranks held-out real structures. The experimental section apparently covers diverse cases, but without full details on variance or significance tests it is hard to judge how robust the edge is. Citation choices look standard for the area and do not seem to overclaim prior coverage.

This paper is aimed at applied folks working on offline black-box optimization for scientific design with limited data. A reader who needs concrete methods for small-data surrogate improvement would get something usable from the framework and the benchmark results. It deserves peer review because the problem is real, the method is straightforward to implement, and the empirical angle is there even if the transfer assumption needs more direct testing.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces OptBias, a meta-learning framework for offline black-box optimization from small datasets. It generates synthetic tasks from a Gaussian process prior to meta-train a surrogate model that learns an optimization bias (ranking ability over designs), then fine-tunes this surrogate on the limited target-task data. The approach is evaluated on diverse continuous and discrete benchmarks, where it reports consistent outperformance over state-of-the-art baselines in small-data regimes.

Significance. If the central transfer claim holds, the work offers a practical solution to data scarcity in scientific optimization tasks such as molecule and material design. It builds directly on prior theoretical results linking surrogate ranking performance to optimization success and provides a concrete mechanism (GP-based synthetic meta-tasks) to bootstrap that ranking ability before fine-tuning. The method is reproducible in principle via the described GP sampling and fine-tuning procedure, and the benchmark results, if statistically robust, would constitute a useful empirical advance for low-data offline optimization.

major comments (2)

§3.2 (Synthetic Task Generation): The claim that GP-sampled tasks produce a transferable optimization bias rests on the untested assumption that GP smoothness and stationarity suffice to capture ranking behavior on real benchmarks that may contain discrete structure, sharp discontinuities, or non-stationarity. No theoretical bound or ablation is provided showing that the learned bias is not an artifact of the GP prior; if it is, fine-tuning alone could explain the reported gains, weakening the meta-learning contribution.
§5 (Experiments): The reported outperformance on small-data regimes lacks ablations that isolate the meta-pretraining step from the fine-tuning step alone, and provides no statistical significance tests or error bars across multiple random seeds. Without these, it is impossible to determine whether the gains are attributable to the synthetic-task meta-learning or to variance in the small target datasets.

minor comments (2)

Abstract and §1: The specific benchmarks (continuous and discrete) are not enumerated; listing them explicitly would improve reproducibility.
Notation in §3: The distinction between the meta-trained surrogate parameters and the fine-tuned parameters is not always clear in the equations; a single consistent symbol table would help.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their thoughtful and constructive comments. We address each major comment below and indicate the revisions made to strengthen the manuscript.

Referee: §3.2 (Synthetic Task Generation): The claim that GP-sampled tasks produce a transferable optimization bias rests on the untested assumption that GP smoothness and stationarity suffice to capture ranking behavior on real benchmarks that may contain discrete structure, sharp discontinuities, or non-stationarity. No theoretical bound or ablation is provided showing that the learned bias is not an artifact of the GP prior; if it is, fine-tuning alone could explain the reported gains, weakening the meta-learning contribution.

Authors: We agree that the Gaussian process prior relies on assumptions of smoothness and stationarity that may not perfectly align with all real benchmarks exhibiting discrete structures or non-stationarities. Our empirical results across both continuous and discrete benchmarks show consistent transfer of the learned ranking bias, suggesting it captures practically useful optimization properties. To isolate the meta-learning contribution, we have added an ablation comparing OptBias to a fine-tuning-only baseline without synthetic-task pretraining. We do not provide a formal theoretical bound on transferability, as the work is primarily empirical in focus.

Revision: partial

Referee: §5 (Experiments): The reported outperformance on small-data regimes lacks ablations that isolate the meta-pretraining step from the fine-tuning step alone, and provides no statistical significance tests or error bars across multiple random seeds. Without these, it is impossible to determine whether the gains are attributable to the synthetic-task meta-learning or to variance in the small target datasets.

Authors: We acknowledge these limitations in the original experimental design. In the revised manuscript, we have added error bars representing standard deviations across multiple random seeds and included statistical significance tests (paired t-tests) to support the reported improvements. We have also incorporated an ablation study that applies only fine-tuning on the target-task data, without the meta-pretraining on GP synthetic tasks, to demonstrate that the gains stem from the meta-learning step rather than fine-tuning or dataset variance alone.

Revision: yes
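
The paired t-test mentioned in the rebuttal compares two methods seed by seed, so that variation shared across seeds cancels out. The sketch below computes the paired t statistic with the Python standard library. The function name is hypothetical and the per-seed scores are placeholders for illustration, not numbers from the paper.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for per-seed score differences."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both methods need one score per seed")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if std_diff == 0:
        raise ValueError("all differences are identical; t statistic is undefined")
    t_stat = mean_diff / (std_diff / math.sqrt(n))
    return t_stat, n - 1


# Placeholder per-seed scores for illustration; these are not numbers from the paper.
with_meta = [0.62, 0.58, 0.65, 0.60, 0.63]
finetune_only = [0.55, 0.57, 0.59, 0.54, 0.60]
t_stat, dof = paired_t_statistic(with_meta, finetune_only)
```

The p-value follows from the Student t distribution with `dof` degrees of freedom; if SciPy is available, `scipy.stats.ttest_rel` returns both the statistic and the p-value directly. With only a handful of seeds the test has limited power, so reporting the per-seed differences alongside it is a reasonable safeguard.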

standing simulated objections not resolved

Providing a formal theoretical bound on the transferability of the optimization bias learned from GP synthetic tasks to real benchmarks with arbitrary discrete or non-stationary properties.

Circularity Check

0 steps flagged

No significant circularity; method uses independent GP synthetic tasks

The paper generates synthetic tasks from a Gaussian process prior that is independent of the target offline datasets, meta-trains a surrogate to capture optimization bias on those tasks, and then fine-tunes on the small real data before evaluating on external benchmarks. No derivation step reduces a claimed prediction or result to a fitted quantity or self-referential definition by construction. The reference to prior work establishing the importance of optimization bias is not load-bearing for the central claims, which rest on the empirical transfer from GP-generated tasks to real benchmarks rather than on any self-citation chain or ansatz smuggled from prior author work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that GP-generated synthetic tasks can transfer useful optimization bias to real small datasets; no free parameters or invented entities are explicitly described in the abstract.

axioms (1)

domain assumption: Gaussian process models can generate synthetic tasks that capture relevant optimization bias for real-world small experimental datasets.
Invoked to create training data for the meta-learning stage before fine-tuning on target small data.

pith-pipeline@v0.9.0 · 5726 in / 1165 out tokens · 37253 ms · 2026-05-22T11:07:02.670805+00:00 · methodology
