Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks
Pith reviewed 2026-05-22 11:07 UTC · model grok-4.3
The pith
OptBias meta-learns an optimization bias from synthetic tasks sampled from a Gaussian process, then fine-tunes the surrogate on the small target dataset to improve offline black-box optimization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a surrogate model can acquire a reusable optimization bias (the ability to rank input designs correctly) by meta-training on synthetic tasks generated from a Gaussian process. Fine-tuning this pre-trained surrogate on limited real data then identifies optimal designs more reliably in data-scarce offline black-box optimization settings.
What carries the argument
OptBias, the meta-learning framework that generates synthetic tasks from a Gaussian process to instill optimization bias in the surrogate before fine-tuning on the target task's small dataset.
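The synthetic-task step can be pictured with a minimal sketch, assuming a squared-exponential (RBF) kernel, inputs drawn uniformly from the unit cube, and a randomly chosen lengthscale per task. None of these choices is confirmed by the page; the function names (`rbf_kernel`, `cholesky`, `sample_synthetic_task`) are hypothetical, and the paper's actual generator may differ.

```python
import math
import random

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two points (lists of floats)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return variance * math.exp(-0.5 * sq_dist / lengthscale ** 2)

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower

def sample_synthetic_task(n_points, dim, rng, lengthscale=1.0, jitter=1e-6):
    """Return (inputs, values) for one synthetic task drawn from a GP prior."""
    # Assumption: designs live in [0, 1]^dim and are sampled uniformly.
    xs = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    cov = [[rbf_kernel(a, b, lengthscale) for b in xs] for a in xs]
    for i in range(n_points):
        cov[i][i] += jitter  # small diagonal term for numerical stability
    lower = cholesky(cov)
    # A zero-mean GP draw is L @ z with z standard normal.
    z = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
    ys = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n_points)]
    return xs, ys

rng = random.Random(0)
# Assumption: each task gets its own lengthscale so landscapes vary in smoothness.
tasks = [
    sample_synthetic_task(16, 2, rng, lengthscale=rng.uniform(0.2, 1.0))
    for _ in range(4)
]
```

Each element of `tasks` is a small labelled dataset that a surrogate could be meta-trained on before it sees any target data. Varying the lengthscale is one plausible way to expose the surrogate to a range of landscape shapes.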
If this is right
- The fine-tuned surrogate more accurately ranks candidate designs, leading to higher-quality optima found from small offline datasets.
- Performance advantages appear in both continuous and discrete optimization benchmarks when data is limited.
- The approach offers a practical route to reuse meta-learned ranking knowledge across multiple scientific design problems that share similar data constraints.
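The first consequence above turns on how accurately a surrogate ranks designs. One common way to quantify this, offered here as an illustrative sketch and not as the paper's metric, is pairwise ranking accuracy: the fraction of design pairs whose order the surrogate gets right. The scores below are made-up toy values.

```python
from itertools import combinations

def pairwise_ranking_accuracy(true_scores, predicted_scores):
    """Fraction of design pairs ordered correctly by the surrogate.

    Pairs that are tied in the true scores are skipped.
    """
    correct = 0
    total = 0
    for i, j in combinations(range(len(true_scores)), 2):
        true_diff = true_scores[i] - true_scores[j]
        if true_diff == 0:
            continue
        pred_diff = predicted_scores[i] - predicted_scores[j]
        total += 1
        if true_diff * pred_diff > 0:
            correct += 1
    return correct / total if total else float("nan")

# Toy values for illustration only; they are not taken from the paper.
true_scores = [0.1, 0.4, 0.35, 0.8]
surrogate_a = [1.0, 4.0, 3.0, 9.0]    # wrong scale, but identical ordering
surrogate_b = [0.1, 0.4, 0.35, 0.2]   # exact on three designs, misranks the best one

acc_a = pairwise_ranking_accuracy(true_scores, surrogate_a)
acc_b = pairwise_ranking_accuracy(true_scores, surrogate_b)
print(acc_a, acc_b)
```

The toy case shows why ranking matters more than pointwise error for optimization: `surrogate_a` is badly calibrated yet would pick the true best design, while `surrogate_b` matches three values exactly and still steers the search away from the optimum.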
Where Pith is reading between the lines
- The success implies that broad properties of ranking landscapes can transfer from simple Gaussian process priors to real-world discrete or structured problems.
- One could test whether replacing the Gaussian process with other generative models for synthetic tasks further widens the performance gap on non-smooth objectives.
- The same pretraining idea might reduce sample needs in sequential settings where new data points are acquired adaptively rather than given all at once.
Load-bearing premise
That synthetic tasks generated from a Gaussian process sufficiently capture the optimization bias (ranking ability) needed for effective fine-tuning on real small experimental datasets in target tasks.
What would settle it
A target benchmark whose objective landscape deviates strongly from Gaussian process assumptions, such as one with sharp discontinuities or extreme multimodality, where OptBias shows no improvement or degrades relative to a surrogate trained only on the small real dataset.
Original abstract
We consider the problem of offline black-box optimization, where the goal is to discover optimal designs (e.g., molecules or materials) from past experimental data. A key challenge in this setting is data scarcity: in many scientific applications, only small or poor-quality datasets are available, which severely limits the effectiveness of existing algorithms. Prior work has theoretically and empirically shown that performance of offline optimization algorithms depends on how well the surrogate model captures the optimization bias (i.e., ability to rank input designs correctly), which is challenging to accomplish with limited experimental data. This paper proposes Surrogate Learning with Optimization Bias via Synthetic Task Generation (OptBias), a meta-learning framework that directly tackles data scarcity. OptBias learns a reusable optimization bias by training on synthetic tasks generated from a Gaussian process, and then fine-tunes the surrogate model on the small data for the target task. Across diverse continuous and discrete offline optimization benchmarks, OptBias consistently outperforms state-of-the-art baselines in small data regimes. These results highlight OptBias as a robust and practical solution for offline optimization in realistic small data settings.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces OptBias, a meta-learning framework for offline black-box optimization from small datasets. It generates synthetic tasks from a Gaussian process prior to meta-train a surrogate model that learns an optimization bias (ranking ability over designs), then fine-tunes this surrogate on the limited target-task data. The approach is evaluated on diverse continuous and discrete benchmarks, where it reports consistent outperformance over state-of-the-art baselines in small-data regimes.
Significance. If the central transfer claim holds, the work offers a practical solution to data scarcity in scientific optimization tasks such as molecule and material design. It builds directly on prior theoretical results linking surrogate ranking performance to optimization success and provides a concrete mechanism (GP-based synthetic meta-tasks) to bootstrap that ranking ability before fine-tuning. The method is reproducible in principle via the described GP sampling and fine-tuning procedure, and the benchmark results, if statistically robust, would constitute a useful empirical advance for low-data offline optimization.
major comments (2)
- §3.2 (Synthetic Task Generation): The claim that GP-sampled tasks produce a transferable optimization bias rests on the untested assumption that GP smoothness and stationarity suffice to capture ranking behavior on real benchmarks that may contain discrete structure, sharp discontinuities, or non-stationarity. No theoretical bound or ablation is provided showing that the learned bias is not an artifact of the GP prior; if it is, fine-tuning alone could explain the reported gains, weakening the meta-learning contribution.
- §5 (Experiments): The reported outperformance in small-data regimes lacks ablations that isolate the meta-pretraining step from the fine-tuning step alone, and provides no statistical significance tests or error bars across multiple random seeds. Without these, it is impossible to determine whether the gains are attributable to the synthetic-task meta-learning or to variance in the small target datasets.
minor comments (2)
- Abstract and §1: The specific benchmarks (continuous and discrete) are not enumerated; listing them explicitly would improve reproducibility.
- Notation in §3: The distinction between the meta-trained surrogate parameters and the fine-tuned parameters is not always clear in the equations; a single consistent symbol table would help.
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive comments. We address each major comment below and indicate the revisions made to strengthen the manuscript.
- Referee: §3.2 (Synthetic Task Generation): The claim that GP-sampled tasks produce a transferable optimization bias rests on the untested assumption that GP smoothness and stationarity suffice to capture ranking behavior on real benchmarks that may contain discrete structure, sharp discontinuities, or non-stationarity. No theoretical bound or ablation is provided showing that the learned bias is not an artifact of the GP prior; if it is, fine-tuning alone could explain the reported gains, weakening the meta-learning contribution.
  Authors: We agree that the Gaussian process prior relies on assumptions of smoothness and stationarity that may not perfectly align with all real benchmarks exhibiting discrete structures or non-stationarities. Our empirical results across both continuous and discrete benchmarks show consistent transfer of the learned ranking bias, suggesting it captures practically useful optimization properties. To isolate the meta-learning contribution, we have added an ablation comparing OptBias to a fine-tuning-only baseline without synthetic-task pretraining. We do not provide a formal theoretical bound on transferability, as the work is primarily empirical in focus.
  Revision: partial
- Referee: §5 (Experiments): The reported outperformance on small-data regimes lacks ablations that isolate the meta-pretraining step from the fine-tuning step alone, and provides no statistical significance tests or error bars across multiple random seeds. Without these, it is impossible to determine whether the gains are attributable to the synthetic-task meta-learning or to variance in the small target datasets.
  Authors: We acknowledge these limitations in the original experimental design. In the revised manuscript, we have added error bars representing standard deviations across multiple random seeds and included statistical significance tests (paired t-tests) to support the reported improvements. We have also incorporated an ablation study that applies only fine-tuning on the target-task data, without the meta-pretraining on GP synthetic tasks, to demonstrate that the gains stem from the meta-learning step rather than fine-tuning or dataset variance alone.
  Revision: yes
Not addressed in the revision: a formal theoretical bound on the transferability of the optimization bias learned from GP synthetic tasks to real benchmarks with arbitrary discrete or non-stationary properties.
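The paired t-test mentioned in the rebuttal compares two methods seed by seed. A minimal sketch of the statistic follows, assuming both methods were run on the same seeds; the scores are placeholder values invented for illustration and are not results from the paper, and the helper name `paired_t_statistic` is hypothetical.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for paired samples.

    scores_a[i] and scores_b[i] must come from the same random seed.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1

# Placeholder per-seed scores, NOT taken from the paper.
with_pretraining = [0.62, 0.58, 0.65, 0.60, 0.63]
finetune_only = [0.55, 0.57, 0.59, 0.54, 0.60]

t_stat, dof = paired_t_statistic(with_pretraining, finetune_only)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

Converting the statistic to a p-value requires the Student t distribution with `dof` degrees of freedom; where SciPy is available, `scipy.stats.ttest_rel` returns both the statistic and the p-value. Pairing by seed removes seed-to-seed variance that would otherwise mask the effect of pretraining.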
Circularity Check
No significant circularity; method uses independent GP synthetic tasks
The paper generates synthetic tasks from a Gaussian process prior that is independent of the target offline datasets, meta-trains a surrogate to capture optimization bias on those tasks, and then fine-tunes on the small real data before evaluating on external benchmarks. No derivation step reduces a claimed prediction or result to a fitted quantity or self-referential definition by construction. The reference to prior work establishing the importance of optimization bias is not load-bearing for the central claims, which rest on the empirical transfer from GP-generated tasks to real benchmarks rather than on any self-citation chain or ansatz smuggled from prior author work.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Gaussian process models can generate synthetic tasks that capture the optimization bias relevant to real-world small experimental datasets.