Tree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance
Recent scientific advances require complex experiment design, necessitating the meticulous tuning of many experiment parameters. The tree-structured Parzen estimator (TPE) is a widely used Bayesian optimization method in recent parameter tuning frameworks such as Hyperopt and Optuna. Despite its popularity, the role of each control parameter in TPE and the intuition behind the algorithm have not been discussed so far. The goal of this paper is to identify the role of each control parameter and its impact on parameter tuning through ablation studies on diverse benchmark datasets. The recommended setting derived from the ablation studies is shown to improve the performance of TPE. Our TPE implementation used in this paper is available at https://github.com/nabenabe0928/tpe/tree/single-opt. OptunaHub now provides our standalone TPE implementation at https://hub.optuna.org/samplers/tpe_tutorial/.
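For intuition, the core loop that TPE is generally described as following can be sketched as below. This is a minimal one-dimensional illustration, not the authors' implementation: the function names (`gaussian_kde_pdf`, `tpe_suggest`, `minimize`), the fixed bandwidth, and the default values of `gamma`, `n_candidates`, and `n_startup` are all assumptions chosen for brevity, and implementations such as the one in Optuna differ in many details.

```python
import math
import random


def gaussian_kde_pdf(x, centers, bandwidth):
    """Density at x of an equal-weight Gaussian mixture (a Parzen estimator)."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    return total / (norm * len(centers))


def tpe_suggest(xs, ys, low, high, gamma=0.25, n_candidates=24, rng=random):
    """Propose the next point for minimizing a 1-D objective on [low, high]."""
    # Split observations: the best gamma fraction is "good", the rest "bad".
    order = sorted(range(len(xs)), key=lambda i: ys[i])
    n_good = max(1, math.ceil(gamma * len(xs)))
    good = [xs[i] for i in order[:n_good]]
    bad = [xs[i] for i in order[n_good:]]

    # Assumption: a fixed bandwidth instead of a data-dependent heuristic.
    bandwidth = (high - low) / 10.0

    # Draw candidates from the good-group density l(x), clipped to the bounds.
    candidates = [
        min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
        for _ in range(n_candidates)
    ]

    def ratio(x):
        # Density ratio l(x) / g(x); larger means "more like the good group".
        l = gaussian_kde_pdf(x, good, bandwidth)
        g = gaussian_kde_pdf(x, bad, bandwidth) if bad else 1.0
        return l / (g + 1e-12)

    return max(candidates, key=ratio)


def minimize(objective, low, high, n_trials=60, n_startup=10, seed=0):
    """Random search for n_startup trials, then TPE-style suggestions."""
    rng = random.Random(seed)
    xs, ys = [], []
    for trial in range(n_trials):
        if trial < n_startup:
            x = rng.uniform(low, high)
        else:
            x = tpe_suggest(xs, ys, low, high, rng=rng)
        xs.append(x)
        ys.append(objective(x))
    best = min(range(n_trials), key=lambda i: ys[i])
    return xs[best], ys[best]


if __name__ == "__main__":
    best_x, best_y = minimize(lambda x: (x - 2.0) ** 2, -5.0, 5.0)
    print(best_x, best_y)
```

In this sketch, `gamma` sets how many observations count as "good" and `n_candidates` sets how many samples are ranked by the density ratio. These stand in for the kind of control parameters whose roles the abstract says the paper examines.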
Forward citations
Cited by 16 Pith papers
- ArgBench: Benchmarking LLMs on Computational Argumentation Tasks
  ArgBench unifies 33 existing datasets into a standardized benchmark for testing LLMs across 46 argumentation tasks and analyzes the impact of prompting techniques and model factors on performance.
- Toto 2.0: Time Series Forecasting Enters the Scaling Era
  Toto 2.0 is a family of open time series foundation models that demonstrates reliable scaling and sets new state-of-the-art results on three forecasting benchmarks.
- Concise and Logically Consistent Conformal Sets for Neuro-Symbolic Concept-Based Models
  COCOCO is a conformal framework for NeSy-CBMs that jointly conformalizes concepts and labels, reconciles them via deduction-abduction revision, and satisfies consistency, coverage, and conciseness while retaining dist...
- PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts
  PEML co-optimizes continuous prompts and low-rank adaptations to deliver up to 6.67% average accuracy gains over existing multi-task PEFT methods on GLUE, SuperGLUE, and other benchmarks.
- From Clever Hans to Scientific Discovery: Interpreting EEG Foundational Transformers with LRP
  LRP on EEG transformers reveals Clever Hans artifacts in motor imagery tasks and a recurring central electrode cluster as a candidate sensorimotor signature of arousal.
- Generative Flow Networks for Model Adaptation in Digital Twins of Natural Systems
  GFlowNets sample multiple valid mechanistic simulator configurations for digital twin adaptation, recovering main parameter regions and preserving uncertainty in a tomato model case study.
- FluidFlow: a flow-matching generative model for fluid dynamics surrogates on unstructured meshes
  FluidFlow uses conditional flow-matching with U-Net and DiT architectures to predict pressure and friction coefficients on airfoils and 3D aircraft meshes, outperforming MLP baselines with better generalization.
- PENEX: AdaBoost-Inspired Neural Network Regularization
  PENEX is a new formulation of the multi-class exponential loss for neural networks that supports first-order optimization and improves generalization in low-data regimes.
- A Leaf-Level Dataset for Soybean-Cotton Detection and Segmentation
  A new leaf-instance dataset for soybean-cotton detection and segmentation collected across growth stages and conditions from commercial farms is presented and validated with YOLOv11.
- Inferring identified hadron production in $pp$ collisions with physics-informed machine learning at the LHC
  A physics-informed neural network infers pT spectra of pi, K, p, Lambda, and Ks in unmeasured rapidity regions from PYTHIA8 pp collisions at 13.6 TeV, achieving 1.5-5.83% yield uncertainties while reproducing yield ra...
- ORTHOBO: Orthogonal Bayesian Hyperparameter Optimization
  OrthoBO introduces an orthogonal acquisition estimator subtracting an optimally weighted score-function control variate to reduce Monte Carlo variance, preserve the acquisition target, and improve ranking stability in...
- Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference
  PLENA introduces a co-designed system with three optimization pathways for long-context agentic LLM inference, claiming up to 2.23x throughput over A100 and 4.04x energy efficiency.
- Survival of the Cheapest: Cost-Aware Hardware Adaptation for Adversarial Robustness
  A decision-support framework applies AFT models to show Nvidia L4 GPUs yield 20% longer adversarial survival time at 75% lower cost than V100, with inference latency as the strongest robustness predictor.
- Optimizing Memory Allocation in Distributed Clusters with Predictive Modeling
  A quantile-regression ensemble with safety factor reduces under-allocated jobs from 4.17% to 2.89% and average overallocation from 148% to 44.51% on SAP build data.
- Data-Driven Reduction of Fault Location Errors in Onshore Wind Farm Collectors
  A Gated Residual Network correction model reduces fault location error by 76% in simulated onshore wind farm collector networks compared to state-of-the-art methods.
- Minimal Data, Maximum Clarity: A Heuristic for Explaining Optimization
  EZR combines active Naive Bayes sampling and decision-tree distillation to reach over 90% of best-known multi-objective optimization performance on 60 datasets while producing clearer explanations than LIME, SHAP or B...