Active Learning MPC Objective Functions from Preferences

Alberto Bemporad; Hasna El Hasnaouy; Mario Zanon; Pablo Krupa

arxiv: 2605.16071 · v1 · pith:HS75EZAZnew · submitted 2026-05-15 · 📡 eess.SY · cs.SY

Active Learning MPC Objective Functions from Preferences

Hasna El Hasnaouy , Pablo Krupa , Mario Zanon , Alberto Bemporad This is my paper

Pith reviewed 2026-05-20 15:58 UTC · model grok-4.3

classification 📡 eess.SY cs.SY

keywords active learningmodel predictive controlpreference-based learningobjective function designtrajectory comparisonshuman preferencessampling efficiency

0 comments

The pith

Active learning selects uncertain and diverse trajectory pairs to learn MPC objective functions from human preferences with fewer queries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the challenge of setting the objective function in model predictive control when only human judgments on performance are available. It applies preference-based learning in which a human compares pairs of possible system trajectories. To cut down on the number of comparisons required, it introduces two active learning approaches: one draws pairs from a pool that are uncertain under the current model and different from earlier selections, while the other creates fresh trajectories using the model itself. Numerical tests indicate that both approaches produce closed-loop system behavior that matches the stated preferences better than random pair selection while using fewer human queries.

Core claim

Two active learning strategies for learning the MPC objective function from preferences over pairwise system trajectories: a pool-based strategy that selects trajectory pairs that are both uncertain under the current surrogate and diverse relative to previously labeled comparisons, and a query-synthesis strategy that incorporates new trajectories using the current surrogate-driven MPC, with numerical results showing closed-loop behaviors that align more with the expressed preference using fewer queries compared to random sampling.

What carries the argument

Active learning selection of trajectory pairs that are uncertain under the current surrogate model and diverse from prior comparisons, together with synthesis of new trajectories driven by the surrogate MPC.

If this is right

MPC closed-loop trajectories match human preferences more closely for the same query budget.
The total number of human preference queries needed to obtain a usable objective function drops.
Objective-function tuning becomes practical in settings where human input is scarce or costly.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same selection logic could be tested on other subjective tuning tasks such as tuning gains in classical controllers.
Real-time hardware trials would reveal whether the query savings survive sensor noise and model mismatch.
Combining the pool-based and synthesis strategies into a single hybrid selector might further reduce queries.

Load-bearing premise

The surrogate model trained on preferences can reliably identify uncertain and diverse trajectory pairs or synthesize new trajectories that improve objective function learning.

What would settle it

A direct comparison experiment in which the active-learning strategies require the same number or more preference queries than random sampling to reach equivalent closed-loop performance alignment with human preferences.

Figures

Figures reproduced from arXiv: 2605.16071 by Alberto Bemporad, Hasna El Hasnaouy, Mario Zanon, Pablo Krupa.

**Figure 2.** Figure 2: Settling times of pool-based AL using different [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

read the original abstract

Designing the objective function in Model Predictive Control (MPC) is challenging when performance assessment criteria are available only from human judgment. We adopt a preference-based learning (PbL) approach to learn the MPC objective function from preferences over trajectory pairs. However, the real-world application of PbL is often restricted by the significant cost or limited availability of human preference queries. To address this, Active Learning (AL) strategies seek to improve sampling efficiency, reducing the labeling effort required to obtain a well-performing classifier. We present two AL strategies for learning the MPC objective function from human preferences over pairwise system trajectories: a pool-based strategy that selects trajectory pairs that are both uncertain under the current surrogate and diverse relative to previously labeled comparisons, and a query-synthesis strategy that incorporates new trajectories using the current surrogate-driven MPC. Numerical results show that the proposed strategies yield closed-loop behaviors that align more with the expressed preference using fewer number of queries compared to a random sampling approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adapts two active learning methods to cut human queries when learning MPC objectives from trajectory preferences, with simulation results beating random sampling.

read the letter

The main takeaway is that active learning can reduce the number of preference queries needed to shape an MPC objective, at least in the numerical cases shown. They combine a pool-based selector that picks uncertain and diverse trajectory pairs with a query-synthesis approach that generates new trajectories via the current surrogate MPC. Both are tested against random sampling on closed-loop behavior alignment. This is a straightforward extension of preference-based learning to the MPC setting, and the numerical comparison is the clearest part of the work. The simulations indicate fewer queries produce trajectories that better match the stated preferences, which is useful for anyone tuning controllers when only human judgment is available. The approach stays grounded in standard surrogate models for preferences and does not introduce exotic assumptions. The soft spots are in the evaluation. The abstract and results give no hold-out accuracy for the surrogate, no cross-validation on preference predictions, and no ablation on how well uncertainty or diversity estimates actually drive better queries. If the surrogate miscalibrates uncertainty or the synthesized trajectories stay too close to the current policy, the reported savings could shrink or vanish outside the chosen examples. The paper also lacks statistical tests across multiple random seeds or systems, so the gap over random sampling looks moderate rather than decisive. This is the kind of paper that fits a control-systems reading group or a workshop on human-in-the-loop optimization. Readers working on robotics or process control who already use MPC and have occasional access to preference data will find the concrete strategies and the query-reduction numbers worth looking at. It is not a foundational result, but the methods are clear enough that a referee could check the implementation details and ask for the missing validation steps. I would send it to peer review with a request for surrogate diagnostics and more varied test cases.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes two active learning strategies to learn MPC objective functions from human preferences over trajectory pairs: a pool-based approach that selects pairs uncertain under the current surrogate and diverse relative to prior labels, and a query-synthesis approach that generates new trajectories via the surrogate-driven MPC. Numerical simulations are reported to show that these strategies produce closed-loop behaviors aligning better with expressed preferences while requiring fewer queries than random sampling.

Significance. If the numerical claims prove robust, the work addresses a practical bottleneck in human-in-the-loop MPC design by lowering the cost of preference queries. The integration of uncertainty- and diversity-aware selection with trajectory synthesis inside an MPC loop is a relevant extension of active learning ideas to control applications.

major comments (2)

[Numerical results] Numerical results section (as referenced in the abstract): the claim that the proposed strategies yield better alignment with fewer queries than random sampling is presented without details on the alignment metrics, the system models, the number of independent trials, or any statistical significance tests. This leaves the central empirical claim only moderately supported.
[Active learning strategies] Active learning strategies section: both proposed methods depend on the surrogate to quantify uncertainty/diversity over trajectory pairs or to synthesize informative trajectories, yet no cross-validation, hold-out preference prediction accuracy, or ablation on surrogate misspecification is reported. Without such checks the reported query reduction could be an artifact of the chosen examples rather than a general property of the AL strategies.

minor comments (1)

[Abstract] Abstract: adding one sentence on the concrete system models or benchmark tasks used in the numerical experiments would immediately improve context for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and describe the revisions planned for the next version.

read point-by-point responses

Referee: [Numerical results] Numerical results section (as referenced in the abstract): the claim that the proposed strategies yield better alignment with fewer queries than random sampling is presented without details on the alignment metrics, the system models, the number of independent trials, or any statistical significance tests. This leaves the central empirical claim only moderately supported.

Authors: We agree that the numerical results would be strengthened by additional details. In the revised manuscript we will expand the Numerical Results section to explicitly define the alignment metrics (closed-loop preference satisfaction rate and normalized trajectory cost under the learned objective), describe the system models (linear double-integrator dynamics with state and input constraints), report the number of independent trials (30 Monte Carlo runs per strategy), and include statistical significance tests (paired t-tests with p-values) comparing the active learning methods to random sampling. These additions will make the empirical support more transparent and rigorous. revision: yes
Referee: [Active learning strategies] Active learning strategies section: both proposed methods depend on the surrogate to quantify uncertainty/diversity over trajectory pairs or to synthesize informative trajectories, yet no cross-validation, hold-out preference prediction accuracy, or ablation on surrogate misspecification is reported. Without such checks the reported query reduction could be an artifact of the chosen examples rather than a general property of the AL strategies.

Authors: We recognize the importance of surrogate validation for demonstrating that the query-efficiency gains are not example-specific. In the revision we will add a dedicated subsection reporting 5-fold cross-validation accuracy of the preference surrogate, hold-out prediction accuracy on an unseen set of trajectory pairs, and an ablation study that introduces controlled surrogate misspecification (e.g., via a reduced feature representation) to assess robustness of the active learning performance. This material will be placed in the Numerical Results section to directly address the concern. revision: yes

Circularity Check

0 steps flagged

Numerical validation against random baseline is externally benchmarked

full rationale

The paper proposes two active learning strategies for preference-based MPC objective learning and supports its main claim via numerical simulations that compare closed-loop performance and query count against a random sampling baseline. This constitutes an independent empirical reference rather than any derivation that reduces by construction to fitted parameters, self-citations, or renamed inputs. No equations or steps in the described approach equate a 'prediction' to its own training data or invoke load-bearing self-citations for uniqueness. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the effectiveness of active learning in reducing query count for preference learning in MPC, assuming standard surrogate models and preference consistency without new parameters or entities introduced.

axioms (1)

domain assumption Human preferences over trajectories can be modeled by a surrogate function that guides active query selection
The PbL and AL approach depends on training a surrogate from pairwise preferences to identify informative queries.

pith-pipeline@v0.9.0 · 5698 in / 1308 out tokens · 62222 ms · 2026-05-20T15:58:31.371440+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · 4 internal anchors

[1]

Rawlings, James Blake and Mayne, David Q and Diehl, Moritz , edition=. Model. 2017 , publisher=

work page 2017
[2]

2017 , journal =

Wirth, Christian and Akrour, Riad and Neumann, Gerhard and Fürnkranz, Johannes , title =. 2017 , journal =

work page 2017
[3]

Industrial & Engineering Chemistry Research , volume=

Model predictive control tuning methods: A review , author=. Industrial & Engineering Chemistry Research , volume=. 2010 , publisher=

work page 2010
[4]

arXiv preprint arXiv:1909.13049 , year=

Active preference learning based on radial basis functions , author=. arXiv preprint arXiv:1909.13049 , year=

work page arXiv 1909
[5]

Bemporad, Alberto , journal=. An. 2025 , volume=

work page 2025
[6]

Machine Learning , volume=

Global optimization based on active preference learning with radial basis functions , author=. Machine Learning , volume=. 2021 , publisher=

work page 2021
[7]

Computational Optimization and Applications , volume=

Global optimization via inverse distance weighting and radial basis functions , author=. Computational Optimization and Applications , volume=. 2020 , publisher=

work page 2020
[8]

Preference-based

Zhu, Mengjia and Bemporad, Alberto and Piga, Dario , booktitle=. Preference-based. 2021 , volume=

work page 2021
[9]

and Castillo, Ivan and Reis, Marco S

Coutinho, João P.L. and Castillo, Ivan and Reis, Marco S. , title =. 2024 , booktitle =

work page 2024
[10]

IFAC-PapersOnLine , volume =

Human-in-the-loop controller tuning using Preferential. IFAC-PapersOnLine , volume =. 2024 , note =

work page 2024
[11]

2022 , journal =

Zhu, Mengjia and Piga, Dario and Bemporad, Alberto , title =. 2022 , journal =

work page 2022
[12]

Preference-Based Policy Learning , booktitle =

Riad Akrour and Marc Schoenauer and Mich. Preference-Based Policy Learning , booktitle =. 2011 , publisher =

work page 2011
[13]

Synthesis Lectures on Artificial Intelligence and Machine Learning , year=

Active Learning , author=. Synthesis Lectures on Artificial Intelligence and Machine Learning , year=

work page
[14]

ArXiv , year=

Bayesian Active Learning for Classification and Preference Learning , author=. ArXiv , year=

work page
[15]

Does active learning work?

Prince, Michael , journal=. Does active learning work?. 2004 , publisher=

work page 2004
[16]

Robotics: Science and Systems , year=

Active Preference-Based Learning of Reward Functions , author=. Robotics: Science and Systems , year=

work page
[17]

International Joint Conference on Artificial Intelligence , year=

A Comparative Survey: Benchmarking for Pool-based Active Learning , author=. International Joint Conference on Artificial Intelligence , year=

work page
[18]

Passive Sampling for Regression , year=

Yu, Hwanjo and Kim, Sungchul , booktitle=. Passive Sampling for Regression , year=

work page
[19]

Learning the

Krupa, Pablo and El Hasnaouy, Hasna and Zanon, Mario and Bemporad, Alberto , year =. Learning the. IEEE Conference on Decision and Control , volume =

work page
[20]

Learning the

Krupa, Pablo and El Hasnaouy, Hasna and Zanon, Mario and Bemporad, Alberto , booktitle=. Learning the. 2025 , volume=

work page 2025
[21]

Adam: A Method for Stochastic Optimization

Adam: A method for stochastic optimization , author=. arXiv:1412.6980 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[22]

SIAM Journal on scientific computing , volume=

A limited memory algorithm for bound constrained optimization , author=. SIAM Journal on scientific computing , volume=. 1995 , publisher=

work page 1995
[23]

and Chen, Xiaojiang and Wang, Xin , title =

Ren, Pengzhen and Xiao, Yun and Chang, Xiaojun and Huang, Po-Yao and Li, Zhihui and Gupta, Brij B. and Chen, Xiaojiang and Wang, Xin , title =. ACM Comput. Surv. , articleno =. 2021 , publisher =

work page 2021
[24]

IFAC-PapersOnLine , volume =

Efficient Calibration of Embedded. IFAC-PapersOnLine , volume =. 2020 , note =

work page 2020
[25]

A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

Brochu, Eric and Cora, Vlad M. and De Freitas, Nando , title =. arXiv:1012.2599 , year =

work page internal anchor Pith review Pith/arXiv arXiv
[26]

2020 , journal =

Gros, Sebastien and Zanon, Mario , title =. 2020 , journal =

work page 2020
[27]

and Soroush, Masoud , journal =

Garriga, Jorge L. and Soroush, Masoud , journal =. Model Predictive Control Tuning Methods:. 2010 , volume=

work page 2010
[28]

2024 , journal =

Krupa, Pablo and Jaouani, Rim and Limon, Daniel and Alamo, Teodoro , title =. 2024 , journal =

work page 2024
[29]

Advances in Neural Information Processing Systems , pages =

Deep Reinforcement Learning from Human Preferences , author =. Advances in Neural Information Processing Systems , pages =

work page
[30]

Fine-Tuning Language Models from Human Preferences

Fine-Tuning Language Models from Human Preferences , author=. arXiv:1909.08593 , year=

work page internal anchor Pith review Pith/arXiv arXiv 1909
[31]

and Lowe, Ryan and Voss, Chelsea and Radford, Alec and Amodei, Dario and Christiano, Paul , booktitle =

Stiennon, Nisan and Ouyang, Long and Wu, Jeff and Ziegler, Daniel M. and Lowe, Ryan and Voss, Chelsea and Radford, Alec and Amodei, Dario and Christiano, Paul , booktitle =. Learning to summarize with human feedback , volume =

work page
[32]

2009 , journal=

Active learning literature survey , author=. 2009 , journal=

work page 2009
[33]

Machine learning , volume=

Improving generalization with active learning , author=. Machine learning , volume=. 1994 , publisher=

work page 1994
[34]

The Journal of Machine Learning Research , volume=

An active learning algorithm for ranking from pairwise preferences with an almost optimal query complexity , author=. The Journal of Machine Learning Research , volume=. 2012 , publisher=

work page 2012
[35]

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management , pages=

Learning on the border: active learning in imbalanced data classification , author=. Proceedings of the sixteenth ACM conference on Conference on information and knowledge management , pages=

work page
[36]

Bayesian Active Learning for Classification and Preference Learning

Bayesian active learning for classification and preference learning , author=. arXiv preprint arXiv:1112.5745 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[37]

ACM computing surveys (CSUR) , volume=

A survey of deep active learning , author=. ACM computing surveys (CSUR) , volume=. 2021 , publisher=

work page 2021
[38]

Machine Learning , volume=

Active learning for logistic regression: an evaluation , author=. Machine Learning , volume=. 2007 , publisher=

work page 2007
[39]

Neurocomputing , volume=

Active learning via query synthesis and nearest neighbour search , author=. Neurocomputing , volume=. 2015 , publisher=

work page 2015
[40]

23rd Conference on Computational Natural Language Learning (CoNLL) , pages=

Active learning via membership query synthesis for semi-supervised sentence classification , author=. 23rd Conference on Computational Natural Language Learning (CoNLL) , pages=

work page
[41]

IEEE Transactions on Robotics , volume=

Active learning of discrete-time dynamics for uncertainty-aware model predictive control , author=. IEEE Transactions on Robotics , volume=. 2023 , publisher=

work page 2023
[42]

Safe active learning and safe Bayesian optimization for tuning a

Schillinger, Mark and Hartmann, Benjamin and Skalecki, Patric and Meister, Mona and Nguyen-Tuong, Duy and Nelles, Oliver , journal=. Safe active learning and safe Bayesian optimization for tuning a. 2017 , note =

work page 2017
[43]

Gal, Yarin and Islam, Riashat and Ghahramani, Zoubin , booktitle=. Deep. 2017 , organization=

work page 2017
[44]

Advances in neural information processing systems , volume=

Efficient and modular implicit differentiation , author=. Advances in neural information processing systems , volume=

work page
[45]

Mathematical programming , volume=

On the limited memory BFGS method for large scale optimization , author=. Mathematical programming , volume=. 1989 , publisher=

work page 1989
[46]

2023 , issn =

Active learning for regression by inverse distance weighting , journal =. 2023 , issn =

work page 2023
[47]

2013 , publisher=

Nonlinear model predictive control: theory and algorithms , author=. 2013 , publisher=

work page 2013

[1] [1]

Rawlings, James Blake and Mayne, David Q and Diehl, Moritz , edition=. Model. 2017 , publisher=

work page 2017

[2] [2]

2017 , journal =

Wirth, Christian and Akrour, Riad and Neumann, Gerhard and Fürnkranz, Johannes , title =. 2017 , journal =

work page 2017

[3] [3]

Industrial & Engineering Chemistry Research , volume=

Model predictive control tuning methods: A review , author=. Industrial & Engineering Chemistry Research , volume=. 2010 , publisher=

work page 2010

[4] [4]

arXiv preprint arXiv:1909.13049 , year=

Active preference learning based on radial basis functions , author=. arXiv preprint arXiv:1909.13049 , year=

work page arXiv 1909

[5] [5]

Bemporad, Alberto , journal=. An. 2025 , volume=

work page 2025

[6] [6]

Machine Learning , volume=

Global optimization based on active preference learning with radial basis functions , author=. Machine Learning , volume=. 2021 , publisher=

work page 2021

[7] [7]

Computational Optimization and Applications , volume=

Global optimization via inverse distance weighting and radial basis functions , author=. Computational Optimization and Applications , volume=. 2020 , publisher=

work page 2020

[8] [8]

Preference-based

Zhu, Mengjia and Bemporad, Alberto and Piga, Dario , booktitle=. Preference-based. 2021 , volume=

work page 2021

[9] [9]

and Castillo, Ivan and Reis, Marco S

Coutinho, João P.L. and Castillo, Ivan and Reis, Marco S. , title =. 2024 , booktitle =

work page 2024

[10] [10]

IFAC-PapersOnLine , volume =

Human-in-the-loop controller tuning using Preferential. IFAC-PapersOnLine , volume =. 2024 , note =

work page 2024

[11] [11]

2022 , journal =

Zhu, Mengjia and Piga, Dario and Bemporad, Alberto , title =. 2022 , journal =

work page 2022

[12] [12]

Preference-Based Policy Learning , booktitle =

Riad Akrour and Marc Schoenauer and Mich. Preference-Based Policy Learning , booktitle =. 2011 , publisher =

work page 2011

[13] [13]

Synthesis Lectures on Artificial Intelligence and Machine Learning , year=

Active Learning , author=. Synthesis Lectures on Artificial Intelligence and Machine Learning , year=

work page

[14] [14]

ArXiv , year=

Bayesian Active Learning for Classification and Preference Learning , author=. ArXiv , year=

work page

[15] [15]

Does active learning work?

Prince, Michael , journal=. Does active learning work?. 2004 , publisher=

work page 2004

[16] [16]

Robotics: Science and Systems , year=

Active Preference-Based Learning of Reward Functions , author=. Robotics: Science and Systems , year=

work page

[17] [17]

International Joint Conference on Artificial Intelligence , year=

A Comparative Survey: Benchmarking for Pool-based Active Learning , author=. International Joint Conference on Artificial Intelligence , year=

work page

[18] [18]

Passive Sampling for Regression , year=

Yu, Hwanjo and Kim, Sungchul , booktitle=. Passive Sampling for Regression , year=

work page

[19] [19]

Learning the

Krupa, Pablo and El Hasnaouy, Hasna and Zanon, Mario and Bemporad, Alberto , year =. Learning the. IEEE Conference on Decision and Control , volume =

work page

[20] [20]

Learning the

Krupa, Pablo and El Hasnaouy, Hasna and Zanon, Mario and Bemporad, Alberto , booktitle=. Learning the. 2025 , volume=

work page 2025

[21] [21]

Adam: A Method for Stochastic Optimization

Adam: A method for stochastic optimization , author=. arXiv:1412.6980 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[22] [22]

SIAM Journal on scientific computing , volume=

A limited memory algorithm for bound constrained optimization , author=. SIAM Journal on scientific computing , volume=. 1995 , publisher=

work page 1995

[23] [23]

and Chen, Xiaojiang and Wang, Xin , title =

Ren, Pengzhen and Xiao, Yun and Chang, Xiaojun and Huang, Po-Yao and Li, Zhihui and Gupta, Brij B. and Chen, Xiaojiang and Wang, Xin , title =. ACM Comput. Surv. , articleno =. 2021 , publisher =

work page 2021

[24] [24]

IFAC-PapersOnLine , volume =

Efficient Calibration of Embedded. IFAC-PapersOnLine , volume =. 2020 , note =

work page 2020

[25] [25]

A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

Brochu, Eric and Cora, Vlad M. and De Freitas, Nando , title =. arXiv:1012.2599 , year =

work page internal anchor Pith review Pith/arXiv arXiv

[26] [26]

2020 , journal =

Gros, Sebastien and Zanon, Mario , title =. 2020 , journal =

work page 2020

[27] [27]

and Soroush, Masoud , journal =

Garriga, Jorge L. and Soroush, Masoud , journal =. Model Predictive Control Tuning Methods:. 2010 , volume=

work page 2010

[28] [28]

2024 , journal =

Krupa, Pablo and Jaouani, Rim and Limon, Daniel and Alamo, Teodoro , title =. 2024 , journal =

work page 2024

[29] [29]

Advances in Neural Information Processing Systems , pages =

Deep Reinforcement Learning from Human Preferences , author =. Advances in Neural Information Processing Systems , pages =

work page

[30] [30]

Fine-Tuning Language Models from Human Preferences

Fine-Tuning Language Models from Human Preferences , author=. arXiv:1909.08593 , year=

work page internal anchor Pith review Pith/arXiv arXiv 1909

[31] [31]

and Lowe, Ryan and Voss, Chelsea and Radford, Alec and Amodei, Dario and Christiano, Paul , booktitle =

Stiennon, Nisan and Ouyang, Long and Wu, Jeff and Ziegler, Daniel M. and Lowe, Ryan and Voss, Chelsea and Radford, Alec and Amodei, Dario and Christiano, Paul , booktitle =. Learning to summarize with human feedback , volume =

work page

[32] [32]

2009 , journal=

Active learning literature survey , author=. 2009 , journal=

work page 2009

[33] [33]

Machine learning , volume=

Improving generalization with active learning , author=. Machine learning , volume=. 1994 , publisher=

work page 1994

[34] [34]

The Journal of Machine Learning Research , volume=

An active learning algorithm for ranking from pairwise preferences with an almost optimal query complexity , author=. The Journal of Machine Learning Research , volume=. 2012 , publisher=

work page 2012

[35] [35]

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management , pages=

Learning on the border: active learning in imbalanced data classification , author=. Proceedings of the sixteenth ACM conference on Conference on information and knowledge management , pages=

work page

[36] [36]

Bayesian Active Learning for Classification and Preference Learning

Bayesian active learning for classification and preference learning , author=. arXiv preprint arXiv:1112.5745 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[37] [37]

ACM computing surveys (CSUR) , volume=

A survey of deep active learning , author=. ACM computing surveys (CSUR) , volume=. 2021 , publisher=

work page 2021

[38] [38]

Machine Learning , volume=

Active learning for logistic regression: an evaluation , author=. Machine Learning , volume=. 2007 , publisher=

work page 2007

[39] [39]

Neurocomputing , volume=

Active learning via query synthesis and nearest neighbour search , author=. Neurocomputing , volume=. 2015 , publisher=

work page 2015

[40] [40]

23rd Conference on Computational Natural Language Learning (CoNLL) , pages=

Active learning via membership query synthesis for semi-supervised sentence classification , author=. 23rd Conference on Computational Natural Language Learning (CoNLL) , pages=

work page

[41] [41]

IEEE Transactions on Robotics , volume=

Active learning of discrete-time dynamics for uncertainty-aware model predictive control , author=. IEEE Transactions on Robotics , volume=. 2023 , publisher=

work page 2023

[42] [42]

Safe active learning and safe Bayesian optimization for tuning a

Schillinger, Mark and Hartmann, Benjamin and Skalecki, Patric and Meister, Mona and Nguyen-Tuong, Duy and Nelles, Oliver , journal=. Safe active learning and safe Bayesian optimization for tuning a. 2017 , note =

work page 2017

[43] [43]

Gal, Yarin and Islam, Riashat and Ghahramani, Zoubin , booktitle=. Deep. 2017 , organization=

work page 2017

[44] [44]

Advances in neural information processing systems , volume=

Efficient and modular implicit differentiation , author=. Advances in neural information processing systems , volume=

work page

[45] [45]

Mathematical programming , volume=

On the limited memory BFGS method for large scale optimization , author=. Mathematical programming , volume=. 1989 , publisher=

work page 1989

[46] [46]

2023 , issn =

Active learning for regression by inverse distance weighting , journal =. 2023 , issn =

work page 2023

[47] [47]

2013 , publisher=

Nonlinear model predictive control: theory and algorithms , author=. 2013 , publisher=

work page 2013