pith. sign in

arxiv: 2606.25743 · v1 · pith:YJRPGJHBnew · submitted 2026-06-24 · 💻 cs.LG

Black-Box Assisted Regression: Phase Transitions and Minimax Optimality

Pith reviewed 2026-06-25 21:06 UTC · model grok-4.3

classification 💻 cs.LG
keywords black-box assisted regressionphase transitionminimax optimalitysafe residual estimatornonparametric regressionnegative transferfoundation models
0
0 comments X

The pith

In nonparametric regression assisted by a fixed black-box predictor, the minimax risk transitions at a critical closeness radius from the black-box error to the standard rate.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies regression where a learner observes labeled data and can query a fixed predictor f0, with the target f* known only to lie within L2 distance δ of f0 for unknown δ. It derives a finite-sample minimax characterization showing the risk switches behavior at δ_c(n) proportional to n to the power of minus beta over 2 beta plus d, taking the leading value min of delta squared and the usual nonparametric rate n to the power of minus 2 beta over 2 beta plus d. A Safe Residual Estimator is introduced that fits a correction to f0 starting from zero initialization and uses holdout validation to revert to f0 alone if the correction is unsupported, thereby matching the optimal leading term up to a small additive cost while avoiding negative transfer. This setup is relevant when foundation models serve as black-box predictors on limited labeled data, where blind reliance can degrade performance.

Core claim

In the black-box assisted nonparametric regression setting, samples are observed along with access to a fixed predictor f0, and the target satisfies an L2 closeness bound of radius δ to f0. The minimax risk admits the finite-sample characterization with a phase transition at δ_c(n) ≍ n^{-β/(2β+d)}, where the leading risk term equals min{δ², n^{-2β/(2β+d)}}. The Safe Residual Estimator learns a residual correction initialized at zero (so it begins equal to f0) and applies holdout selection to revert to f0 when the correction lacks validation support, attaining the leading minimax term up to an additive validation-selection cost.

What carries the argument

The Safe Residual Estimator, which initializes its residual correction at zero and uses holdout selection to decide whether to apply the correction or revert to the black-box predictor alone.

If this is right

  • When δ exceeds the critical radius the minimax risk equals δ², so the black-box alone is rate-optimal.
  • When δ lies below the critical radius the minimax risk equals the standard nonparametric rate n^{-2β/(2β+d)}.
  • The safe estimator matches the minimax leading term while guaranteeing it never exceeds the black-box risk.
  • The same residual-correction tradeoff appears in experiments on CIFAR-100 with CLIP and AG News with Qwen3-8B.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The phase-transition structure may extend to classification or other loss functions with analogous closeness assumptions.
  • The validation-selection overhead could be reduced by alternative model-selection procedures not analyzed here.
  • Similar safe-correction mechanisms might apply when multiple black-box predictors are available rather than one fixed f0.

Load-bearing premise

The unknown target function lies within some fixed but unknown L2 distance δ of the given black-box predictor.

What would settle it

Empirical risk measurements across a grid of δ values and sample sizes n that fail to exhibit the predicted switch in leading term from δ² to n^{-2β/(2β+d)} at the stated critical radius, or instances where the safe estimator performs worse than the black-box alone.

Figures

Figures reproduced from arXiv: 2606.25743 by Yan Zhou.

Figure 1
Figure 1. Figure 1: Theory in Action: The Phase Transition. (a) Mechanism: Unlike WEIGHTED ensembles (orange) that are constrained to a linear subspace spanned by the prior and data, our RESIDUAL estimator (blue) performs a non-linear correction, effectively capturing the complex residual r ∗ (x). (b) Risk Profile (Key Theoretical Result): We identify a critical threshold δc(n). Left of dashed line (Prior-Dominated): The blac… view at source ↗
Figure 2
Figure 2. Figure 2: Sample Efficiency on CIFAR-100. Comparison of test accuracy as the number of labeled samples n increases. SCRATCH (gray) and CONCAT (green) degrade severely in the few-shot regime. In contrast, our SAFE RESIDUAL estimator (blue) con￾sistently outperforms the strong WEIGHTED ensemble baseline (orange). Notably, at n = 2000, our method achieves a 2.5% gain over the weighted ensemble [PITH_FULL_IMAGE:figures… view at source ↗
Figure 3
Figure 3. Figure 3: Ablation studies. Left: the validation split is stable for ρ ∈ [0.15, 0.3]. Right: zero-initialization ensures a smooth departure from the prior f0. 23 [PITH_FULL_IMAGE:figures/full_fig_p023_3.png] view at source ↗
read the original abstract

Foundation models are often used as fixed black-box predictors for downstream tasks with limited labeled data, but their predictions may be biased and unsafe to trust blindly. We study this setting through black-box assisted nonparametric regression: a learner observes labeled samples and can query a fixed predictor $f_0$, while the target $f^*$ is close to $f_0$ in $L_2(P_X)$ up to an unknown radius $\delta$. We give a finite-sample minimax characterization showing a phase transition at $\delta_c(n) \asymp n^{-\beta/(2\beta+d)}$, with leading risk $\min\{\delta^2, n^{-2\beta/(2\beta+d)}\}$. We then analyze a Safe Residual Estimator: it learns a correction around $f_0$, initializes the residual head at zero so the initial predictor equals $f_0$, and uses holdout selection to revert to $f_0$ when the learned correction is not supported by validation data. Here, "safe" means avoiding negative transfer, i.e., performing worse than the black-box predictor alone. The estimator matches the leading minimax term up to an additive validation-selection cost. Synthetic regression experiments verify the predicted phase transition, while CIFAR-100 with CLIP and AG News with Qwen3-8B provide practice-facing evidence that the same residual-correction tradeoff is useful beyond the formal squared-loss regression setting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper analyzes black-box assisted nonparametric regression, where labeled samples are available and a fixed predictor f0 can be queried, under the model that the target f* satisfies ||f* - f0||_{L2(P_X)} ≤ δ for unknown δ. It derives a finite-sample minimax characterization with a phase transition at δ_c(n) ≍ n^{-β/(2β+d)} and leading risk min{δ², n^{-2β/(2β+d)}}. A Safe Residual Estimator is introduced that learns a correction to f0 (initialized at zero residual), uses holdout validation to select whether to apply the correction or revert to f0, and is shown to match the minimax leading term up to an additive validation cost. Synthetic experiments confirm the phase transition, and real-data examples (CIFAR-100 with CLIP, AG News with Qwen3-8B) illustrate utility beyond squared loss.

Significance. If the finite-sample minimax bounds and matching upper bound for the Safe Residual Estimator hold, the work supplies a precise theoretical account of when and how much a fixed black-box predictor can safely assist nonparametric regression. The explicit phase transition and the guarantee against negative transfer are directly relevant to foundation-model-assisted learning with limited labels. The matching of the leading minimax term (up to validation cost) and the provision of both synthetic verification and practice-facing experiments strengthen the contribution.

minor comments (3)
  1. The abstract refers to 'finite-sample minimax characterization' and 'matching the leading minimax term'; the precise statements of the lower and upper bounds (including dependence on β, d, n, and δ) should be stated explicitly in the introduction or theorem statements for immediate readability.
  2. Notation for the holdout selection procedure and the validation-selection cost should be introduced with a short display equation or algorithm box, as the current description leaves the exact form of the additive cost implicit.
  3. The real-data experiments are described only at high level; a brief table or paragraph clarifying the precise loss, the black-box model, and how δ is implicitly realized would help readers assess transfer beyond squared loss.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their careful summary of our work and for the positive assessment of its significance in providing a finite-sample minimax characterization and safe estimator for black-box assisted regression. No major comments were listed in the report, so we have no specific points requiring response or revision at this time. We remain available to address any additional questions or to incorporate clarifications if the editor or referee requests them.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The provided abstract and description present a standard finite-sample minimax analysis for nonparametric regression under an L2 closeness assumption to a fixed black-box predictor. The phase transition at δ_c(n) ≍ n^{-β/(2β+d)} and leading risk min{δ², n^{-2β/(2β+d)}} follow from conventional bias-variance and minimax lower-bound techniques in the field; the Safe Residual Estimator is defined explicitly via residual correction, zero initialization, and holdout selection, with performance guarantees stated relative to the minimax term plus an additive validation cost. No self-definitional equations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the given text. The derivation chain is self-contained against external nonparametric regression benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The analysis rests on standard nonparametric regression assumptions for the rate n^{-2β/(2β+d)}; δ is treated as an unknown parameter defining the closeness regime. No new entities are postulated.

free parameters (1)
  • δ
    Unknown radius measuring L2 closeness between f* and f0; determines the phase transition regime.
axioms (1)
  • domain assumption Target belongs to a nonparametric function class (e.g., Holder with smoothness β in dimension d) for which the standard minimax rate is n^{-2β/(2β+d)}.
    Invoked to obtain the nonparametric baseline rate against which the assisted risk is compared.

pith-pipeline@v0.9.1-grok · 5776 in / 1358 out tokens · 36572 ms · 2026-06-25T21:06:18.185718+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

90 extracted references · 10 canonical work pages

  1. [1]

    Self-training: A survey

    Amini, M.-R., Feofanov, V., Pauletto, L., Hadjadj, L., Devijver, \'E ., and Maximov, Y. Self-training: A survey. Neurocomputing, 616: 0 128904, 2025. doi:10.1016/j.neucom.2024.128904

  2. [2]

    and Bates, Stephen and Fannjiang, Clara and Jordan, Michael I

    Angelopoulos, A. N., Bates, S., Fannjiang, C., Jordan, M. I., and Zrnic, T. Prediction-powered inference. Science, 382 0 (6671): 0 669--674, 2023. doi:10.1126/science.adi6000

  3. [3]

    N., Duchi, J

    Angelopoulos, A. N., Duchi, J. C., and Zrnic, T. Ppi++: Efficient prediction-powered inference, 2024

  4. [4]

    A survey of cross-validation procedures for model selection , volume =

    Arlot, S. and Celisse, A. A survey of cross-validation procedures for model selection. Statistics Surveys, 4: 0 40--79, 2010. doi:10.1214/09-SS054

  5. [5]

    Qwen technical report, 2023

    Bai, J., Bai, S., Chu, Y., et al. Qwen technical report, 2023

  6. [6]

    A theory of label propagation for subpopulation shift

    Cai, T., Gao, R., Lee, J., and Lei, Q. A theory of label propagation for subpopulation shift. In Meila, M. and Zhang, T. (eds.), Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pp.\ 1170--1182. PMLR, 18--24 Jul 2021. URL https://proceedings.mlr.press/v139/cai21b.html

  7. [7]

    Cai, T. T. and Pu, H. Transfer learning for nonparametric regression: Non-asymptotic minimax analysis and adaptive procedure, 2024

  8. [8]

    URL https: //doi.org/10.1007/s10208-006-0196-8

    Caponnetto, A. and De Vito, E. Optimal rates for the regularized least-squares algorithm. Foundations of Computational Mathematics, 7 0 (3): 0 331--368, 2007. doi:10.1007/s10208-006-0196-8

  9. [9]

    and Caron, F

    Cortinovis, S. and Caron, F. Fab-ppi: Frequentist, assisted by bayes, prediction-powered inference, 2025

  10. [10]

    Local Polynomial Modelling and Its Applications

    Fan, J. Local Polynomial Modelling and Its Applications. Routledge, 1996. ISBN 9780203748725

  11. [11]

    A., Dhingra, B., Globerson, A., and Cohen, W

    Fisch, A., Maynez, J., Hofer, R. A., Dhingra, B., Globerson, A., and Cohen, W. W. Stratified prediction-powered inference for hybrid language model evaluation, 2024

  12. [12]

    Deep learning, volume 1

    Goodfellow, I., Bengio, Y., Courville, A., and Bengio, Y. Deep learning, volume 1. MIT press Cambridge, 2016

  13. [13]

    URLhttps://doi.org/10.1109/ICCV51070.2023.00387

    Gu, Z., Xu, C., Yang, J., and Cui, Z. Few-shot continual infomax learning. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 19167--19176, Paris, France, 2023. doi:10.1109/ICCV51070.2023.01761

  14. [14]

    The Elements of Statistical Learning

    Hastie, T., Tibshirani, R., and Friedman, J. The Elements of Statistical Learning. Springer-Verlag, 2001

  15. [15]

    Deep residual learning for image recognition

    He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

  16. [16]

    A., Maynez, J., Dhingra, B., Fisch, A., Globerson, A., and Cohen, W

    Hofer, R. A., Maynez, J., Dhingra, B., Fisch, A., Globerson, A., and Cohen, W. W. Bayesian prediction-powered inference, 2024

  17. [18]

    C., Andreadis, P., and Diochnos, D

    Kage, P., Rothenberger, J. C., Andreadis, P., and Diochnos, D. I. A review of pseudo-labeling for computer vision, 2025

  18. [19]

    Transformers are minimax optimal nonparametric in-context learners

    Kim, J., Nakamaki, T., and Suzuki, T. Transformers are minimax optimal nonparametric in-context learners. In Advances in Neural Information Processing Systems, volume 37, pp.\ 106667--106713, 2024

  19. [20]

    M., Lu, K., Zrnic, T., Wang, S., and Bates, S

    Kluger, D. M., Lu, K., Zrnic, T., Wang, S., and Bates, S. Prediction-powered inference with imputed covariates and nonuniform sampling, 2025

  20. [21]

    and Hinton, G

    Krizhevsky, A. and Hinton, G. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009

  21. [22]

    and Orabona, F

    Kuzborskij, I. and Orabona, F. Stability and hypothesis transfer learning. In Proceedings of the 30th International Conference on Machine Learning, volume 28, pp.\ 942--950. PMLR, 2013

  22. [23]

    Pseudo-label : The simple and efficient semi-supervised learning method for deep neural networks

    Lee, D.-H. Pseudo-label : The simple and efficient semi-supervised learning method for deep neural networks. In ICML 2013 Workshop on Challenges in Representation Learning, 2013

  23. [24]

    Tripartite weight-space ensemble for few-shot class-incremental learning

    Lee, J., Hayat, M., and Yun, S. Tripartite weight-space ensemble for few-shot class-incremental learning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pp.\ 15329--15338, 2025

  24. [25]

    Do we really need to access the source data? S ource hypothesis transfer for unsupervised domain adaptation

    Liang, J., Hu, D., and Feng, J. Do we really need to access the source data? S ource hypothesis transfer for unsupervised domain adaptation. In Proceedings of the 37th International Conference on Machine Learning, volume 119, pp.\ 6028--6039. PMLR, 2020

  25. [26]

    A comprehensive survey on test-time adaptation under distribution shifts

    Liang, J., He, R., and Tan, T. A comprehensive survey on test-time adaptation under distribution shifts. International Journal of Computer Vision, 133 0 (1): 0 31--64, 2025

  26. [27]

    and Reimherr, M

    Lin, H. and Reimherr, M. Model-robust and adaptive-optimal transfer learning for tackling concept shifts in nonparametric regression, 2025

  27. [28]

    C., and Oberst, M

    Mani, P., Xu, P., Lipton, Z. C., and Oberst, M. No free lunch: Non-asymptotic analysis of prediction-powered inference, 2025

  28. [29]

    Density Estimation via Model Selection, pp.\ 201--277

    Massart, P. Density Estimation via Model Selection, pp.\ 201--277. Springer Berlin Heidelberg, 2007. doi:10.1007/978-3-540-48503-2_7

  29. [30]

    Pan, S. J. and Yang, Q. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22 0 (10): 0 1345--1359, 2010. doi:10.1109/TKDE.2009.191

  30. [31]

    and Smola, A

    Sch \"o lkopf, B. and Smola, A. J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. The MIT Press, 2001

  31. [32]

    Siegel, J. W. Optimal approximation rates for deep ReLU neural networks on sobolev and besov spaces. Journal of Machine Learning Research, 24 0 (357): 0 1--52, 2023

  32. [33]

    Fixmatch: Simplifying semi-supervised learning with consistency and confidence

    Sohn, K., Berthelot, D., Carlini, N., et al. Fixmatch: Simplifying semi-supervised learning with consistency and confidence. In Advances in Neural Information Processing Systems, volume 33, pp.\ 596--608, 2020

  33. [34]

    Stone, C. J. Optimal global rates of convergence for nonparametric regression. The Annals of Statistics, 10 0 (4): 0 1040--1053, 1982

  34. [35]

    Tsybakov, A. B. Introduction to Nonparametric Estimation. Springer, 2009

  35. [36]

    Tent: Fully test-time adaptation by entropy minimization

    Wang, D., Shelhamer, E., Liu, S., Olshausen, B., and Darrell, T. Tent: Fully test-time adaptation by entropy minimization. In International Conference on Learning Representations, 2021

  36. [37]

    Continual test-time domain adaptation

    Wang, Q., Fink, O., Van Gool, L., and Dai, D. Continual test-time domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 7201--7211, 2022

  37. [38]

    Weiss, T

    Weiss, K., Khoshgoftaar, T. M., and Wang, D. A survey of transfer learning. Journal of Big Data, 3 0 (1): 0 9, 2016. doi:10.1186/s40537-016-0043-6

  38. [39]

    Xie, Q., Luong, M.-T., Hovy, E., and Le, Q. V. Self-training with noisy student improves imagenet classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 10687--10698, 2020

  39. [40]

    Qwen3 technical report, 2025

    Yang, A., Li, A., Yang, B., et al. Qwen3 technical report, 2025

  40. [41]

    On the optimal approximation of sobolev and besov functions using deep relu neural networks

    Yang, Y. On the optimal approximation of sobolev and besov functions using deep relu neural networks. Applied and Computational Harmonic Analysis, 79: 0 101797, 2025. doi:10.1016/j.acha.2025.101797

  41. [42]

    Unsupervised word sense disambiguation rivaling supervised methods

    Yarowsky, D. Unsupervised word sense disambiguation rivaling supervised methods. In 33rd Annual Meeting of the Association for Computational Linguistics, pp.\ 189--196, 1995

  42. [43]

    N., and Ma, T

    Zhang, H., Dauphin, Y. N., and Ma, T. Residual learning without normalization via better initialization. In International Conference on Learning Representations, 2019

  43. [44]

    and Cand \`e s, E

    Zrnic, T. and Cand \`e s, E. J. Cross-prediction-powered inference, 2024

  44. [45]

    Science , year =

    Prediction-powered inference , author =. Science , year =

  45. [46]

    2024 , eprint =

    PPI++: Efficient Prediction-Powered Inference , author =. 2024 , eprint =

  46. [47]

    2024 , eprint =

    Cross-Prediction-Powered Inference , author =. 2024 , eprint =

  47. [48]

    2024 , eprint =

    Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation , author =. 2024 , eprint =

  48. [49]

    2024 , eprint =

    Bayesian Prediction-Powered Inference , author =. 2024 , eprint =

  49. [50]

    2025 , eprint =

    FAB-PPI: Frequentist, Assisted by Bayes, Prediction-Powered Inference , author =. 2025 , eprint =

  50. [51]

    2025 , eprint =

    No Free Lunch: Non-Asymptotic Analysis of Prediction-Powered Inference , author =. 2025 , eprint =

  51. [52]

    2025 , eprint =

    Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling , author =. 2025 , eprint =

  52. [53]

    Proceedings of the 38th International Conference on Machine Learning , year =

    Learning Transferable Visual Models From Natural Language Supervision , author =. Proceedings of the 38th International Conference on Machine Learning , year =

  53. [54]

    2025 , eprint =

    Qwen3 Technical Report , author =. 2025 , eprint =

  54. [55]

    2023 , eprint =

    Qwen Technical Report , author =. 2023 , eprint =

  55. [56]

    Introduction to Nonparametric Estimation , author =

  56. [57]

    High-Dimensional Statistics: A Non-Asymptotic Viewpoint , author =

  57. [58]

    All of nonparametric statistics , author =

  58. [59]

    The Annals of Statistics , year =

    Optimal Global Rates of Convergence for Nonparametric Regression , author =. The Annals of Statistics , year =

  59. [60]

    Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , author =

  60. [61]

    2001 , pages =

    The Elements of Statistical Learning , author =. 2001 , pages =

  61. [62]

    Deep learning , author =

  62. [63]

    , journal =

    Siegel, Jonathan W. , journal =. Optimal Approximation Rates for Deep. 2023 , volume =

  63. [64]

    Advances in Neural Information Processing Systems , volume =

    Transformers are Minimax Optimal Nonparametric In-Context Learners , author =. Advances in Neural Information Processing Systems , volume =

  64. [65]

    Applied and Computational Harmonic Analysis , year =

    On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks , author =. Applied and Computational Harmonic Analysis , year =

  65. [66]

    2024 , eprint =

    Transfer Learning for Nonparametric Regression: Non-asymptotic Minimax Analysis and Adaptive Procedure , author =. 2024 , eprint =

  66. [67]

    2025 , eprint =

    Model-Robust and Adaptive-Optimal Transfer Learning for Tackling Concept Shifts in Nonparametric Regression , author =. 2025 , eprint =

  67. [68]

    Proceedings of the 30th International Conference on Machine Learning , year =

    Stability and Hypothesis Transfer Learning , author =. Proceedings of the 30th International Conference on Machine Learning , year =

  68. [69]

    Do We Really Need to Access the Source Data?

    Liang, Jian and Hu, Dapeng and Feng, Jiashi , booktitle =. Do We Really Need to Access the Source Data?. 2020 , pages =

  69. [70]

    ICML 2013 Workshop on Challenges in Representation Learning , year =

    Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks , author =. ICML 2013 Workshop on Challenges in Representation Learning , year =

  70. [71]

    33rd Annual Meeting of the Association for Computational Linguistics , year =

    Unsupervised Word Sense Disambiguation Rivaling Supervised Methods , author =. 33rd Annual Meeting of the Association for Computational Linguistics , year =

  71. [72]

    Advances in Neural Information Processing Systems , volume =

    FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence , author =. Advances in Neural Information Processing Systems , volume =

  72. [73]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , year =

    Self-Training With Noisy Student Improves ImageNet Classification , author =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , year =

  73. [74]

    Neurocomputing , year =

    Self-training: A survey , author =. Neurocomputing , year =

  74. [75]

    2025 , eprint =

    A Review of Pseudo-Labeling for Computer Vision , author =. 2025 , eprint =

  75. [76]

    International Journal of Computer Vision , year =

    A comprehensive survey on test-time adaptation under distribution shifts , author =. International Journal of Computer Vision , year =

  76. [77]

    International Conference on Learning Representations , year =

    Tent: Fully Test-Time Adaptation by Entropy Minimization , author =. International Conference on Learning Representations , year =

  77. [78]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , year =

    Continual Test-Time Domain Adaptation , author =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , year =

  78. [79]

    Statistics Surveys , year =

    A survey of cross-validation procedures for model selection , author =. Statistics Surveys , year =

  79. [80]

    Concentration Inequalities and Model Selection , year =

    Density Estimation via Model Selection , author =. Concentration Inequalities and Model Selection , year =

  80. [81]

    Foundations of Computational Mathematics , year =

    Optimal Rates for the Regularized Least-Squares Algorithm , author =. Foundations of Computational Mathematics , year =

Showing first 80 references.