Adaptive Calibration in Non-Stationary Environments

Haipeng Luo; Junyan Liu; Lillian J. Ratliff

arxiv: 2605.11490 · v2 · pith:Z3MN3YHFnew · submitted 2026-05-12 · 💻 cs.LG · stat.ML

Adaptive Calibration in Non-Stationary Environments

Junyan Liu , Haipeng Luo , Lillian J. Ratliff This is my paper

Pith reviewed 2026-05-25 06:35 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords adaptive calibrationonline predictionnon-stationary environmentscalibration errorepoch schedulingnon-uniform partitioni.i.d. segmentsadversarial robustness

0 comments

The pith

Algorithms achieve calibration error bounds that adapt to unknown non-stationarity measured by K segments and C deviation without prior knowledge of either.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a suite of online prediction algorithms whose calibration error automatically adjusts to the unknown degree of non-stationarity in the environment. It establishes rates that match the optimal stationary i.i.d. case when there is one segment and no deviation, while recovering known adversarial bounds when the environment changes frequently or deviates substantially. This addresses the gap between conservative adversarial methods and methods that assume fixed distributions, by using an epoch-based scheduling approach. The algorithms require no input about the number of stationary segments or the deviation measure. A non-uniform partition of the prediction space supports the adaptation across multiple error measures.

Core claim

The central claim is that the algorithms attain Õ(min{√T+(TC)^{1/3},√(KT)}) for ℓ1 calibration error and Õ(min{(1+C)^{1/3},K}) for both ℓ2 and pseudo KL calibration error, where T is the horizon, K is the unknown number of i.i.d. segments, and C is the unknown minimal ℓ1 deviation of mean outcomes. These bounds hold without knowledge of K or C. The construction builds on prior work by introducing epoch-based scheduling together with a novel non-uniform partition of the prediction space that allocates finer resolution near the underlying ground truth.

What carries the argument

Epoch-based scheduling combined with a non-uniform partition of the prediction space that allocates finer resolution near the ground truth.

If this is right

The bounds reduce to optimal rates for stationary i.i.d. environments when C=0 and K=1.
The bounds recover existing adversarial guarantees when C and K are linear in T.
The same algorithms work for ℓ1, ℓ2, and pseudo KL calibration measures.
No prior knowledge of K or C is needed to attain the adaptive rates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The non-uniform partition technique could be tested for improving resolution in other online decision problems that involve quantizing predictions.
The adaptation mechanism suggests possible extensions to settings with continuous drift rather than abrupt segment changes.
Empirical runs on data with measured change points would clarify whether the theoretical interpolation appears in practice.
The approach may connect to regret bounds in other non-stationary online learning tasks that track similar segment counts.

Load-bearing premise

The non-stationarity of the environment is fully captured by the two unknown quantities K and C, allowing the algorithms to achieve the stated rates without prior knowledge of these quantities.

What would settle it

Run the algorithm on a constructed sequence with known fixed K and C, then check whether the observed calibration error exceeds the claimed bound by more than logarithmic factors over a range of T.

Figures

Figures reproduced from arXiv: 2605.11490 by Haipeng Luo, Junyan Liu, Lillian J. Ratliff.

**Figure 1.** Figure 1: Illustration of the non-uniform partition created by [PITH_FULL_IMAGE:figures/full_fig_p010_1.png] view at source ↗

read the original abstract

Making calibrated online predictions is a central challenge in modern AI systems. Much of the existing literature focuses on fully adversarial environments where outcomes may be arbitrary, leading to conservative algorithms that can perform suboptimally in more benign settings, such as when outcomes are nearly stationary. This gap raises a natural question: can we design online prediction algorithms whose calibration error automatically adapts to the degree of non-stationarity in the environment, smoothly interpolating between i.i.d. and adversarial regimes? We answer this question in the affirmative and develop a suite of algorithms that achieve adaptive calibration guarantees under multiple calibration measures. Specifically, with $T$ being the number of rounds, $K$ being the unknown number of i.i.d. segments of the environment, and $C\in[0,T]$ being another unknown non-stationary measure defined as the minimal $\ell_1$ deviation of the mean outcomes, our algorithms attain $\widetilde{O}(\min\{\sqrt{T}+(TC)^{\frac{1}{3}}, \sqrt{KT}\})$ for $\ell_1$ calibration error and $\widetilde{O}(\min\{(1+C)^{\frac{1}{3}}, K\})$ for both $\ell_2$ and pseudo KL calibration error. These bounds match the optimal rates in the stationary case ($C=0$ and $K=1$) and recover known guarantees in the fully adversarial regime ($C, K=\Omega(T)$). Our approach builds on and extends prior work [Hu et al., 2026, Luo et al., 2025], introducing an epoch-based scheduling together with a novel non-uniform partition of the prediction space that allocates finer resolution near the underlying ground truth.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper provides adaptive online calibration algorithms whose bounds interpolate between stationary and adversarial regimes using epoch scheduling and non-uniform partitioning.

read the letter

The central point is that these algorithms attain calibration guarantees that get better automatically as the environment becomes more stationary, without any prior knowledge of the number of segments or the amount of change. The work introduces an epoch-based scheduling rule together with a non-uniform partition of the prediction space. The partition gives finer resolution near the ground truth, which helps control the error terms. This combination produces the stated min bounds for l1 calibration error and the corresponding ones for l2 and pseudo KL. The rates match the known stationary optima when C equals zero and K equals one, and they recover the adversarial bounds when both parameters grow with T. That interpolation is the main contribution. The fact that the algorithms do not need to know K or C in advance is also practically relevant, since those quantities are usually unavailable. It builds directly on the two earlier papers by the same group, but the scheduling and partition steps look like the new elements that enable the adaptation. On the downside, the model of non-stationarity is quite specific: it assumes the environment consists of K i.i.d. blocks whose means differ by a total l1 amount C. Real data streams often have smoother drifts or other patterns that might not fit this exactly, so the practical scope could be narrower than the abstract suggests. Because the full derivations are not in the abstract, it is also not possible to check whether the third-root terms arise cleanly or whether they require additional technical conditions on the partition or the epoch lengths. This paper is aimed at people working on online learning theory, particularly those interested in calibration under distribution shift. A reader who follows the literature on adaptive online algorithms would get the most out of it. The result is specific enough and the approach is novel enough relative to the cited work that it deserves a serious referee, even if the proofs turn out to need some polishing. I would recommend sending it to peer review.

Referee Report

0 major / 0 minor

Summary. The manuscript develops a suite of online prediction algorithms for calibration that adapt to unknown non-stationarity in the environment. It defines K as the unknown number of i.i.d. segments and C as the minimal ℓ1 deviation of mean outcomes, and claims that the algorithms achieve Õ(min{√T + (TC)^{1/3}, √KT}) for ℓ1 calibration error and Õ(min{(1+C)^{1/3}, K}) for both ℓ2 and pseudo KL calibration error. These rates match the optimal stationary bounds when C=0 and K=1, and recover known adversarial guarantees when C,K=Ω(T). The approach extends prior work via epoch-based scheduling combined with a novel non-uniform partition of the prediction space that allocates finer resolution near the ground truth.

Significance. If the stated rates are achieved without knowledge of K or C, the work provides a meaningful advance by bridging stationary and adversarial calibration regimes with a single set of algorithms. The automatic adaptation is practically relevant for AI systems facing mixed environments. The epoch-based scheduling and non-uniform partition are presented as the key technical contributions extending Hu et al. (2026) and Luo et al. (2025). No machine-checked proofs or reproducible code are mentioned in the provided description.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their accurate summary of the manuscript and for recommending acceptance. The referee correctly identifies the adaptive rates and the technical extensions via epoch-based scheduling and non-uniform partitioning.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained via novel techniques

full rationale

The paper's central adaptive bounds are obtained via epoch-based scheduling and a novel non-uniform partition of the prediction space. These mechanisms are introduced as new contributions that enable interpolation between regimes without prior knowledge of K and C. The cited prior works (Hu et al. 2026, Luo et al. 2025) provide background but are not load-bearing for the new rates; the derivation does not reduce by construction to fitted inputs, self-definitions, or self-citation chains. Bounds are shown to match known extremes (C=0/K=1 and C,K=Ω(T)) through the new analysis rather than by renaming or smuggling ansatzes. No quoted step exhibits the required reduction to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract only; full details on parameters, axioms, and entities unavailable. No free parameters, invented entities, or non-standard axioms are mentioned in the provided text.

axioms (1)

domain assumption Standard online learning assumptions on bounded outcomes and convex loss
Implicit in any calibration guarantee for online prediction

pith-pipeline@v0.9.0 · 5837 in / 1236 out tokens · 25098 ms · 2026-05-25T06:35:37.890209+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Instance-Adaptive Online Multicalibration
cs.LG 2026-05 conditional novelty 8.0

A single online multicalibration algorithm adaptively refines a dyadic grid and achieves instance-dependent rates: O(T^{2/3}) worst-case, O(sqrt T) for marginal stochastic data, and O(sqrt(JT)) for J-piecewise station...
Instance-Adaptive Online Multicalibration
cs.LG 2026-05 unverdicted novelty 7.0

A single algorithm for online multicalibration achieves instance-adaptive rates by dynamically refining a dyadic prediction grid, recovering the worst-case Õ(T^{2/3}) bound and improving to Õ(√T) in marginal stochasti...

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

B asiok, P

J. B asiok, P. Gopalan, L. Hu, and P. Nakkiran. A unifying theory of distance from calibration. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, pages 1727--1740, 2023

work page 2023
[2]

L. Chen, H. Luo, and C.-Y. Wei. Impossible tuning made possible: A new expert algorithm and its applications. In Conference on Learning Theory, 2021

work page 2021
[3]

Dagan, C

Y. Dagan, C. Daskalakis, M. Fishelson, N. Golowich, R. Kleinberg, and P. Okoroafor. Breaking the T^ 2/3 barrier for sequential calibration. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, 2025

work page 2025
[4]

A. P. Dawid. The well-calibrated bayesian. Journal of the American statistical Association, 77 0 (379): 0 605--610, 1982

work page 1982
[5]

Fishelson, R

M. Fishelson, R. Kleinberg, P. Okoroafor, R. P. Leme, J. Schneider, and Y. Teng. Full swap regret and discretized calibration. arXiv preprint arXiv:2502.09332, 2025

work page arXiv 2025
[6]

calibeating

D. P. Foster and S. Hart. “calibeating”: Beating forecasters at their own game. Theoretical Economics, 18 0 (4): 0 1441--1474, 2023

work page 2023
[7]

D. P. Foster and R. V. Vohra. Asymptotic calibration. Biometrika, 85 0 (2): 0 379--390, 1998

work page 1998
[8]

Haghtalab, M

N. Haghtalab, M. Qiao, K. Yang, and E. Zhao. Truthfulness of calibration measures. In Advances in Neural Information Processing Systems, 2024

work page 2024
[9]

Hazan, A

E. Hazan, A. Agarwal, and S. Kale. Logarithmic regret algorithms for online convex optimization. Machine Learning, 69 0 (2): 0 169--192, 2007

work page 2007
[10]

Hu and Y

L. Hu and Y. Wu. Calibration error for decision making. arXiv preprint arXiv:2404.13503, 2024

work page arXiv 2024
[11]

L. Hu, H. Luo, S. Senapati, and V. Sharan. Efficient swap multicalibration of elicitable properties. Conference on Learning Theory, 2026

work page 2026
[12]

Instance-Adaptive Online Multicalibration

Z. Huang, J. Morgenstern, A. Roth, and C. J. Zhang. Instance-adaptive online multicalibration. arXiv preprint arXiv:2605.09273, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[13]

T. Jin, J. Liu, C. Rouyer, W. Chang, C.-Y. Wei, and H. Luo. No-regret online reinforcement learning with adversarial losses and transitions. In Advances in Neural Information Processing Systems, 2023

work page 2023
[14]

S. M. Kakade and D. P. Foster. Deterministic calibration and nash equilibrium. Journal of Computer and System Sciences, 74 0 (1): 0 115--130, 2008

work page 2008
[15]

Kleinberg, R

B. Kleinberg, R. P. Leme, J. Schneider, and Y. Teng. U-calibration: Forecasting for an unknown agent. In Conference on Learning Theory, 2023

work page 2023
[16]

J. Liu, H. Luo, Z. Zhang, and L. J. Ratliff. Online learning for uninformed markov games: Empirical nash-value regret and non-stationarity adaptation. In Conference on Learning Theory, 2026

work page 2026
[17]

H. Luo, S. Senapati, and V. Sharan. Optimal multiclass u-calibration error and beyond. Advances in Neural Information Processing Systems, 2024

work page 2024
[18]

H. Luo, S. Senapati, and V. Sharan. Simultaneous swap regret minimization via kl-calibration. In Advances in Neural Information Processing Systems, 2025

work page 2025
[19]

Qiao and G

M. Qiao and G. Valiant. Stronger calibration lower bounds via sidestepping. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, 2021

work page 2021
[20]

Qiao and L

M. Qiao and L. Zheng. On the distance from calibration in sequential prediction. In Conference on Learning Theory, 2024

work page 2024

[1] [1]

B asiok, P

J. B asiok, P. Gopalan, L. Hu, and P. Nakkiran. A unifying theory of distance from calibration. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, pages 1727--1740, 2023

work page 2023

[2] [2]

L. Chen, H. Luo, and C.-Y. Wei. Impossible tuning made possible: A new expert algorithm and its applications. In Conference on Learning Theory, 2021

work page 2021

[3] [3]

Dagan, C

Y. Dagan, C. Daskalakis, M. Fishelson, N. Golowich, R. Kleinberg, and P. Okoroafor. Breaking the T^ 2/3 barrier for sequential calibration. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, 2025

work page 2025

[4] [4]

A. P. Dawid. The well-calibrated bayesian. Journal of the American statistical Association, 77 0 (379): 0 605--610, 1982

work page 1982

[5] [5]

Fishelson, R

M. Fishelson, R. Kleinberg, P. Okoroafor, R. P. Leme, J. Schneider, and Y. Teng. Full swap regret and discretized calibration. arXiv preprint arXiv:2502.09332, 2025

work page arXiv 2025

[6] [6]

calibeating

D. P. Foster and S. Hart. “calibeating”: Beating forecasters at their own game. Theoretical Economics, 18 0 (4): 0 1441--1474, 2023

work page 2023

[7] [7]

D. P. Foster and R. V. Vohra. Asymptotic calibration. Biometrika, 85 0 (2): 0 379--390, 1998

work page 1998

[8] [8]

Haghtalab, M

N. Haghtalab, M. Qiao, K. Yang, and E. Zhao. Truthfulness of calibration measures. In Advances in Neural Information Processing Systems, 2024

work page 2024

[9] [9]

Hazan, A

E. Hazan, A. Agarwal, and S. Kale. Logarithmic regret algorithms for online convex optimization. Machine Learning, 69 0 (2): 0 169--192, 2007

work page 2007

[10] [10]

Hu and Y

L. Hu and Y. Wu. Calibration error for decision making. arXiv preprint arXiv:2404.13503, 2024

work page arXiv 2024

[11] [11]

L. Hu, H. Luo, S. Senapati, and V. Sharan. Efficient swap multicalibration of elicitable properties. Conference on Learning Theory, 2026

work page 2026

[12] [12]

Instance-Adaptive Online Multicalibration

Z. Huang, J. Morgenstern, A. Roth, and C. J. Zhang. Instance-adaptive online multicalibration. arXiv preprint arXiv:2605.09273, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[13] [13]

T. Jin, J. Liu, C. Rouyer, W. Chang, C.-Y. Wei, and H. Luo. No-regret online reinforcement learning with adversarial losses and transitions. In Advances in Neural Information Processing Systems, 2023

work page 2023

[14] [14]

S. M. Kakade and D. P. Foster. Deterministic calibration and nash equilibrium. Journal of Computer and System Sciences, 74 0 (1): 0 115--130, 2008

work page 2008

[15] [15]

Kleinberg, R

B. Kleinberg, R. P. Leme, J. Schneider, and Y. Teng. U-calibration: Forecasting for an unknown agent. In Conference on Learning Theory, 2023

work page 2023

[16] [16]

J. Liu, H. Luo, Z. Zhang, and L. J. Ratliff. Online learning for uninformed markov games: Empirical nash-value regret and non-stationarity adaptation. In Conference on Learning Theory, 2026

work page 2026

[17] [17]

H. Luo, S. Senapati, and V. Sharan. Optimal multiclass u-calibration error and beyond. Advances in Neural Information Processing Systems, 2024

work page 2024

[18] [18]

H. Luo, S. Senapati, and V. Sharan. Simultaneous swap regret minimization via kl-calibration. In Advances in Neural Information Processing Systems, 2025

work page 2025

[19] [19]

Qiao and G

M. Qiao and G. Valiant. Stronger calibration lower bounds via sidestepping. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, 2021

work page 2021

[20] [20]

Qiao and L

M. Qiao and L. Zheng. On the distance from calibration in sequential prediction. In Conference on Learning Theory, 2024

work page 2024