Adaptive Pricing in Insurance: Generalized Linear Models and Gaussian Process Regression Approaches
Pith reviewed 2026-05-25 11:00 UTC · model grok-4.3
The pith
If insurance prices are chosen with suitably decreasing variability, maximum quasi-likelihood estimates converge to true parameter values and prices converge to the revenue-maximizing level.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the adaptive GLM design, if prices are chosen with suitably decreasing variability, the MQLE parameters eventually exist and converge to the correct values, which in turn implies that the sequence of chosen prices will also converge to the optimal price. The adaptive GP regression model samples demand and claims from Gaussian Processes and selects selling prices by the upper confidence bound rule. Both algorithms are analyzed under delayed claims.
What carries the argument
Maximum quasi-likelihood estimation (MQLE) inside an adaptive pricing loop whose price sequence is constructed with suitably decreasing variability.
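A minimal sketch of such a loop, under simplifying assumptions that are not taken from the paper: demand follows a Gaussian GLM with identity link (for which the quasi-likelihood equations reduce to ordinary least squares), claims are ignored, revenue is price times expected demand, and the price perturbation shrinks like t^(-1/4). The function name `run_adaptive_pricing` and all parameter values are hypothetical.

```python
import numpy as np

def run_adaptive_pricing(T=5000, a=10.0, b=-2.0, noise_sd=1.0,
                         p_min=0.5, p_max=5.0, seed=0):
    """Simulate pricing with a perturbation that decays like t**(-1/4).

    Assumed toy model: demand = a + b * price + Gaussian noise, so the
    revenue p * (a + b * p) is maximized at p* = -a / (2 * b).
    """
    rng = np.random.default_rng(seed)
    xtx = np.zeros((2, 2))   # running sum of x x^T with x = (1, price)
    xty = np.zeros(2)        # running sum of x * demand
    prices = []
    p_hat = 0.5 * (p_min + p_max)  # initial guess before any estimate exists
    for t in range(1, T + 1):
        # Exploration: alternate the sign of a perturbation that decays with t.
        delta = t ** -0.25
        sign = 1.0 if t % 2 == 0 else -1.0
        price = float(np.clip(p_hat + sign * delta, p_min, p_max))
        demand = a + b * price + noise_sd * rng.standard_normal()
        x = np.array([1.0, price])
        xtx += np.outer(x, x)
        xty += x * demand
        prices.append(price)
        # Estimation: the estimate exists only once the design has full rank.
        if np.linalg.cond(xtx) < 1e12:
            a_hat, b_hat = np.linalg.solve(xtx, xty)
            if b_hat < 0:  # estimated revenue is concave only for b_hat < 0
                p_hat = float(np.clip(-a_hat / (2.0 * b_hat), p_min, p_max))
    return p_hat, np.array(prices)
```

With the default toy parameters the revenue-maximizing price is 2.5, so the returned estimate can be compared against that value. A binary purchase model would replace the least-squares step with an iterative quasi-likelihood solver, and the estimate might then fail to exist in early rounds.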
If this is right
- The chosen prices converge to the revenue-maximizing price.
- The same convergence holds when claim information is received with delay.
- An alternative Gaussian process policy achieves comparable learning and exploitation balance without parametric assumptions.
- Regret, measured as expected revenue loss relative to the optimal price, is controlled through the exploration schedule.
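The regret notion in the last bullet can be written out as follows. The symbols are notation assumed here for illustration and may differ from the paper's: r(p) is the expected revenue at price p, p^* is a revenue-maximizing price, p_t is the price chosen in round t, and T is the horizon.

```latex
\[
  \mathrm{Regret}(T) \;=\; \sum_{t=1}^{T} \mathbb{E}\bigl[\, r(p^{*}) - r(p_t) \,\bigr],
  \qquad p^{*} \in \arg\max_{p} \, r(p).
\]
```

Each summand is nonnegative, so regret grows only in rounds where the chosen price is suboptimal. A policy whose prices converge to p^* has average regret per round tending to zero, provided r is continuous at p^* and bounded over the price range.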
Where Pith is reading between the lines
- The convergence argument may extend to other revenue-management domains that already use GLM demand models.
- Real-world insurance data could be used to test whether the required rate of variability reduction produces acceptable short-term revenue loss.
- Hybrid policies that switch from GP exploration to GLM exploitation once parameters stabilize could reduce long-run regret.
Load-bearing premise
Demand and claims follow generalized linear models and the pricing rule reduces variability at a rate sufficient for the estimates to exist and converge.
What would settle it
A sequence of observed demands and claims generated under a pricing policy whose variability does not decrease sufficiently, such that the MQLE estimates fail to converge or the prices fail to approach the revenue optimum.
Original abstract
We study the application of dynamic pricing to insurance. We view this as an online revenue management problem where the insurance company looks to set prices to optimize the long-run revenue from selling a new insurance product. We develop two pricing models: an adaptive Generalized Linear Model (GLM) and an adaptive Gaussian Process (GP) regression model. Both balance between exploration, where we choose prices in order to learn the distribution of demands & claims for the insurance product, and exploitation, where we myopically choose the best price from the information gathered so far. The performance of the pricing policies is measured in terms of regret: the expected revenue loss caused by not using the optimal price. As is commonplace in insurance, we model demand and claims by GLMs. In our adaptive GLM design, we use the maximum quasi-likelihood estimation (MQLE) to estimate the unknown parameters. We show that, if prices are chosen with suitably decreasing variability, the MQLE parameters eventually exist and converge to the correct values, which in turn implies that the sequence of chosen prices will also converge to the optimal price. In the adaptive GP regression model, we sample demand and claims from Gaussian Processes and then choose selling prices by the upper confidence bound rule. We also analyze these GLM and GP pricing algorithms with delayed claims. Although similar results exist in other domains, this is among the first works to consider dynamic pricing problems in the field of insurance. We also believe this is the first work to consider Gaussian Process regression in the context of insurance pricing. These initial findings suggest that online machine learning algorithms could be a fruitful area of future investigation and application in insurance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops two adaptive pricing policies for insurance products where demand and claims follow GLMs. The first uses maximum quasi-likelihood estimation (MQLE) under a pricing policy with suitably decreasing variability, proving eventual existence and consistency of the MQLE estimates and consequent convergence of prices to the revenue-maximizing price. The second employs Gaussian process regression with an upper-confidence-bound rule for price selection. Both are analyzed for regret, and the GLM and GP approaches are extended to the case of delayed claims. The work positions these as among the first applications of online learning methods to insurance pricing.
Significance. If the convergence and regret results are rigorously established, the paper makes a useful contribution by importing standard adaptive-design techniques for GLMs and introducing GP regression into insurance pricing, a domain where such dynamic, data-driven policies have received limited attention. The explicit link between controlled exploration (decreasing variability) and consistency of MQLE is a standard but practically relevant technical step; the GP-UCB analysis and delayed-claims extension broaden the scope.
major comments (2)
- [theoretical results on MQLE convergence] The central claim (abstract and theoretical results section) that a policy with suitably decreasing variability guarantees eventual existence and consistency of the MQLE, which in turn implies price convergence, rests on standard GLM adaptive-design conditions. The manuscript should state the precise rate condition on price variability (e.g., the summability or decay rate required for the design matrix to accumulate full rank) and supply the key steps or error bounds that close the implication from parameter consistency to price convergence.
- [Gaussian process regression model and regret analysis] For the GP regression approach, the regret analysis and UCB rule require explicit assumptions on the kernel, the sub-Gaussian noise, and the resulting regret bound (e.g., O(sqrt(T log T)) or similar). Without these, it is difficult to compare the GP policy's performance guarantees with the GLM policy or with existing UCB results in the literature.
minor comments (2)
- Notation for the decreasing-variability condition and for the delayed-claims observation process should be introduced once and used consistently; a short table summarizing the two policies (assumptions, estimation method, exploration mechanism, regret order) would aid readability.
- [Introduction] The abstract states that 'similar results exist in other domains'; adding two or three key references from the adaptive-design or online-learning literature would strengthen the positioning without lengthening the introduction.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments. The suggestions help clarify the theoretical conditions and assumptions. We address each major comment below and have revised the manuscript accordingly.
-
Referee: [theoretical results on MQLE convergence] The central claim (abstract and theoretical results section) that a policy with suitably decreasing variability guarantees eventual existence and consistency of the MQLE, which in turn implies price convergence, rests on standard GLM adaptive-design conditions. The manuscript should state the precise rate condition on price variability (e.g., the summability or decay rate required for the design matrix to accumulate full rank) and supply the key steps or error bounds that close the implication from parameter consistency to price convergence.
Authors: We agree that the precise rate condition and key steps should be stated explicitly for rigor. In the revised manuscript, we will add that the price variability sequence must satisfy sum_t var(p_t) = infinity (ensuring the design matrix accumulates full rank asymptotically, per standard adaptive GLM results such as those in Lai and Wei (1982)). We will include the key steps: (i) the MQLE exists eventually under this condition by showing the information matrix diverges; (ii) consistency follows from the martingale convergence theorem applied to the quasi-score; (iii) price convergence to the optimum follows from continuous mapping and a uniform bound on the price deviation |p_t - p*| <= C ||theta_hat_t - theta*||, with the error bound O(1/sqrt(t)) in probability. These additions close the implication without altering the main results. revision: yes
-
Referee: [Gaussian process regression model and regret analysis] For the GP regression approach, the regret analysis and UCB rule require explicit assumptions on the kernel, the sub-Gaussian noise, and the resulting regret bound (e.g., O(sqrt(T log T)) or similar). Without these, it is difficult to compare the GP policy's performance guarantees with the GLM policy or with existing UCB results in the literature.
Authors: We agree that explicit assumptions and the regret bound should be stated to facilitate comparison. In the revised manuscript, we will specify: the kernel is continuous and positive definite (e.g., squared exponential with length-scale l); observations are sub-Gaussian with variance proxy sigma^2; and under these, the GP-UCB policy achieves regret O(sqrt(T log T)) (following standard bounds from Srinivas et al. (2010) adapted to the insurance setting with delayed claims). This allows direct comparison to the GLM policy's O(T^{2/3}) regret and to the broader UCB literature. The analysis for delayed claims remains unchanged. revision: yes
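The GP-UCB rule named in this response can be sketched as follows. This is an illustration under assumptions that are not taken from the paper: a single Gaussian process is placed directly on revenue (rather than separate processes for demand and claims), the kernel is squared exponential with a fixed length-scale, the confidence multiplier `beta` is held constant, feedback is immediate, and the revenue curve `toy_revenue` is hypothetical.

```python
import numpy as np

def sq_exp_kernel(x, y, length_scale=0.5, signal_var=1.0):
    """Squared exponential kernel matrix between 1-D arrays x and y."""
    d = x[:, None] - y[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise_var):
    """Posterior mean and variance of a zero-mean GP on x_grid."""
    K = sq_exp_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = sq_exp_kernel(x_obs, x_grid)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = sq_exp_kernel(x_grid, x_grid).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def gp_ucb_pricing(revenue_fn, grid, T=60, noise_sd=0.05, beta=4.0, seed=0):
    """Choose prices on a grid by maximizing mean + sqrt(beta * variance)."""
    rng = np.random.default_rng(seed)
    x_obs = np.empty(0)
    y_obs = np.empty(0)
    for _ in range(T):
        if x_obs.size == 0:
            # Prior: zero mean, unit variance (the default signal_var).
            mean, var = np.zeros_like(grid), np.ones_like(grid)
        else:
            mean, var = gp_posterior(x_obs, y_obs, grid, noise_sd ** 2)
        ucb = mean + np.sqrt(beta * var)
        price = grid[np.argmax(ucb)]
        reward = revenue_fn(price) + noise_sd * rng.standard_normal()
        x_obs = np.append(x_obs, price)
        y_obs = np.append(y_obs, reward)
    mean, _ = gp_posterior(x_obs, y_obs, grid, noise_sd ** 2)
    return grid[np.argmax(mean)], x_obs

def toy_revenue(p):
    """Hypothetical concave revenue curve with its maximum of 1 at p = 2.5."""
    return p * (10.0 - 2.0 * p) / 12.5

grid = np.linspace(0.5, 5.0, 91)
best_price, chosen = gp_ucb_pricing(toy_revenue, grid)
```

A constant `beta` keeps the sketch short. The published GP-UCB analyses use a confidence multiplier that grows with the round index, so this version should not be read as carrying any stated regret guarantee.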
Circularity Check
No significant circularity identified
The paper's central derivation is a standard consistency result for MQLE under adaptive GLM designs with controlled exploration (decreasing price variability). The abstract states the implication directly as a theorem to be shown, notes that similar results exist in other domains, and relies on external GLM modeling conventions plus regret definitions rather than any self-referential fit, ansatz, or self-citation chain. No load-bearing step reduces by construction to the paper's own inputs or prior self-work; the argument is self-contained against external benchmarks in adaptive design literature.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Demand and claims are modeled by generalized linear models.
Reference graph
Works this paper leans on
-
[1]
R. L. Phillips, Pricing and revenue optimization, Stanford Business Books, Stanford, Calif., 2005
-
[2]
K. Talluri, G. van Ryzin, The theory and practice of revenue management, Springer, Boston, MA, 2005. URL https://doi.org/10.1007/b139000
-
[3]
J. A. Nelder, R. W. M. Wedderburn, Generalized linear models, Journal of the Royal Statistical Society 135 (3) (1972) 370–384. doi:10.2307/2344614. URL http://www.jstor.org/stable/2344614
-
[4]
P. McCullagh, J. A. Nelder, Generalized Linear Models, 2nd Edition, Chapman and Hall, London, 1989. URL http://www.utstat.toronto.edu/~brunner/oldclass/2201s11/readings/glmbook.pdf
-
[5]
T. L. Lai, H. Robbins, Adaptive design and stochastic approximation, The Annals of Statistics 7 (6) (1979) 1196–1221. doi:10.1214/aos/1176344840
-
[6]
T. Lai, H. Robbins, Consistency and asymptotic efficiency of slope estimates in stochastic approximation schemes, Z. Wahrscheinlichkeitstheorie verw. Gebiete 56 (3) (1981) 329–360. URL https://doi.org/10.1007/BF00536178
-
[7]
T. Lai, H. Robbins, Iterated least squares in multiperiod control, Advances in Applied Mathematics 3 (1) (1982) 50–73. URL https://doi.org/10.1016/S0196-8858(82)80005-5
-
[8]
T. L. Lai, C. Z. Wei, Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems, The Annals of Statistics 10 (1) (1982) 154–166. doi:10.1214/aos/1176345697
-
[9]
T. Lai, Stochastic approximation: invited paper, The Annals of Statistics 31 (2) (2003) 391–406. doi:10.1214/aos/1051027873
-
[10]
A. Den Boer, B. Zwart, Simultaneously learning and optimizing using controlled variance pricing, Management Science 60 (3) (2013) 770–783. URL https://doi.org/10.1287/mnsc.2013.1788
-
[11]
J. Močkus, Bayesian Approach to Global Optimization, Mathematics and its Applications, Springer, Netherlands, 1989. doi:10.1007/978-94-009-0909-0
-
[12]
J. Močkus, Bayesian approach to global optimization and application to multiobjective and constrained problems, Journal of Global Optimization 4 (4) (1994) 347–365. doi:10.1007/BF01099263
-
[14]
N. Srinivas, A. Krause, S. M. Kakade, M. Seeger, Information-theoretic regret bounds for gaussian process optimization in the bandit setting, IEEE Transactions on Information Theory 58 (5) (2012) 389–434. doi:10.1109/TIT.2011.2182033. URL https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6138914
-
[15]
C. E. Rasmussen, C. K. I. Williams, Gaussian processes for machine learning, Adaptive computation and machine learning, MIT Press, Cambridge, Mass, 2006
-
[16]
E. Brochu, V. M. Cora, N. de Freitas, A tutorial on bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning, CoRR abs/1012.2599. URL http://arxiv.org/abs/1012.2599
-
[17]
SwissRe, Life insurance in the digital age: fundamental transformation ahead, Swiss Re Sigma Report. URL http://www.biztositasiszemle.hu/files/201512/sigma6_2015_en.pdf
-
[18]
H. Bühlmann, Mathematical Methods in Risk Theory, Grundlehren der mathematischen Wissenschaften, Springer-Verlag Berlin Heidelberg, 2005. doi:10.1007/978-3-540-30711-2
-
[19]
C. McClenahan, Ratemaking, 4th Edition, Casualty Actuarial Society, 1984, Ch. 3, pp. 75–148
-
[20]
P. de Jong, G. Z. Heller, Generalized Linear Models for Insurance Data, Cambridge University Press, Cambridge, 2008. URL https://feb.kuleuven.be/public/u0017833/boek.pdf
-
[21]
L. A. Baxter, S. M. Coutts, G. A. F. Ross, Applications of linear models in motor insurance, in: 21st International Congress of Actuaries, Vol. 2, Elsevier, 1980, pp. 11–29
- [22]
-
[23]
R. A. Bailey, L. J. Simon, Two studies in automobile insurance ratemaking, ASTIN Bulletin: The Journal of the International Actuarial Association 1 (4) (1960) 192–217. doi:10.1017/S0515036100009569
-
[24]
M. David, Auto insurance premium calculation using generalized linear models, Procedia Economics and Finance 20 (2015) 147–156. doi:10.1016/S2212-5671(15)00059-3
-
[25]
E. Ohlsson, B. Johansson, Non-Life Insurance Pricing with Generalized Linear Models, Springer, Berlin, Heidelberg, 2010. URL https://link.springer.com/book/10.1007/978-3-642-10791-7
-
[26]
S. Haberman, A. E. Renshaw, Generalized linear models and actuarial science, Journal of the Royal Statistical Society 45 (4) (1996) 407–436. doi:10.2307/2988543
-
[27]
R. Kaas, M. Goovaerts, J. Dhaene, M. Denuit, Modern actuarial risk theory : using R, 2nd Edition, Springer, Berlin, 2009. doi:10.1007/978-3-540-70998-5
-
[28]
E. W. Frees, Regression modeling with actuarial and financial applications, International series on actuarial science, Cambridge University Press, Cambridge, 2010
-
[29]
M. V. Wüthrich, C. Buser, Data analytics for non-life insurance pricing, Swiss Finance Institute Research Paper No. 16-68. URL https://ssrn.com/abstract=2870308
-
[30]
G. C. Evans, The dynamics of monopoly, The American Mathematical Monthly 31 (2) (1924) 77–83. doi:10.2307/2300113. URL http://www.jstor.org/stable/2300113
-
[31]
G. C. Evans, Mathematical introduction to economics, McGraw-Hill, New York, 1930. URL http://hdl.handle.net/2027/uc1.b3427705
-
[32]
E. A. Greenleaf, The impact of reference price effects on the profitability of price promotions, Marketing Science 14 (1) (1995) 82–104. doi:10.1287/mksc.14.1.82. URL https://pubsonline.informs.org/doi/pdf/10.1287/mksc.14.1.82
-
[33]
P. Kopalle, A. Rao, J. Assuncao, Asymmetric reference price effects and dynamic pricing policies, Marketing Science 15 (1) (1996) 60–85. URL http://www.jstor.org/stable/184184
- [34]
-
[35]
Y. Aviv, G. Vulcano, Dynamic list pricing, in: The Oxford handbook of pricing management, Oxford University Press, UK, 2012, Ch. 23, pp. 522–58. doi:10.1093/oxfordhb/9780199543175.013.0023
-
[36]
A. Den Boer, Dynamic pricing and learning: Historical origins, current research, and new directions, Surveys in operations research and management science 20 (1) (2015) 1–18. doi:10.1016/j.sorms.2015.03.001.
-
[37]
M. V. Wüthrich, Non-life insurance: Mathematics & statistics (2017). URL http://dx.doi.org/10.2139/ssrn.2319328
-
[38]
R. W. M. Wedderburn, Quasi-likelihood functions, generalized linear models, and the gauss-newton method, Biometrika 61 (3) (1974) 439–447. doi:10.2307/2334725
-
[39]
T. W. Anderson, J. B. Taylor, Strong consistency of least squares estimates in normal linear regression, The Annals of Statistics 4 (4) (1976) 788–790. doi:10.1214/aos/1176343552
-
[40]
T. L. Lai, H. Robbins, C. Z. Wei, Strong consistency of least squares estimates in multiple regression, Proceedings of the National Academy of Sciences of the United States of America 75 (7) (1978) 343–
-
[41]
doi:10.1016/0047-259X(79)90093-9
-
[42]
K. Chen, I. Hu, Z. Ying, Strong consistency of maximum quasi-likelihood estimators in generalized linear models with fixed and adaptive designs, The Annals of Statistics 27 (4) (1999) 1155–1163. doi:10.1214/aos/1017938919
- [43]
-
[44]
P. Auer, N. Cesa-Bianchi, P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning 47 (2-3) (2002) 235–256. doi:10.1023/A:1013689704352
-
[45]
T. Lai, H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics 6 (1) (1985) 4–22. URL http://dx.doi.org/10.1016/0196-8858(85)90002-8
-
[46]
V. Dani, T. P. Hayes, S. M. Kakade, Stochastic linear optimization under bandit feedback, in: 21st Annual Conference on Learning Theory (COLT), 2008, pp. 355–366. URL http://colt2008.cs.helsinki.fi/papers/80-Dani.pdf
-
[47]
P. Rusmevichientong, J. N. Tsitsiklis, Linearly parameterized bandits, Mathematics of Operations Research 35 (2) (2010) 395–411. URL https://pubsonline.informs.org/doi/pdf/10.1287/moor.1100.0446
-
[48]
S. Filippi, O. Cappe, A. Garivier, C. Szepesvári, Parametric bandits: The generalized linear case, in: Advances in Neural Information Processing Systems 23 (NIPS 2010), 2010, pp. 586–594. URL https://sites.ualberta.ca/~szepesva/papers/GenLinBandits-NIPS2010.pdf
- [49]
-
[50]
R. Kleinberg, A. Slivkins, E. Upfal, Multi-armed bandits in metric spaces, in: Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, ACM, 2008, pp. 681–690. doi:10.1145/1374376.1374475. URL http://doi.acm.org/10.1145/1374376.1374475
-
[51]
S. Agrawal, N. Goyal, Analysis of thompson sampling for the multi-armed bandit problem, in: 25th Annual Conference on Learning Theory, Vol. 23, 2012, pp. 39.1–39.26. URL http://proceedings.mlr.press/v23/agrawal12/agrawal12.pdf
-
[52]
S. Agrawal, N. Goyal, Further optimal regret bounds for thompson sampling, in: 16th International Conference on Artificial Intelligence and Statistics (AISTATS), Vol. 31, 2013, pp. 90–107. URL http://proceedings.mlr.press/v31/agrawal13a.pdf
-
[53]
D. Russo, B. Van Roy, Learning to optimize via posterior sampling, Mathematics of Operations Research 39 (4) (2013) 1221–1243. URL https://pubsonline.informs.org/doi/pdf/10.1287/moor.2014.0650
-
[54]
J. Močkus, V. Tiesis, A. Zilinskas, On bayesian methods for seeking the extremum, in: Towards Global Optimization, 2nd Edition, Vol. 2, Elsevier Science Ltd, North Holland, Amsterdam, 1978, pp. 117–129
-
[55]
J. Snoek, H. Larochelle, R. P. Adams, Practical bayesian optimization of machine learning algorithms, in: F. Pereira, C. J. C. Burges, L. Bottou, K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25, Curran Associates, Inc., 2012, pp. 2951–2959. URL http://papers.nips.cc/paper/4522-practical-bayesian-optimization
-
[56]
G. Gallego, G. Van Ryzin, Optimal dynamic pricing of inventories with stochastic demand over finite horizons, Management Science 40 (8) (1994) 999–1020. URL http://www.jstor.org.manchester.idm.oclc.org/stable/2633090
- [57]
-
[58]
J. Harrison, N. Keskin, A. Zeevi, Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution, Management Science 58 (3) (2012) 570–586. URL https://doi.org/10.1287/mnsc.1110.1426
- [59]
-
[60]
R. Kleinberg, T. Leighton, The value of knowing a demand curve: bounds on regret for online posted-price auctions, in: Proceedings of the 44th IEEE Symposium on Foundations of Computer Science, IEEE, USA, 2003, pp. 594–605. doi:10.1109/SFCS.2003.1238232
-
[61]
E. Cope, Bayesian strategies for dynamic pricing in e-commerce, Naval Research Logistics (NRL) 54 (3) (2007) 265–281. doi:10.1002/nav.20204
-
[62]
P. Rusmevichientong, B. Van Roy, P. W. Glynn, A nonparametric approach to multiproduct pricing, Operations Research 54 (1) (2006) 82–98. doi:10.1287/opre.1050.0252
- [63]
-
[64]
O. Besbes, A. Zeevi, Blind network revenue management, Operations Research 60 (6) (2012) 1537–1550. URL https://pubsonline.informs.org/doi/pdf/10.1287/opre.1120.1103
-
[65]
M. Dudík, D. J. Hsu, S. Kale, N. Karampatziakis, J. Langford, L. Reyzin, T. Zhang, Efficient optimal learning for contextual bandits, in: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI), 2011, pp. 1–20. URL http://www.cs.columbia.edu/~djhsu/papers/amo.pdf
-
[66]
O. Chapelle, L. Li, An empirical evaluation of thompson sampling, in: J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 24, Curran Associates, Inc., 2012, pp. 2249–2257. URL https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/thompson.pdf
-
[67]
N. Cesa-Bianchi, C. Gentile, Y. Mansour, A. Minora, Delay and cooperation in nonstochastic bandits, Journal of Machine Learning Research 20 (17) (2016) 1–38. URL http://www.jmlr.org/papers/volume20/17-631/17-631.pdf
-
[68]
C. Pike-Burke, S. Agrawal, C. Szepesvari, S. Grunewalder, Bandits with delayed, aggregated anonymous feedback, in: Proceedings of Machine Learning (ICML), Vol. 80, 2018, pp. 4105–4113. URL http://proceedings.mlr.press/v80/pike-burke18a/pike-burke18a.pdf
-
[69]
A. Agarwal, J. C. Duchi, Distributed delayed stochastic optimization, in: Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS), NIPS’11, Curran Associates Inc., USA, 2011, pp. 2312–2320. URL https://papers.nips.cc/paper/4247-distributed-delayed-stochastic-optimization
-
[70]
T. Desautels, A. Krause, J. W. Burdick, Parallelizing exploration-exploitation tradeoffs in gaussian process bandit optimization, Journal of Machine Learning Research 15 (2014) 4053–4103. URL http://jmlr.org/papers/volume15/desautels14a/desautels14a.pdf
-
[71]
C. Vernade, O. Cappé, V. Perchet, Stochastic bandit models for delayed conversions, arXiv preprint abs/1706.09186. arXiv:1706.09186. URL http://arxiv.org/abs/1706.09186
-
[72]
P. Joulani, A. György, C. Szepesvári, Online learning under delayed feedback, in: Proceedings of the 30th International Conference on Machine Learning (ICML), Atlanta, Georgia, USA, 2013, pp. 1453–1461. URL http://proceedings.mlr.press/v28/joulani13.pdf
-
[73]
T. W. Anderson, J. B. Taylor, Strong consistency of least squares estimates in dynamic models, The Annals of Statistics 7 (3) (1979) 484–489. doi:10.1214/aos/1176344670
-
[74]
N. B. Keskin, A. Zeevi, Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies, Operations Research 62 (5) (2014) 1142–1167. URL https://doi.org/10.1287/opre.2014.1294
-
[75]
P. Joulani, A. György, C. Szepesvári, Delay-tolerant online convex optimization: Unified analysis and adaptive-gradient algorithms, in: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), Phoenix, Arizona, USA, 2016, pp. 1744–1750. URL https://sites.ualberta.ca/~pooria/publications/AAAI16-Extended.pdf
-
[76]
M. S. Bartlett, An inverse matrix adjustment arising in discriminant analysis, Annals of Mathematical Statistics 22 (1) (1951) 107–111. doi:10.1214/aoms/1177729698
-
[77]
J. J. Duistermaat, J. A. C. Kolk, Multidimensional Real Analysis I: Differentiation, Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge, Mass, 2004. doi:10.1017/CBO978051161671
-
[78]
J. Dugundji, Topology, Series in advanced mathematics, Allyn & Bacon, Boston, 1966
-
[79]
Y. S. Chow, Local convergence of martingales and the law of large numbers, The Annals of Mathematical Statistics 36 (2) (1965) 552–558. doi:10.1214/aoms/1177700166
-
[80]
D. Freedman, Another note on the borel-cantelli lemma and the strong law, with the poisson approximation as a by-product, Annals of Probability 1 (6) (1973) 910–925. doi:10.1214/aop/1176996800.