Learning from weakly dependent data under Dobrushin's condition

Constantinos Daskalakis; Nishanth Dikkala; Siddhartha Jayanti; Yuval Dagan

arxiv: 1906.09247 · v1 · pith:WKEWDMQLnew · submitted 2019-06-21 · 💻 cs.LG · stat.ML

Learning from weakly dependent data under Dobrushin's condition

Yuval Dagan , Constantinos Daskalakis , Nishanth Dikkala , Siddhartha Jayanti This is my paper

Pith reviewed 2026-05-25 18:51 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords generalization boundsdependent dataDobrushin's conditionRademacher complexityVC dimensionlearning theoryweak dependencespatial data

0 comments

The pith

Under Dobrushin's condition, standard Gaussian and Rademacher complexities and VC dimension suffice to bound generalization error for weakly dependent data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that data drawn from a joint distribution satisfying Dobrushin's condition allows the same complexity measures used in the independent case to control generalization and learning. Generalization bounds hold up to constant factors relative to i.i.d. analogs, while learnability rates suffer only logarithmic degradation in sample size. This matters for network or spatial sampling where observations are dependent yet not arbitrarily coupled, so existing tools apply without new complexity notions.

Core claim

When the data distribution satisfies Dobrushin's condition, the Gaussian complexity, Rademacher complexity, and VC dimension of a hypothesis class are sufficient to bound its generalization error and to obtain learnability rates; these bounds degrade by only constant factors for generalization and by logarithmic factors in the training-set size compared with the corresponding i.i.d. bounds.

What carries the argument

Dobrushin's condition, a quantitative bound on the dependence between any pair of variables, which lets standard complexity measures carry the argument without modification.

If this is right

Generalization error bounds for any hypothesis class remain within constant factors of their i.i.d. versions.
Learnability rates degrade by at most logarithmic factors in the number of training points.
The same Gaussian, Rademacher, and VC-dimension quantities control performance for network and spatial data.
No new complexity measures are required beyond those already used for independent samples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Existing learning algorithms whose analysis relies only on Rademacher or VC bounds can be transferred to graph or lattice data without change, once Dobrushin's condition is verified.
Empirical checks of pairwise dependence strength on real sensor or social-network datasets could immediately certify that standard bounds apply.
The result suggests a route to relax Dobrushin's condition further while retaining the same complexity measures, by replacing the uniform pairwise bound with a decaying influence graph.

Load-bearing premise

The joint distribution of the observed variables must satisfy Dobrushin's condition that limits pairwise dependence.

What would settle it

A concrete counter-example would be a hypothesis class and a distribution obeying Dobrushin's condition for which the observed generalization gap grows faster than a constant multiple of the i.i.d. bound given by its Rademacher complexity.

read the original abstract

Statistical learning theory has largely focused on learning and generalization given independent and identically distributed (i.i.d.) samples. Motivated by applications involving time-series data, there has been a growing literature on learning and generalization in settings where data is sampled from an ergodic process. This work has also developed complexity measures, which appropriately extend the notion of Rademacher complexity to bound the generalization error and learning rates of hypothesis classes in this setting. Rather than time-series data, our work is motivated by settings where data is sampled on a network or a spatial domain, and thus do not fit well within the framework of prior work. We provide learning and generalization bounds for data that are complexly dependent, yet their distribution satisfies the standard Dobrushin's condition. Indeed, we show that the standard complexity measures of Gaussian and Rademacher complexities and VC dimension are sufficient measures of complexity for the purposes of bounding the generalization error and learning rates of hypothesis classes in our setting. Moreover, our generalization bounds only degrade by constant factors compared to their i.i.d. analogs, and our learnability bounds degrade by log factors in the size of the training set.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Dobrushin's condition lets the usual Rademacher, Gaussian, and VC tools give generalization bounds within constant factors of the i.i.d. case for spatial and network data.

read the letter

The main takeaway is that this paper shows standard complexity measures suffice under Dobrushin's condition, with generalization bounds degrading only by constants and sample complexity by log factors. It targets network and spatial sampling, which sits outside the ergodic time-series literature they cite. That distinction is the core new piece: they avoid new complexity notions and just control the dependence enough for existing arguments to carry over. The abstract states the claims cleanly and the assumption is quantitative, which is a plus. If the proofs handle the propagation from pairwise dependence bounds to the empirical process without extra looseness, this is a useful incremental result that lets people apply off-the-shelf tools to a common practical setting. The soft spot is that everything rests on how tightly the Dobrushin parameter enters the concentration steps; without seeing the derivations it is hard to judge whether the constants stay reasonable or blow up in high dimensions. The assumption itself may also be nontrivial to verify on real networks, though that is an application question rather than a flaw in the math. This is for statistical learning theorists working on dependent data. It is worth sending to peer review because the claim is focused, the setting is motivated, and the result would be citable if the details check out.

Referee Report

0 major / 2 minor

Summary. The paper extends statistical learning theory beyond i.i.d. samples to data drawn from distributions satisfying Dobrushin's condition (a quantitative pairwise dependence bound), motivated by network or spatial sampling rather than time series. It claims that the standard i.i.d. complexity measures—Gaussian complexity, Rademacher complexity, and VC dimension—remain sufficient to control generalization error and learning rates, with generalization bounds degrading only by constant factors relative to their i.i.d. counterparts and learnability (sample-complexity) bounds degrading by logarithmic factors in the training-set size.

Significance. If the central claims hold, the work is significant for providing a direct reduction from weakly dependent spatial/network data to existing i.i.d. complexity measures under an explicit, standard dependence condition, without introducing new dependence-adjusted complexities. This yields clean, usable bounds that differ from the i.i.d. case only by universal constants or log n factors, which is a strong positive for applications in spatial statistics and graph-structured data.

minor comments (2)

The abstract states that the bounds 'only degrade by constant factors' and 'log factors in the size of the training set,' but does not indicate whether these constants or log factors are made explicit or left as existential; if the former, a short table or remark comparing the precise prefactors to the i.i.d. case would strengthen the contribution.
The motivation section contrasts the setting with ergodic time-series work; a one-sentence pointer to the specific Dobrushin parameter range that excludes typical mixing conditions used in that literature would clarify the scope.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work and the recommendation for minor revision. The referee's summary accurately captures the paper's focus on generalization and learnability under Dobrushin's condition for weakly dependent spatial/network data, with bounds that degrade only by constants or log factors relative to the i.i.d. case.

Circularity Check

0 steps flagged

No significant circularity; derivation applies standard complexity measures under explicit quantitative assumption

full rationale

The paper's central claim is that Dobrushin's condition (a quantitative bound on pairwise dependence) allows the standard i.i.d. Gaussian/Rademacher complexities and VC dimension to control generalization and learning rates, with only constant-factor degradation in bounds and log-factor degradation in sample complexity. The abstract states this extension directly without any self-definitional steps, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or derivation steps reduce the claimed bounds to the inputs by construction; the dependence control is an external assumption that propagates to the empirical process. The result is therefore self-contained against external benchmarks and receives score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard concentration and complexity tools from learning theory plus the domain assumption of Dobrushin's condition; no free parameters, invented entities, or ad-hoc axioms are introduced in the abstract.

axioms (1)

domain assumption Dobrushin's condition holds for the data-generating distribution
Invoked to control dependence so that standard complexity measures apply.

pith-pipeline@v0.9.0 · 5740 in / 1074 out tokens · 26155 ms · 2026-05-25T18:51:45.485045+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 4 internal anchors

[1]

The generalization ability of online algorithms for dependent data

Alekh Agarwal and John C Duchi. The generalization ability of online algorithms for dependent data. IEEE Transactions on Information Theory, 59 0 (1): 0 573--587, 2013

work page 2013
[2]

Rademacher and gaussian complexities: Risk bounds and structural results

Peter L Bartlett and Shahar Mendelson. Rademacher and gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3 0 (Nov): 0 463--482, 2002

work page 2002
[3]

On the boundedness of bernoulli processes

Witold Bednorz and Rafal Latala. On the boundedness of bernoulli processes. Annals of Mathematics, 180 0 (3): 0 1167--1203, 2014

work page 2014
[4]

Rate of convergence of predictive distributions for dependent data

Patrizia Berti, Irene Crimaldi, Luca Pratelli, Pietro Rigo, et al. Rate of convergence of predictive distributions for dependent data. Bernoulli, 15 0 (4): 0 1351--1367, 2009

work page 2009
[5]

Network effects and welfare cultures

Marianne Bertrand, Erzo FP Luttmer, and Sendhil Mullainathan. Network effects and welfare cultures. The Quarterly Journal of Economics, 115 0 (3): 0 1019--1055, 2000

work page 2000
[6]

Identification of peer effects through social networks

Yann Bramoull \'e , Habiba Djebbari, and Bernard Fortin. Identification of peer effects through social networks. Journal of econometrics, 150 0 (1): 0 41--55, 2009

work page 2009
[7]

Concentration Inequalities with Exchangeable Pairs

Sourav Chatterjee. Concentration Inequalities with Exchangeable Pairs. PhD thesis, Stanford University, June 2005 a

work page 2005
[8]

Concentration inequalities with exchangeable pairs (Ph.D. thesis)

Sourav Chatterjee. Concentration inequalities with exchangeable pairs (ph. d. thesis). arXiv preprint math/0507526, 2005 b

work page internal anchor Pith review Pith/arXiv arXiv 2005
[9]

Social contagion theory: examining dynamic social networks and human behavior

Nicholas A Christakis and James H Fowler. Social contagion theory: examining dynamic social networks and human behavior. Statistics in medicine, 32 0 (4): 0 556--577, 2013

work page 2013
[10]

Testing I sing models

Constantinos Daskalakis, Nishanth Dikkala, and Gautam Kamath. Testing I sing models. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '18, Philadelphia, PA, USA, 2018. SIAM

work page 2018
[11]

On statistical learning via the lens of compression

Ofir David, Shay Moran, and Amir Yehudayoff. On statistical learning via the lens of compression. In Proceedings of the 30th International Conference on Neural Information Processing Systems, pages 2792--2800. Curran Associates Inc., 2016

work page 2016
[12]

The description of a random field by means of conditional probabilities and conditions of its regularity

PL Dobrushin. The description of a random field by means of conditional probabilities and conditions of its regularity. Theory of Probability & Its Applications, 13 0 (2): 0 197--224, 1968

work page 1968
[13]

Completely analytical interactions: constructive description

RL Dobrushin and SB Shlosman. Completely analytical interactions: constructive description. Journal of Statistical Physics, 46 0 (5-6): 0 983--1014, 1987

work page 1987
[14]

The role of information and social interactions in retirement plan decisions: Evidence from a randomized experiment

Esther Duflo and Emmanuel Saez. The role of information and social interactions in retirement plan decisions: Evidence from a randomized experiment. The Quarterly journal of economics, 118 0 (3): 0 815--842, 2003

work page 2003
[15]

Boosting a weak learning algorithm by majority

Yoav Freund. Boosting a weak learning algorithm by majority. Information and computation, 121 0 (2): 0 256--285, 1995

work page 1995
[16]

Concentration inequalities for polynomials of contracting Ising models

Reza Gheissari, Eyal Lubetzky, and Yuval Peres. Concentration inequalities for polynomials of contracting I sing models. arXiv preprint arXiv:1706.00121, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[17]

Crime and social interactions

Edward L Glaeser, Bruce Sacerdote, and Jose A Scheinkman. Crime and social interactions. The Quarterly Journal of Economics, 111 0 (2): 0 507--548, 1996

work page 1996
[18]

Waves i & ii, 1994--1996; wave iii, 2001--2002; wave iv, 2007--2009 [machine-readable data file and documentation]

Kathleen Mullan Harris, National Longitudinal Study of Adolescent Health, et al. Waves i & ii, 1994--1996; wave iii, 2001--2002; wave iv, 2007--2009 [machine-readable data file and documentation]. Chapel Hill, NC: Carolina Population Center, University of North Carolina at Chapel Hill, 10, 2009

work page 1994
[19]

Concentration inequalities for functions of gibbs fields with application to diffraction and random gibbs measures

Christof K \"u lske. Concentration inequalities for functions of gibbs fields with application to diffraction and random gibbs measures. Communications in mathematical physics, 239 0 (1-2): 0 29--51, 2003

work page 2003
[20]

Decay of correlations under dobrushin's uniqueness condition and its applications

H K \"u nsch. Decay of correlations under dobrushin's uniqueness condition and its applications. Communications in Mathematical Physics, 84 0 (2): 0 207--222, 1982

work page 1982
[21]

Generalization bounds for time series prediction with non-stationary processes

Vitaly Kuznetsov and Mehryar Mohri. Generalization bounds for time series prediction with non-stationary processes. In Peter Auer, Alexander Clark, Thomas Zeugmann, and Sandra Zilles, editors, Algorithmic Learning Theory, pages 260--274, Cham, 2014. Springer International Publishing. ISBN 978-3-319-11662-4

work page 2014
[22]

Learning theory and algorithms for forecasting non-stationary time series

Vitaly Kuznetsov and Mehryar Mohri. Learning theory and algorithms for forecasting non-stationary time series. In Advances in neural information processing systems, pages 541--549, 2015

work page 2015
[23]

Generalization bounds for non-stationary mixing processes

Vitaly Kuznetsov and Mehryar Mohri. Generalization bounds for non-stationary mixing processes. Machine Learning, 106 0 (1): 0 93--117, Jan 2017. doi:10.1007/s10994-016-5588-2. URL https://doi.org/10.1007/s10994-016-5588-2

work page doi:10.1007/s10994-016-5588-2 2017
[24]

Levin, Yuval Peres, and Elizabeth L

David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. M arkov Chains and Mixing Times . American Mathematical Society, 2009

work page 2009
[25]

Prediction models for network-linked data

Tianxi Li, Elizaveta Levina, and Ji Zhu. Prediction models for network-linked data. arXiv preprint arXiv:1602.01192, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[26]

Relating data compression and learnability

Nick Littlestone and Manfred Warmuth. Relating data compression and learnability. 1986

work page 1986
[27]

Identification of endogenous social effects: The reflection problem

Charles F Manski. Identification of endogenous social effects: The reflection problem. The review of economic studies, 60 0 (3): 0 531--542, 1993

work page 1993
[28]

Bounding d -distance by informational divergence: A method to prove measure concentration

Katalin Marton et al. Bounding d -distance by informational divergence: A method to prove measure concentration. The Annals of Probability, 24 0 (2): 0 857--866, 1996

work page 1996
[29]

Rademacher complexity of stationary sequences

Daniel J. McDonald and Cosma Rohilla Shalizi. Rademacher complexity of stationary sequences. arXiv preprint arXiv:1106.0730, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[30]

Rademacher complexity bounds for non-iid processes

Mehryar Mohri and Afshin Rostamizadeh. Rademacher complexity bounds for non-iid processes. In Advances in Neural Information Processing Systems, pages 1097--1104, 2009

work page 2009
[31]

Stability bounds for stationary -mixing and -mixing processes

Mehryar Mohri and Afshin Rostamizadeh. Stability bounds for stationary -mixing and -mixing processes. Journal of Machine Learning Research, 11 0 (Feb): 0 789--814, 2010

work page 2010
[32]

The spread of innovations in social networks

Andrea Montanari and Amin Saberi. The spread of innovations in social networks. Proceedings of the National Academy of Sciences, 107 0 (47): 0 20196--20201, 2010. ISSN 0027-8424. doi:10.1073/pnas.1004098107. URL https://www.pnas.org/content/107/47/20196

work page doi:10.1073/pnas.1004098107 2010
[33]

Sample compression schemes for vc classes

Shay Moran and Amir Yehudayoff. Sample compression schemes for vc classes. Journal of the ACM (JACM), 63 0 (3): 0 21, 2016

work page 2016
[34]

Predictive pac learnability: A paradigm for learning from exchangeable input data

Vladimir Pestov. Predictive pac learnability: A paradigm for learning from exchangeable input data. In Granular Computing (GrC), 2010 IEEE International Conference on, pages 387--391. IEEE, 2010

work page 2010
[35]

Online learning: Random averages, combinatorial parameters, and learnability

Alexander Rakhlin, Karthik Sridharan, and Ambuj Tewari. Online learning: Random averages, combinatorial parameters, and learnability. In Advances in Neural Information Processing Systems, pages 1984--1992, 2010

work page 1984
[36]

Peer effects with random assignment: Results for dartmouth roommates

Bruce Sacerdote. Peer effects with random assignment: Results for dartmouth roommates. The Quarterly journal of economics, 116 0 (2): 0 681--704, 2001

work page 2001
[37]

Understanding machine learning: From theory to algorithms

Shai Shalev-Shwartz and Shai Ben-David. Understanding machine learning: From theory to algorithms. Cambridge university press, 2014

work page 2014
[38]

The logarithmic sobolev inequality for discrete spin systems on a lattice

Daniel W Stroock and Boguslaw Zegarlinski. The logarithmic sobolev inequality for discrete spin systems on a lattice. Communications in Mathematical Physics, 149 0 (1): 0 175--193, 1992

work page 1992
[39]

Majorizing measures: the generic chaining

Michel Talagrand et al. Majorizing measures: the generic chaining. The Annals of Probability, 24 0 (3): 0 1049--1103, 1996

work page 1996
[40]

Banach-Mazur distances and finite-dimensional operator ideals, volume 38

Nicole Tomczak-Jaegermann. Banach-Mazur distances and finite-dimensional operator ideals, volume 38. Longman Sc & Tech, 1989

work page 1989
[41]

Peer effects in adolescent overweight

Justin G Trogdon, James Nonnemaker, and Joanne Pais. Peer effects in adolescent overweight. Journal of health economics, 27 0 (5): 0 1388--1399, 2008

work page 2008
[42]

On the uniform convergence of relative frequencies of events to their probabilities

Vladimir N Vapnik and A Ya Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. In Measures of complexity, pages 11--30. Springer, 2015

work page 2015
[43]

Combinatorial criteria for uniqueness of gibbs measures

Dror Weitz. Combinatorial criteria for uniqueness of gibbs measures. Random Structures & Algorithms, 27 0 (4): 0 445--475, 2005

work page 2005
[44]

Rates of convergence for empirical processes of stationary mixing sequences

Bin Yu. Rates of convergence for empirical processes of stationary mixing sequences. The Annals of Probability, pages 94--116, 1994

work page 1994

[1] [1]

The generalization ability of online algorithms for dependent data

Alekh Agarwal and John C Duchi. The generalization ability of online algorithms for dependent data. IEEE Transactions on Information Theory, 59 0 (1): 0 573--587, 2013

work page 2013

[2] [2]

Rademacher and gaussian complexities: Risk bounds and structural results

Peter L Bartlett and Shahar Mendelson. Rademacher and gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3 0 (Nov): 0 463--482, 2002

work page 2002

[3] [3]

On the boundedness of bernoulli processes

Witold Bednorz and Rafal Latala. On the boundedness of bernoulli processes. Annals of Mathematics, 180 0 (3): 0 1167--1203, 2014

work page 2014

[4] [4]

Rate of convergence of predictive distributions for dependent data

Patrizia Berti, Irene Crimaldi, Luca Pratelli, Pietro Rigo, et al. Rate of convergence of predictive distributions for dependent data. Bernoulli, 15 0 (4): 0 1351--1367, 2009

work page 2009

[5] [5]

Network effects and welfare cultures

Marianne Bertrand, Erzo FP Luttmer, and Sendhil Mullainathan. Network effects and welfare cultures. The Quarterly Journal of Economics, 115 0 (3): 0 1019--1055, 2000

work page 2000

[6] [6]

Identification of peer effects through social networks

Yann Bramoull \'e , Habiba Djebbari, and Bernard Fortin. Identification of peer effects through social networks. Journal of econometrics, 150 0 (1): 0 41--55, 2009

work page 2009

[7] [7]

Concentration Inequalities with Exchangeable Pairs

Sourav Chatterjee. Concentration Inequalities with Exchangeable Pairs. PhD thesis, Stanford University, June 2005 a

work page 2005

[8] [8]

Concentration inequalities with exchangeable pairs (Ph.D. thesis)

Sourav Chatterjee. Concentration inequalities with exchangeable pairs (ph. d. thesis). arXiv preprint math/0507526, 2005 b

work page internal anchor Pith review Pith/arXiv arXiv 2005

[9] [9]

Social contagion theory: examining dynamic social networks and human behavior

Nicholas A Christakis and James H Fowler. Social contagion theory: examining dynamic social networks and human behavior. Statistics in medicine, 32 0 (4): 0 556--577, 2013

work page 2013

[10] [10]

Testing I sing models

Constantinos Daskalakis, Nishanth Dikkala, and Gautam Kamath. Testing I sing models. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '18, Philadelphia, PA, USA, 2018. SIAM

work page 2018

[11] [11]

On statistical learning via the lens of compression

Ofir David, Shay Moran, and Amir Yehudayoff. On statistical learning via the lens of compression. In Proceedings of the 30th International Conference on Neural Information Processing Systems, pages 2792--2800. Curran Associates Inc., 2016

work page 2016

[12] [12]

The description of a random field by means of conditional probabilities and conditions of its regularity

PL Dobrushin. The description of a random field by means of conditional probabilities and conditions of its regularity. Theory of Probability & Its Applications, 13 0 (2): 0 197--224, 1968

work page 1968

[13] [13]

Completely analytical interactions: constructive description

RL Dobrushin and SB Shlosman. Completely analytical interactions: constructive description. Journal of Statistical Physics, 46 0 (5-6): 0 983--1014, 1987

work page 1987

[14] [14]

The role of information and social interactions in retirement plan decisions: Evidence from a randomized experiment

Esther Duflo and Emmanuel Saez. The role of information and social interactions in retirement plan decisions: Evidence from a randomized experiment. The Quarterly journal of economics, 118 0 (3): 0 815--842, 2003

work page 2003

[15] [15]

Boosting a weak learning algorithm by majority

Yoav Freund. Boosting a weak learning algorithm by majority. Information and computation, 121 0 (2): 0 256--285, 1995

work page 1995

[16] [16]

Concentration inequalities for polynomials of contracting Ising models

Reza Gheissari, Eyal Lubetzky, and Yuval Peres. Concentration inequalities for polynomials of contracting I sing models. arXiv preprint arXiv:1706.00121, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[17] [17]

Crime and social interactions

Edward L Glaeser, Bruce Sacerdote, and Jose A Scheinkman. Crime and social interactions. The Quarterly Journal of Economics, 111 0 (2): 0 507--548, 1996

work page 1996

[18] [18]

Waves i & ii, 1994--1996; wave iii, 2001--2002; wave iv, 2007--2009 [machine-readable data file and documentation]

Kathleen Mullan Harris, National Longitudinal Study of Adolescent Health, et al. Waves i & ii, 1994--1996; wave iii, 2001--2002; wave iv, 2007--2009 [machine-readable data file and documentation]. Chapel Hill, NC: Carolina Population Center, University of North Carolina at Chapel Hill, 10, 2009

work page 1994

[19] [19]

Concentration inequalities for functions of gibbs fields with application to diffraction and random gibbs measures

Christof K \"u lske. Concentration inequalities for functions of gibbs fields with application to diffraction and random gibbs measures. Communications in mathematical physics, 239 0 (1-2): 0 29--51, 2003

work page 2003

[20] [20]

Decay of correlations under dobrushin's uniqueness condition and its applications

H K \"u nsch. Decay of correlations under dobrushin's uniqueness condition and its applications. Communications in Mathematical Physics, 84 0 (2): 0 207--222, 1982

work page 1982

[21] [21]

Generalization bounds for time series prediction with non-stationary processes

Vitaly Kuznetsov and Mehryar Mohri. Generalization bounds for time series prediction with non-stationary processes. In Peter Auer, Alexander Clark, Thomas Zeugmann, and Sandra Zilles, editors, Algorithmic Learning Theory, pages 260--274, Cham, 2014. Springer International Publishing. ISBN 978-3-319-11662-4

work page 2014

[22] [22]

Learning theory and algorithms for forecasting non-stationary time series

Vitaly Kuznetsov and Mehryar Mohri. Learning theory and algorithms for forecasting non-stationary time series. In Advances in neural information processing systems, pages 541--549, 2015

work page 2015

[23] [23]

Generalization bounds for non-stationary mixing processes

Vitaly Kuznetsov and Mehryar Mohri. Generalization bounds for non-stationary mixing processes. Machine Learning, 106 0 (1): 0 93--117, Jan 2017. doi:10.1007/s10994-016-5588-2. URL https://doi.org/10.1007/s10994-016-5588-2

work page doi:10.1007/s10994-016-5588-2 2017

[24] [24]

Levin, Yuval Peres, and Elizabeth L

David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. M arkov Chains and Mixing Times . American Mathematical Society, 2009

work page 2009

[25] [25]

Prediction models for network-linked data

Tianxi Li, Elizaveta Levina, and Ji Zhu. Prediction models for network-linked data. arXiv preprint arXiv:1602.01192, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[26] [26]

Relating data compression and learnability

Nick Littlestone and Manfred Warmuth. Relating data compression and learnability. 1986

work page 1986

[27] [27]

Identification of endogenous social effects: The reflection problem

Charles F Manski. Identification of endogenous social effects: The reflection problem. The review of economic studies, 60 0 (3): 0 531--542, 1993

work page 1993

[28] [28]

Bounding d -distance by informational divergence: A method to prove measure concentration

Katalin Marton et al. Bounding d -distance by informational divergence: A method to prove measure concentration. The Annals of Probability, 24 0 (2): 0 857--866, 1996

work page 1996

[29] [29]

Rademacher complexity of stationary sequences

Daniel J. McDonald and Cosma Rohilla Shalizi. Rademacher complexity of stationary sequences. arXiv preprint arXiv:1106.0730, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[30] [30]

Rademacher complexity bounds for non-iid processes

Mehryar Mohri and Afshin Rostamizadeh. Rademacher complexity bounds for non-iid processes. In Advances in Neural Information Processing Systems, pages 1097--1104, 2009

work page 2009

[31] [31]

Stability bounds for stationary -mixing and -mixing processes

Mehryar Mohri and Afshin Rostamizadeh. Stability bounds for stationary -mixing and -mixing processes. Journal of Machine Learning Research, 11 0 (Feb): 0 789--814, 2010

work page 2010

[32] [32]

The spread of innovations in social networks

Andrea Montanari and Amin Saberi. The spread of innovations in social networks. Proceedings of the National Academy of Sciences, 107 0 (47): 0 20196--20201, 2010. ISSN 0027-8424. doi:10.1073/pnas.1004098107. URL https://www.pnas.org/content/107/47/20196

work page doi:10.1073/pnas.1004098107 2010

[33] [33]

Sample compression schemes for vc classes

Shay Moran and Amir Yehudayoff. Sample compression schemes for vc classes. Journal of the ACM (JACM), 63 0 (3): 0 21, 2016

work page 2016

[34] [34]

Predictive pac learnability: A paradigm for learning from exchangeable input data

Vladimir Pestov. Predictive pac learnability: A paradigm for learning from exchangeable input data. In Granular Computing (GrC), 2010 IEEE International Conference on, pages 387--391. IEEE, 2010

work page 2010

[35] [35]

Online learning: Random averages, combinatorial parameters, and learnability

Alexander Rakhlin, Karthik Sridharan, and Ambuj Tewari. Online learning: Random averages, combinatorial parameters, and learnability. In Advances in Neural Information Processing Systems, pages 1984--1992, 2010

work page 1984

[36] [36]

Peer effects with random assignment: Results for dartmouth roommates

Bruce Sacerdote. Peer effects with random assignment: Results for dartmouth roommates. The Quarterly journal of economics, 116 0 (2): 0 681--704, 2001

work page 2001

[37] [37]

Understanding machine learning: From theory to algorithms

Shai Shalev-Shwartz and Shai Ben-David. Understanding machine learning: From theory to algorithms. Cambridge university press, 2014

work page 2014

[38] [38]

The logarithmic sobolev inequality for discrete spin systems on a lattice

Daniel W Stroock and Boguslaw Zegarlinski. The logarithmic sobolev inequality for discrete spin systems on a lattice. Communications in Mathematical Physics, 149 0 (1): 0 175--193, 1992

work page 1992

[39] [39]

Majorizing measures: the generic chaining

Michel Talagrand et al. Majorizing measures: the generic chaining. The Annals of Probability, 24 0 (3): 0 1049--1103, 1996

work page 1996

[40] [40]

Banach-Mazur distances and finite-dimensional operator ideals, volume 38

Nicole Tomczak-Jaegermann. Banach-Mazur distances and finite-dimensional operator ideals, volume 38. Longman Sc & Tech, 1989

work page 1989

[41] [41]

Peer effects in adolescent overweight

Justin G Trogdon, James Nonnemaker, and Joanne Pais. Peer effects in adolescent overweight. Journal of health economics, 27 0 (5): 0 1388--1399, 2008

work page 2008

[42] [42]

On the uniform convergence of relative frequencies of events to their probabilities

Vladimir N Vapnik and A Ya Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. In Measures of complexity, pages 11--30. Springer, 2015

work page 2015

[43] [43]

Combinatorial criteria for uniqueness of gibbs measures

Dror Weitz. Combinatorial criteria for uniqueness of gibbs measures. Random Structures & Algorithms, 27 0 (4): 0 445--475, 2005

work page 2005

[44] [44]

Rates of convergence for empirical processes of stationary mixing sequences

Bin Yu. Rates of convergence for empirical processes of stationary mixing sequences. The Annals of Probability, pages 94--116, 1994

work page 1994