pith. sign in

arxiv: 2606.10916 · v1 · pith:PZ4WJHHGnew · submitted 2026-06-09 · 📊 stat.ML · cs.LG· math.ST· stat.ME· stat.TH

Range Penalization: Theoretical Insights with Applications in Federated Learning

Pith reviewed 2026-06-27 11:31 UTC · model grok-4.3

classification 📊 stat.ML cs.LGmath.STstat.MEstat.TH
keywords range regularizationfederated learningstatistical accuracypattern recoverypolar clusteringnonasymptotic analysisoptimization algorithm
0
0 comments X

The pith

Range penalization enhances statistical accuracy and induces cross-client regularity in federated learning with linear systematic components.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes range regularization for federated learning to boost statistical accuracy and create regularity across clients that supports efficient coding and resource use. The approach finds features that share the same weights among clients and groups the weights of personalized features toward their extreme values, called polar clustering. Theoretical analysis is difficult because the regularizer is a seminorm and cannot be decomposed easily. New proof methods are introduced to provide nonasymptotic bounds on statistical accuracy and to guarantee recovery of the underlying patterns. A fast optimization algorithm is developed that takes advantage of different levels of local strong convexity to cut down on the number of iterations needed.

Core claim

Range regularization applied to federated learning estimators with linear systematic components achieves enhanced statistical accuracy and induces cross-client regularity through the identification of shared weights and polar clustering of personalized feature weights, with new proof techniques overcoming the seminorm and non-decomposability to deliver nonasymptotic analysis and faithful pattern recovery, complemented by a fast optimization algorithm leveraging local strong convexity.

What carries the argument

The range regularizer, which penalizes differences in weights across clients to promote either shared weights or clustering at extreme values.

If this is right

  • Enhanced statistical accuracy for the estimators in federated settings.
  • Induced cross-client regularity that aids quantization, coding, and resource efficiency.
  • Faithful recovery of shared and personalized patterns.
  • Reduced iteration complexity in the optimization process due to local strong convexity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • These techniques might generalize to other regularizers with similar seminorm properties in distributed estimation problems.
  • The polar clustering could be tested in settings with nonlinear models to see if similar regularity emerges.
  • Applications beyond federated learning, such as in multi-task learning with shared components, may benefit from the range penalization approach.

Load-bearing premise

The new proof techniques can overcome the seminorm nature and non-decomposability of the range regularizer to achieve the nonasymptotic statistical accuracy and pattern recovery guarantees.

What would settle it

A simulation where the range-penalized federated estimator does not show improved accuracy or fails to recover the shared features compared to standard methods would challenge the central claims.

Figures

Figures reproduced from arXiv: 2606.10916 by Yifan Sun, Yiyuan She, Zhaojun Hu.

Figure 1
Figure 1. Figure 1: As λ varies in the proximity problem, enforcing the range penalty can produce estimates with distinct components, estimates that cluster at extreme values, and a uniform estimate. The effects of shrinkage on extreme values and polar clustering are evident. and λ := 1 2 Xm k=1 |yk − y¯| (= ∥y∥R∗ = 1 2 ∥P⊥ Ky∥1). (11) Suppose λ ∈ [¯ck1 , c¯k1+1) and λ ∈ [ck2 , ck2+1) for some k1 = k1(λ), k2 = k2(λ) ∈ [m] and… view at source ↗
read the original abstract

This paper introduces range regularization for federated learning with linear systematic components to enhance statistical accuracy and induce cross-client regularity conducive to quantization, coding, and resource efficiency. Our approach identifies features with shared weights across different clients and adaptively clusters the weights of personalized features at extreme values, a process we refer to as polar clustering. Theoretical analysis of the associated estimators poses significant challenges due to the seminorm nature and non-decomposability of the regularizer. We develop new proof techniques for the nonasymptotic analysis of statistical accuracy and faithful pattern recovery. Moreover, a fast optimization algorithm that leverages varying degrees of local strong convexity is proposed to reduce iteration complexity. Experiments support the efficacy and efficiency of the proposed approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper introduces range regularization for federated learning with linear systematic components. The method identifies features with shared weights across clients and performs polar clustering on personalized features by adaptively driving their weights to extreme values. It claims new proof techniques to overcome the seminorm nature and non-decomposability of the regularizer for nonasymptotic statistical accuracy and faithful pattern recovery, proposes a fast optimization algorithm exploiting local strong convexity, and reports experiments supporting efficacy.

Significance. If the new proof techniques deliver the claimed nonasymptotic bounds and pattern recovery despite the non-decomposable seminorm, the work would provide useful theoretical tools for federated linear models that promote cross-client regularity for downstream efficiency gains in quantization and coding. The optimization algorithm leveraging varying local strong convexity could also reduce practical iteration counts. However, the absence of any derivations, explicit error bounds, or dataset details in the provided text leaves the significance unassessable at present.

major comments (1)
  1. [Abstract] Abstract: the central claims rest on new proof techniques that overcome the seminorm and non-decomposability of the range regularizer to obtain nonasymptotic statistical accuracy and pattern recovery, yet no theorem statements, proof sketches, error bounds, or assumption lists are supplied, rendering it impossible to verify whether the techniques succeed on the load-bearing technical obstacles.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their feedback. We respond to the major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the central claims rest on new proof techniques that overcome the seminorm and non-decomposability of the range regularizer to obtain nonasymptotic statistical accuracy and pattern recovery, yet no theorem statements, proof sketches, error bounds, or assumption lists are supplied, rendering it impossible to verify whether the techniques succeed on the load-bearing technical obstacles.

    Authors: The abstract is a concise summary and does not contain full technical details by design. The complete manuscript supplies the requested elements: theorem statements and nonasymptotic bounds appear in Section 3, proof sketches addressing the seminorm and non-decomposability are in Section 4 and the appendix, explicit error bounds are stated under the listed assumptions in Section 2, and pattern recovery guarantees are derived in Theorem 4.2. If only the abstract was provided for review, the full text enables verification. We are willing to add a brief reference to the main theorems in a revised abstract. revision: partial

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper presents range penalization as a new regularization approach for federated learning, with explicitly new proof techniques developed to handle the seminorm and non-decomposability issues for nonasymptotic statistical accuracy and pattern recovery. No equations or claims in the abstract reduce by construction to fitted inputs, self-definitions, or load-bearing self-citations; the central results are positioned as independent derivations relying on the proposed methods rather than renaming or smuggling prior results. The derivation chain is self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

Review is based solely on the abstract; no explicit free parameters, axioms, or invented entities beyond the named regularizer and clustering process are detailed.

invented entities (2)
  • range regularization no independent evidence
    purpose: Enhance statistical accuracy and induce cross-client regularity for quantization and efficiency in federated learning
    New regularizer introduced in the abstract
  • polar clustering no independent evidence
    purpose: Adaptively cluster weights of personalized features at extreme values
    Process described as part of the approach

pith-pipeline@v0.9.1-grok · 5656 in / 1184 out tokens · 31036 ms · 2026-06-27T11:31:08.112743+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

66 extracted references · 2 canonical work pages

  1. [1]

    Federated learning with personalization layers,

    M. G. Arivazhagan, V . Aggarwal, A. K. Singh, and S. Choudhary, “Federated learning with personalization layers,”arXiv preprint arXiv:1912.00818, 2019

  2. [2]

    Personalized federated learning for intelligent IoT applications: A cloud-edge based framework,

    Q. Wu, K. He, and X. Chen, “Personalized federated learning for intelligent IoT applications: A cloud-edge based framework,”IEEE Open Journal of the Computer Society, vol. 1, pp. 35–44, 2020

  3. [3]

    Federated learning with partial model personalization,

    K. Pillutla, K. Malik, A.-R. Mohamed, M. Rabbat, M. Sanjabi, and L. Xiao, “Federated learning with partial model personalization,” inProceedings of the 39th International Conference on Machine Learning, vol. 162, 2022, pp. 17 716– 17 758

  4. [4]

    Communication-efficient learning of deep networks from decentralized data,

    B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y. Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProceedings of the 20th International Conference on Artificial Intelligence and Statistics, vol. 54, 2017, pp. 1273–1282

  5. [5]

    An efficient framework for clustered federated learning,

    A. Ghosh, J. Chung, D. Yin, and K. Ramchandran, “An efficient framework for clustered federated learning,” inAdvances in Neural Information Processing Systems, vol. 33, 2020, pp. 19 586–19 597

  6. [6]

    Robust personalized federated learning with sparse penalization,

    W. Liu, X. Mao, X. Zhang, and X. Zhang, “Robust personalized federated learning with sparse penalization,”Journal of the American Statistical Association, vol. 120, no. 549, pp. 266–277, 2025

  7. [7]

    Sparse regression with exact clustering,

    Y . She, “Sparse regression with exact clustering,”Electronic Journal of Statistics, vol. 4, pp. 1055–1096, 2010

  8. [8]

    Splitting methods for convex clustering,

    E. C. Chi and K. Lange, “Splitting methods for convex clustering,”Journal of Computational and Graphical Statistics, vol. 24, no. 4, pp. 994–1013, 2015

  9. [9]

    Fused lasso approach in regression coefficients clustering: learning parameter heterogeneity in data integration,

    L. Tang and P. X. Song, “Fused lasso approach in regression coefficients clustering: learning parameter heterogeneity in data integration,”Journal of Machine Learning Research, vol. 17, no. 113, pp. 1–23, 2016

  10. [10]

    Supervised multivariate learning with simultaneous feature auto-grouping and dimension reduction,

    Y . She, J. Shen, and C. Zhang, “Supervised multivariate learning with simultaneous feature auto-grouping and dimension reduction,”Journal of the Royal Statistical Society Series B: Statistical Methodology, vol. 84, no. 3, pp. 912–932, 2022

  11. [11]

    Communication and computation efficiency in federated learning: A survey,

    O. R. A. Almanifi, C.-O. Chow, M.-L. Tham, J. H. Chuah, and J. Kanesan, “Communication and computation efficiency in federated learning: A survey,”Internet of Things, vol. 22, p. 100742, 2023

  12. [12]

    The benefit of multitask representation learning,

    A. Maurer, M. Pontil, and B. Romera-Paredes, “The benefit of multitask representation learning,”Journal of Machine Learning Research, vol. 17, no. 81, pp. 1–32, 2016

  13. [13]

    Adaptive personalized federated learning,

    Y . Deng, M. M. Kamani, and M. Mahdavi, “Adaptive personalized federated learning,”arXiv preprint arXiv:2003.13461, 2020

  14. [14]

    Fedhb: Hierarchical bayesian federated learning,

    M. Kim and T. Hospedales, “Fedhb: Hierarchical bayesian federated learning,”Journal of Machine Learning Research, vol. 26, no. 272, pp. 1–50, 2025

  15. [15]

    Personalized federated learning with theoretical guarantees: A model-agnostic meta-learning approach,

    A. Fallah, A. Mokhtari, and A. Ozdaglar, “Personalized federated learning with theoretical guarantees: A model-agnostic meta-learning approach,”Advances in neural information processing systems, vol. 33, pp. 3557–3568, 2020

  16. [16]

    Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding,

    S. Han, H. Mao, and W. J. Dally, “Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding,” inInternational Conference on Learning Representations, 2016

  17. [17]

    Robust and communication-efficient federated learning from non-i.i.d. data,

    F. Sattler, S. Wiedemann, K.-R. M ¨uller, and W. Samek, “Robust and communication-efficient federated learning from non-i.i.d. data,”IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 9, pp. 3400–3413, 2020. 39

  18. [18]

    Model compression for communication efficient federated learning,

    S. M. Shah and V . K. N. Lau, “Model compression for communication efficient federated learning,”IEEE Transactions on Neural Networks and Learning Systems, vol. 34, no. 9, pp. 5937–5951, 2021

  19. [19]

    Toward energy-efficient federated learning over 5G+ mobile devices,

    D. Shi, L. Li, R. Chen, P. Prakash, M. Pan, and Y . Fang, “Toward energy-efficient federated learning over 5G+ mobile devices,”IEEE Wireless Communications, vol. 29, no. 5, pp. 44–51, 2022

  20. [20]

    Compressive differentially private federated learning through universal vector quantization,

    S. Amiri, A. Belloum, S. Klous, and L. Gommans, “Compressive differentially private federated learning through universal vector quantization,” inAAAI Workshop on Privacy-Preserving Artificial Intelligence, 2021, pp. 2–9

  21. [21]

    Sparse regression with exact clustering,

    Y . She, “Sparse regression with exact clustering,” Ph.D. dissertation, Stanford University, 2008

  22. [22]

    M. J. Wainwright,High-Dimensional Statistics: A Non-Asymptotic Viewpoint, ser. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019

  23. [23]

    Dual seminorms, ergodic coefficients and semicontraction theory,

    G. De Pasquale, K. D. Smith, F. Bullo, and M. E. Valcher, “Dual seminorms, ergodic coefficients and semicontraction theory,”IEEE Transactions on Automatic Control, vol. 69, no. 5, pp. 3040–3053, 2024

  24. [24]

    R. T. Rockafellar,Convex Analysis. Princeton University Press, 1997

  25. [25]

    Computational methods to identify bimodal gene expression and facilitate personalized treatment in cancer patients,

    L. Moody, S. Mantha, H. Chen, and Y . Pan, “Computational methods to identify bimodal gene expression and facilitate personalized treatment in cancer patients,”Journal of Biomedical Informatics, vol. 100, p. 100001, 2019

  26. [26]

    The bimodality index: A criterion for discovering and ranking bimodal signatures from cancer gene expression profiling data,

    J. Wang, S. Wen, W. F. Symmans, L. Pusztai, and K. R. Coombes, “The bimodality index: A criterion for discovering and ranking bimodal signatures from cancer gene expression profiling data,”Cancer Informatics, vol. 7, pp. 199–216, 2009

  27. [27]

    Comparing coefficients across subpopulations in gaussian mixture regression models,

    S.-F. Tsai, “Comparing coefficients across subpopulations in gaussian mixture regression models,”Journal of Agricultural, Biological and Environmental Statistics, vol. 24, no. 4, pp. 610–633, 2019

  28. [28]

    Predicting the loss given default distribution with the zero-inflated censored beta- mixture regression that allows probability masses and bimodality,

    R.-C. Hwang, C.-K. Chu, and K. Yu, “Predicting the loss given default distribution with the zero-inflated censored beta- mixture regression that allows probability masses and bimodality,”Journal of Financial Services Research, vol. 59, no. 3, pp. 143–172, 2021

  29. [29]

    The relationships between PM 2.5 and meteorological factors in china: Seasonal and regional variations,

    Q. Yanget al., “The relationships between PM 2.5 and meteorological factors in china: Seasonal and regional variations,” International Journal of Environmental Research and Public Health, vol. 14, no. 12, p. 1510, 2017

  30. [30]

    Effect of vertical wind shear on PM 2.5 changes over a complex terrain region,

    X. Sun, Y . Zhou, T. Zhao, Y . Bai, T. Huo, L. Leng, H. He, and J. Sun, “Effect of vertical wind shear on PM 2.5 changes over a complex terrain region,”Remote Sensing, vol. 14, no. 14, p. 3333, 2022

  31. [31]

    Analysis of generalized bregman surrogate algorithms for nonsmooth nonconvex statistical learning,

    Y . She, Z. Wang, and J. Jin, “Analysis of generalized bregman surrogate algorithms for nonsmooth nonconvex statistical learning,”The Annals of Statistics, vol. 49, no. 6, pp. 3434–3459, 2021

  32. [32]

    On the conditions used to prove oracle results for the lasso,

    S. A. van de Geer and P. B ¨uhlmann, “On the conditions used to prove oracle results for the lasso,”Electronic Journal of Statistics, vol. 3, pp. 1360–1392, 2009

  33. [33]

    Restricted eigenvalue properties for correlated gaussian designs,

    G. Raskutti, M. J. Wainwright, and B. Yu, “Restricted eigenvalue properties for correlated gaussian designs,”Journal of Machine Learning Research, vol. 11, pp. 2241–2259, 2010

  34. [34]

    A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers,

    S. N. Negahban, P. Ravikumar, M. J. Wainwright, and B. Yu, “A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers,”Statistical Science, vol. 27, no. 4, pp. 538–557, 2012

  35. [35]

    Regularized M-estimators with nonconvexity: statistical and algorithmic theory for local optima,

    P.-L. Loh and M. J. Wainwright, “Regularized M-estimators with nonconvexity: statistical and algorithmic theory for local optima,”Journal Machine Learning Research, vol. 16, no. 19, pp. 559–616, 2015

  36. [36]

    Concentration inequalities for polynomials inα-sub-exponential random variables,

    F. G ¨otze, H. Sambale, and A. Sinulis, “Concentration inequalities for polynomials inα-sub-exponential random variables,” Electronic Journal of Probability, vol. 26, pp. 1–22, 2021

  37. [37]

    Selective factor extraction in high dimensions,

    Y . She, “Selective factor extraction in high dimensions,”Biometrika, vol. 104, no. 1, pp. 97–110, 2017

  38. [38]

    Vershynin,High-Dimensional Probability: An Introduction with Applications in Data Science, ser

    R. Vershynin,High-Dimensional Probability: An Introduction with Applications in Data Science, ser. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2018

  39. [39]

    On model selection consistency of lasso,

    P. Zhao and B. Yu, “On model selection consistency of lasso,”Journal of Machine Learning Research, vol. 7, no. 90, pp. 2541–2563, 2006

  40. [40]

    Sharp thresholds for high-dimensional and noisy sparsity recovery usingℓ 1-constrained quadratic programming (lasso),

    M. J. Wainwright, “Sharp thresholds for high-dimensional and noisy sparsity recovery usingℓ 1-constrained quadratic programming (lasso),”IEEE Transactions on Information Theory, vol. 55, no. 5, pp. 2183–2202, 2009

  41. [41]

    Federated optimization in heterogeneous networks,

    T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V . Smith, “Federated optimization in heterogeneous networks,” inProceedings of Machine Learning and Systems, vol. 2, 2020, pp. 429–450

  42. [42]

    Fast global convergence of gradient methods for high-dimensional statistical recovery,

    A. Agarwal, S. Negahban, and M. J. Wainwright, “Fast global convergence of gradient methods for high-dimensional statistical recovery,”The Annals of Statistics, vol. 40, no. 5, pp. 2452–2482, 2012

  43. [43]

    Fedpaq: A communication-efficient federated learning method with periodic averaging and quantization,

    A. Reisizadeh, A. Mokhtari, H. Hassani, A. Jadbabaie, and R. Pedarsani, “Fedpaq: A communication-efficient federated learning method with periodic averaging and quantization,” inProceedings of the 23rd International Conference on Artificial Intelligence and Statistics, vol. 108, 2020, pp. 2021–2031

  44. [44]

    Model pruning enables efficient federated learning on edge devices,

    Y . Jiang, S. Wang, V . Valls, B. J. Ko, W.-H. Lee, K. K. Leung, and L. Tassiulas, “Model pruning enables efficient federated learning on edge devices,”IEEE Transactions on Neural Networks and Learning Systems, vol. 34, no. 12, pp. 10 374–10 386, 2022

  45. [45]

    Cfd: Communication-efficient federated distillation via soft-label quantization and delta coding,

    F. Sattler, A. Marban, R. Rischke, and W. Samek, “Cfd: Communication-efficient federated distillation via soft-label quantization and delta coding,”IEEE Transactions on Network Science and Engineering, vol. 9, no. 4, pp. 2025–2038, 2021. 40

  46. [46]

    Proxskip: Yes! Local gradient steps provably lead to communication acceleration! Finally!

    K. Mishchenko, G. Malinovsky, S. Stich, and P. Richt ´arik, “Proxskip: Yes! Local gradient steps provably lead to communication acceleration! Finally!” inProceedings of the 39th International Conference on Machine Learning, vol. 162, 2022, pp. 15 750–15 769

  47. [47]

    Client selection for federated learning with heterogeneous resources in mobile edge,

    T. Nishio and R. Yonetani, “Client selection for federated learning with heterogeneous resources in mobile edge,” inICC 2019-2019 IEEE international conference on communications (ICC), 2019, pp. 1–7

  48. [48]

    On an approach to the construction of optimal methods of minimization of smooth convex functions,

    Y . Nesterov, “On an approach to the construction of optimal methods of minimization of smooth convex functions,” Ekonom. i. Mat. Metody (In Russian), vol. 24, no. 3, pp. 509–517, 1988

  49. [49]

    Approximation accuracy, gradient methods, and error bound for structured convex optimization,

    P. Tseng, “Approximation accuracy, gradient methods, and error bound for structured convex optimization,”Mathematical Programming, vol. 125, no. 2, pp. 263–295, 2010

  50. [50]

    A. B. Tsybakov,Introduction to Nonparametric Estimation, 1st ed. Springer Publishing Company, Incorporated, 2008

  51. [51]

    Exponential Screening and optimal rates of sparse estimation,

    P. Rigollet and A. Tsybakov, “Exponential Screening and optimal rates of sparse estimation,”The Annals of Statistics, vol. 39, no. 2, pp. 731 – 771, 2011

  52. [52]

    Calculus on Gauss space: An introduction to Gaussian analysis,

    T. Alberts and D. Khoshnevisan, “Calculus on Gauss space: An introduction to Gaussian analysis,” 2018, https://www.math.utah.edu/ davar/math7880/F18/Gaussi- anAnalysis.pdf

  53. [53]

    Communities and Crime

    M. Redmond, “Communities and Crime,” UCI Machine Learning Repository, 2009, DOI: https://doi.org/10.24432/C53W3X

  54. [54]

    A multiobjective exploratory procedure for regression model selection,

    A. Sinha, P. Malo, and T. Kuosmanen, “A multiobjective exploratory procedure for regression model selection,”Journal of Computational and Graphical Statistics, vol. 24, no. 1, pp. 154–182, 2015

  55. [55]

    High-dimensional integrative analysis with homogeneity and sparsity recovery,

    X. Yang, X. Yan, and J. Huang, “High-dimensional integrative analysis with homogeneity and sparsity recovery,”Journal of Multivariate Analysis, vol. 174, p. 104529, 2019

  56. [56]

    Using machine learning to identify top antecedents affecting crime in us communities,

    K. Samara, “Using machine learning to identify top antecedents affecting crime in us communities,” inAdvances in Information and Communication. Springer, 2023, pp. 96–101

  57. [57]

    On cross-validation for sparse reduced rank regression,

    Y . She and H. Tran, “On cross-validation for sparse reduced rank regression,”Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 81, no. 1, pp. 145–161, 2019

  58. [58]

    Social inequality, crime, and deviance,

    R. L. Matsueda and M. S. Grigoryeva, “Social inequality, crime, and deviance,”Handbook of the social psychology of inequality, pp. 683–714, 2014

  59. [59]

    ‘seattle’s best practices in the 1990s: Municipal-led economic and workforce development,

    B. Watrus and J. Haavig, “‘seattle’s best practices in the 1990s: Municipal-led economic and workforce development,” Economic Development in American Cities: The Pursuit of an Equity Agenda, pp. 111–132, 2007

  60. [60]

    2017 , howpublished =

    S. Chen, “Beijing Multi-Site Air Quality,” UCI Machine Learning Repository, 2019, DOI: https://doi.org/10.24432/C5RK5G

  61. [61]

    PM10 and PM2.5 real-time prediction models using an interpolated convolutional neural network,

    S. Chae, J. Shin, S. Kwon, S. Lee, S. Kang, and D. Lee, “PM10 and PM2.5 real-time prediction models using an interpolated convolutional neural network,”Scientific Reports, vol. 11, no. 1, p. 11952, 2021

  62. [62]

    Improving federated learning personalization via model agnostic meta learning,

    Y . Jiang, J. Kone ˇcn`y, K. Rush, and S. Kannan, “Improving federated learning personalization via model agnostic meta learning,”arXiv preprint arXiv:1909.12488, 2019

  63. [63]

    Motley: Benchmarking heterogeneity and personalization in federated learning,

    S. Wu, T. Li, Z. Charles, Y . Xiao, Z. Liu, Z. Xu, and V . Smith, “Motley: Benchmarking heterogeneity and personalization in federated learning,” inProceedings of the Workshop on Federated Learning: Recent Advances and New Challenges (in Conjunction with NeurIPS 2022), 2022, neurIPS Federated Learning Workshop

  64. [64]

    Meteorological and urban landscape factors on severe air pollution in beijing,

    L. Han, W. Zhou, W. Li, D. T. Meshesha, L. Li, and M. Zheng, “Meteorological and urban landscape factors on severe air pollution in beijing,”Journal of the Air & Waste Management Association, vol. 65, no. 7, pp. 782–787, 2015

  65. [65]

    Impacts of complex terrain features on local wind field and PM2. 5 concentration,

    Y . Song and M. Shao, “Impacts of complex terrain features on local wind field and PM2. 5 concentration,”Atmosphere, vol. 14, no. 5, p. 761, 2023

  66. [66]

    Research on the pollution characteristics and causality of haze-sand air pollution in Beijing in spring,

    Y . Wang, Q. Li, Z. Zheng, and Y . Dou, “Research on the pollution characteristics and causality of haze-sand air pollution in Beijing in spring,”Environmental Science, vol. 40, no. 6, pp. 2582–2594, 2019