
arxiv: 2506.01404 · v2 · submitted 2025-06-02 · cs.LG · cs.MA · cs.SY · eess.SY

Quantitative Error Feedback for Quantization Noise Reduction of Filtering over Graphs

Pith reviewed 2026-05-19 11:15 UTC · model grok-4.3

classification cs.LG · cs.MA · cs.SY · eess.SY
keywords quantization noise · error feedback · graph filtering · distributed algorithms · quantized communication · decentralized optimization · random graphs · asynchronous updates

The pith

Quantitative error feedback compensates quantization noise exactly in distributed graph filtering by feeding back isolated errors with closed-form optimal coefficients.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework that isolates quantization noise during message passing in graph filters and feeds it back quantitatively for compensation, drawing from error spectrum shaping in digital filters. It analyzes this under deterministic graphs, random graphs, and random asynchronous node updates, deriving closed-form solutions for the best feedback coefficients in each case. The result is lower overall error in the filtering output without extra communication overhead. This mechanism also integrates into broader decentralized optimization routines to push down their error floors. Experiments confirm consistent gains over standard quantization approaches in accuracy and stability.

Core claim

By quantitatively feeding back the exact quantization noise term within the graph filtering update rules, the framework achieves substantial reduction in quantization-induced error, with explicit closed-form expressions for the optimal feedback coefficients that apply to deterministic filtering, filtering on random graphs, and filtering under random node-asynchronous updates.

What carries the argument

The quantitative error feedback mechanism that isolates the quantization noise term and subtracts a scaled version of it from subsequent updates to achieve exact compensation.
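
A minimal sketch of the idea, not the paper's algorithm. It assumes a single one-hop exchange in which each node transmits a uniformly quantized value, keeps its own error e = x − Q(x) locally, and adds a scaled copy back after combining its neighbours' messages. The shift matrix `S`, the step `delta`, and the diagonal coefficients `d` are illustrative assumptions, and `one_hop_with_feedback` is a hypothetical helper.

```python
import numpy as np

def quantize(x, delta):
    """Uniform mid-tread quantizer with step `delta` (an assumed quantizer model)."""
    return delta * np.round(x / delta)

def one_hop_with_feedback(S, x, d, delta):
    """One quantized exchange followed by node-local error feedback.

    Each node sends Q(x_i). Only the sender knows e_i = x_i - Q(x_i),
    so the feedback is diagonal and needs no extra communication.
    """
    q = quantize(x, delta)
    e = x - q                      # locally known quantization error
    return S @ q + d * e, e

# Hypothetical 4-node path graph with self-loops, rows scaled to sum to 1.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
S = A / A.sum(axis=1, keepdims=True)

x = np.array([0.37, -1.12, 0.58, 2.41])
delta = 0.25
d = np.diag(S).copy()              # one simple choice: each node's self-weight

x_fb, e = one_hop_with_feedback(S, x, d, delta)
exact = S @ x                      # what an unquantized exchange would give

# The residual is exactly -(S - diag(d)) e: feedback reshapes the noise path.
residual = x_fb - exact
print("residual:", residual)
print("matches -(S - D) e:", np.allclose(residual, -(S - np.diag(d)) @ e))
```

In this toy model the residual would vanish only for a full feedback matrix equal to `S`, which would require each node to know its neighbours' errors. With node-local coefficients the benefit is a mean-square one over many steps, so a single draw like this one says nothing about average performance.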

If this is right

  • The effect of quantization noise is significantly reduced in distributed graph filtering across the three analyzed scenarios.
  • Closed-form optimal error feedback coefficients are available for deterministic, random-graph, and asynchronous cases.
  • The same mechanism integrates directly into communication-efficient decentralized optimization to achieve lower error floors.
  • Numerical results show consistent outperformance versus conventional quantization strategies in both accuracy and robustness.
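
To make "closed-form coefficient" concrete, here is a hedged sketch under assumptions that are not taken from the paper: a stable first-order recursion x ← S·Q(x) + D·e + u, diagonal (node-local) feedback D, and zero-mean white quantization error. Under those assumptions the steady-state noise power is proportional to tr[(S − D)ᵀ W (S − D)] with W = Σₖ (Sᵏ)ᵀ Sᵏ, and the minimizing diagonal entry is dᵢ = [WS]ᵢᵢ / [W]ᵢᵢ. The ring graph and all function names are hypothetical.

```python
import numpy as np

def noise_gramian(S, terms=2000):
    """W = sum_k (S^k)^T S^k, truncated; converges when the spectral radius of S is below 1."""
    n = S.shape[0]
    W = np.zeros((n, n))
    P = np.eye(n)                  # holds S^k
    for _ in range(terms):
        W += P.T @ P
        P = S @ P
    return W

def noise_gain(S, d, W):
    """Steady-state noise power per unit error variance for residual -(S - diag(d)) e."""
    R = S - np.diag(d)
    return np.trace(R.T @ W @ R)

def optimal_diagonal(S, W):
    """Minimizer of noise_gain over diagonal feedback: d_i = [W S]_ii / [W]_ii."""
    return np.diag(W @ S) / np.diag(W)

# Hypothetical 8-node ring, scaled so the recursion is stable (spectral radius 0.9).
n = 8
ring = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
S = 0.9 * (0.5 * np.eye(n) + 0.25 * ring)

W = noise_gramian(S)
d_opt = optimal_diagonal(S, W)
gain_plain = noise_gain(S, np.zeros(n), W)
gain_fb = noise_gain(S, d_opt, W)
print(f"noise gain without feedback: {gain_plain:.3f}")
print(f"noise gain with feedback:    {gain_fb:.3f}")
```

Because zero feedback is one of the candidates the minimization ranges over, the optimized gain can never exceed the plain one in this model. How large the gap is depends on the graph and on how the paper's actual filter structure differs from this first-order stand-in.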

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach may extend to other message-passing algorithms such as consensus or distributed learning where quantization is a bottleneck.
  • In sensor networks or edge devices with strict bit constraints, this could enable higher effective precision without increasing bandwidth.
  • Similar feedback ideas might apply to quantization in neural network inference on graphs or other structured data.

Load-bearing premise

The quantization noise term can be precisely isolated from the signal and fed back for exact compensation without needing extra communication bits or breaking the rules of the three filtering scenarios.

What would settle it

The central claim would be falsified by an experiment on a small deterministic graph in which the measured steady-state error, after applying the derived optimal coefficients, fails to match the predicted reduction, or in which isolating the noise turns out to require additional communication bits.
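
A rough sketch of what such a check could look like, under stated assumptions rather than the paper's setup: a first-order recursion on a hypothetical 8-node ring, a mid-tread uniform quantizer, Gaussian driving inputs, and a white-noise prediction σ²·tr[(S − D)ᵀ W (S − D)] with σ² = Δ²/12. The coefficient formula dᵢ = [WS]ᵢᵢ / [W]ᵢᵢ is the minimizer for this toy model only.

```python
import numpy as np

def quantize(x, delta):
    """Uniform mid-tread quantizer (assumed)."""
    return delta * np.round(x / delta)

def measured_msd(S, d, delta, inputs):
    """Time-averaged squared deviation between quantized and exact recursions."""
    n = S.shape[0]
    x_exact = np.zeros(n)
    x_q = np.zeros(n)
    sq_dev = []
    for u in inputs:
        q = quantize(x_q, delta)
        e = x_q - q                          # node-local quantization error
        x_q = S @ q + d * e + u              # quantized exchange plus feedback
        x_exact = S @ x_exact + u            # reference without quantization
        sq_dev.append(np.sum((x_q - x_exact) ** 2))
    return np.mean(sq_dev[len(sq_dev) // 2:])  # discard the transient half

def predicted_msd(S, d, W, delta):
    """White-noise prediction for the same quantity."""
    R = S - np.diag(d)
    return (delta ** 2 / 12) * np.trace(R.T @ W @ R)

n, delta = 8, 0.05
ring = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
S = 0.9 * (0.5 * np.eye(n) + 0.25 * ring)
W = np.linalg.inv(np.eye(n) - S @ S)         # valid because this S is symmetric
d_opt = np.diag(W @ S) / np.diag(W)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(4000, n))

for name, d in [("no feedback", np.zeros(n)), ("feedback", d_opt)]:
    m = measured_msd(S, d, delta, inputs)
    p = predicted_msd(S, d, W, delta)
    print(f"{name:12s} measured {m:.3e}  predicted {p:.3e}")
```

A persistent gap between the two columns for the feedback row would be the kind of evidence the paragraph above describes. The bit budget is unchanged by construction here, since only `quantize(x_q, delta)` crosses an edge.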

Figures

Figures reproduced from arXiv: 2506.01404 by Stefan Vlaski, Tareq Al-Naffouri, Weihang Liu, Xin Lou, Xue Xian Zheng.

Figure 1: MSD versus filter order/iteration of quantized graph filtering on deterministic processes.
Figure 2: MSD versus probability/iteration of quantized graph filtering on random processes.
Figure 3: MSD versus iteration of quantized regression over a network,
Original abstract

This paper introduces an innovative error feedback framework designed to mitigate quantization noise in distributed graph filtering, where communications are constrained to quantized messages. It comes from error spectrum shaping techniques from state-space digital filters, and therefore establishes connections between quantized filtering processes over different domains. In contrast to existing error compensation methods, our framework quantitatively feeds back the quantization noise for exact compensation. We examine the framework under three key scenarios: (i) deterministic graph filtering, (ii) graph filtering over random graphs, and (iii) graph filtering with random node-asynchronous updates. Rigorous theoretical analysis demonstrates that the proposed framework significantly reduces the effect of quantization noise, and we provide closed-form solutions for the optimal error feedback coefficients. Moreover, this quantitative error feedback mechanism can be seamlessly integrated into communication-efficient decentralized optimization frameworks, enabling lower error floors. Numerical experiments validate the theoretical results, consistently showing that our method outperforms conventional quantization strategies in terms of both accuracy and robustness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a quantitative error feedback framework, adapted from error spectrum shaping in digital filters, to mitigate quantization noise in distributed graph filtering under quantized communications. It derives closed-form optimal error feedback coefficients for exact compensation in three scenarios—deterministic graph filtering, filtering over random graphs, and filtering with random node-asynchronous updates—claims significant noise reduction via rigorous analysis, and shows integration into decentralized optimization with lower error floors, supported by numerical experiments.

Significance. If the exact local compensation holds without hidden communication or state requirements, the closed-form solutions and DSP-to-graph-filtering connections could meaningfully advance communication-efficient distributed algorithms on graphs, particularly in asynchronous or random settings common to sensor networks and decentralized ML. The numerical validation showing consistent outperformance over standard quantization is a concrete strength.

major comments (2)
  1. [§4.3] §4.3 (random node-asynchronous updates): the derivation of optimal feedback coefficients assumes each node can locally isolate the precise quantization error e = x − Q(x) and inject it to achieve exact global cancellation in the subsequent iteration. However, the asynchronous activation schedule couples the effective filtering operator to which nodes transmit at each step; without the feedback term explicitly depending on the (unobserved) activation pattern of neighbors, the claimed exactness guarantee does not follow from the local information available at each node.
  2. [§3–§4] §3–§4 (closed-form derivations): the abstract and introduction assert 'rigorous theoretical analysis' yielding closed-form optimal coefficients, yet the main text provides only the final expressions without the intermediate steps that isolate the quantization noise term and solve the resulting quadratic optimization. This omission prevents verification that the noise-reduction claim remains exact rather than approximate once the three scenario-specific constraints are imposed.
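
The gap that major comment 1 points at, between a guarantee in expectation and a path-wise one, can be illustrated with a small sketch. It assumes a generic asynchronous model in which each node independently applies its row of the shift operator with probability pᵢ and otherwise holds its value. This model, the matrix `S`, and the probabilities `p` are assumptions for illustration and may differ from the paper's Section 4.3.

```python
import itertools
import numpy as np

def async_operator(S, active):
    """One random step: active nodes apply their row of S, inactive nodes hold."""
    A = np.diag(np.asarray(active, dtype=float))
    return A @ S + (np.eye(S.shape[0]) - A)

# Hypothetical 4-node shift operator and per-node activation probabilities.
S = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.3, 0.4, 0.3, 0.0],
              [0.0, 0.3, 0.4, 0.3],
              [0.0, 0.0, 0.5, 0.5]])
p = np.array([0.3, 0.5, 0.7, 0.9])

# Exact expectation by enumerating all 2^4 activation patterns.
mean_operator = np.zeros_like(S)
for pattern in itertools.product([0, 1], repeat=4):
    a = np.array(pattern)
    prob = np.prod(np.where(a == 1, p, 1.0 - p))
    mean_operator += prob * async_operator(S, a)

closed_form = np.diag(p) @ S + (np.eye(4) - np.diag(p))
one_draw = async_operator(S, [1, 0, 0, 1])

print("expectation matches closed form:", np.allclose(mean_operator, closed_form))
print("single draw matches closed form:", np.allclose(one_draw, closed_form))
```

Coefficients tuned against `closed_form` depend only on the activation statistics, which is why a node would not need to observe its neighbours' instantaneous pattern. Any single realization uses a different operator, so a statement about the mean operator does not by itself bound the error on a given sample path.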
minor comments (2)
  1. [§2] The quantization noise model is introduced as additive and independent, but the paper does not state whether the quantizer is uniform, dithered, or mid-rise; this detail affects the validity of the error-feedback optimality equations.
  2. [Figure 3] Figure 3 caption and surrounding text use 'error floor' without defining the metric (e.g., steady-state MSE or relative error); consistency with the theoretical expressions in Eq. (12) would improve clarity.
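
For reference, the quantizer variants named in minor comment 1 can be written as follows. This is a generic sketch of textbook definitions; which of them the paper actually uses is exactly what the comment says is unstated. The function names are hypothetical.

```python
import numpy as np

def mid_tread(x, delta):
    """Uniform quantizer with a reconstruction level at zero."""
    return delta * np.round(x / delta)

def mid_rise(x, delta):
    """Uniform quantizer with a decision threshold at zero."""
    return delta * (np.floor(x / delta) + 0.5)

def subtractive_dither(x, delta, rng):
    """Mid-tread quantizer with uniform dither added before and removed after.

    Removing the dither at the receiver assumes both ends share the same
    random sequence, which is an extra requirement on the system.
    """
    v = rng.uniform(-delta / 2, delta / 2, size=np.shape(x))
    return mid_tread(x + v, delta) - v

delta = 0.25
x = np.linspace(-1.0, 1.0, 2001)
rng = np.random.default_rng(0)

for name, q in [("mid-tread", mid_tread(x, delta)),
                ("mid-rise", mid_rise(x, delta)),
                ("subtractive dither", subtractive_dither(x, delta, rng))]:
    print(f"{name:20s} max |error| = {np.max(np.abs(x - q)):.4f}")
```

All three keep the error within half a step, but they differ in how the error relates to the signal, which is what the additive, independent noise model depends on.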

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major comment below and indicate the corresponding revisions to the manuscript.

point-by-point responses
  1. Referee: [§4.3] §4.3 (random node-asynchronous updates): the derivation of optimal feedback coefficients assumes each node can locally isolate the precise quantization error e = x − Q(x) and inject it to achieve exact global cancellation in the subsequent iteration. However, the asynchronous activation schedule couples the effective filtering operator to which nodes transmit at each step; without the feedback term explicitly depending on the (unobserved) activation pattern of neighbors, the claimed exactness guarantee does not follow from the local information available at each node.

    Authors: We thank the referee for this observation. In Section 4.3 the asynchronous updates are modeled via independent Bernoulli activation variables whose statistics are known locally. The optimal feedback coefficients are derived by minimizing the expected quantization-noise variance in the subsequent global state, where the expectation is taken over the random activation pattern. Each node computes its local quantization error from the value it transmits and applies the (scalar) feedback coefficient to that error before transmission. Substituting the feedback term into the update equation shows that the expected noise contribution vanishes exactly, without any node needing to observe the instantaneous activation pattern of its neighbors. This is the standard notion of exact compensation for random asynchronous algorithms. We will add a clarifying paragraph in the revised Section 4.3 stating that the guarantee holds in expectation over the activation schedule and briefly contrast it with a path-wise guarantee.
    revision: partial

  2. Referee: [§3–§4] §3–§4 (closed-form derivations): the abstract and introduction assert 'rigorous theoretical analysis' yielding closed-form optimal coefficients, yet the main text provides only the final expressions without the intermediate steps that isolate the quantization noise term and solve the resulting quadratic optimization. This omission prevents verification that the noise-reduction claim remains exact rather than approximate once the three scenario-specific constraints are imposed.

    Authors: We agree that the intermediate algebraic steps were omitted for conciseness. In the revised manuscript we will expand Sections 3 and 4 to include the full derivation for each scenario: (i) write the quantized recursion, (ii) substitute the linear error-feedback term, (iii) collect the effective noise term, (iv) form the quadratic objective in the feedback coefficients, and (v) solve the resulting linear system to obtain the closed-form expressions. These steps will make explicit that the compensation is exact (the noise term is driven to zero) once the scenario-specific constraints are imposed.
    revision: yes
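
The five steps listed in the second response can be sketched on a deliberately simplified model. The block below assumes a first-order recursion with shift operator S whose spectral radius is below 1, node-local (diagonal) feedback D, and zero-mean white quantization error of variance σ². These are illustrative assumptions, and the symbols are not taken from the paper.

```latex
% Illustrative model only; not the paper's equations.
\begin{align*}
\text{(i)}\quad & \tilde{x}_{t+1} = S\,Q(\tilde{x}_t) + u_t, \qquad e_t = \tilde{x}_t - Q(\tilde{x}_t) \\
\text{(ii)}\quad & \tilde{x}_{t+1} = S\,Q(\tilde{x}_t) + D\,e_t + u_t, \qquad D = \operatorname{diag}(d_1,\dots,d_N) \\
\text{(iii)}\quad & \tilde{x}_{t+1} = S\,\tilde{x}_t + u_t - (S - D)\,e_t \\
\text{(iv)}\quad & J(D) = \sigma^2 \operatorname{tr}\!\big[(S-D)^{\top} W (S-D)\big], \qquad W = \sum_{k \ge 0} (S^k)^{\top} S^k \\
\text{(v)}\quad & \frac{\partial J}{\partial d_i} = -2\sigma^2\,[W(S-D)]_{ii} = 0 \;\Longrightarrow\; d_i^{\star} = \frac{[WS]_{ii}}{[W]_{ii}}
\end{align*}
```

Step (iii) follows from substituting Q(x̃ₜ) = x̃ₜ − eₜ into (ii). In this simplified model the residual term −(S − D)eₜ is zero only when D = S, so whether the compensation is exact or merely optimal hinges on what structure D is allowed to have, which is the detail the expanded derivations would need to make explicit.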

Circularity Check

0 steps flagged

No circularity: closed-form derivations are independent of results

full rationale

The paper adapts error spectrum shaping from state-space digital filters to graph filtering and derives closed-form optimal error feedback coefficients for three scenarios using standard quantization noise models and graph signal processing. These derivations rely on explicit update rules and noise isolation assumptions that are stated upfront rather than fitted or defined in terms of the final performance claims. No self-citation chains, ansatzes smuggled via prior work, or predictions that reduce to input fits by construction are present. The analysis remains self-contained against external benchmarks in digital filtering and graph theory.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework rests on adapting error spectrum shaping assumptions to graph domains, with no explicit free parameters or new entities stated. It relies on the standard modeling of quantization as additive noise that can be exactly tracked.

axioms (1)
  • Domain assumption: Quantization noise can be isolated and quantitatively fed back for exact compensation in the graph filtering update process.
    This is the core premise enabling the framework, drawn from state-space digital filter techniques and applied to the three graph scenarios.
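
The additive-noise modeling that this axiom leans on can be probed numerically. The sketch below assumes a mid-tread uniform quantizer and a Gaussian test signal, neither of which is specified by the source, and checks the usual properties: near-zero mean, variance close to Δ²/12, and low correlation with the signal when the step is fine. It also shows the model breaking down for a coarse step.

```python
import numpy as np

def quantization_error(x, delta):
    """Error of an assumed uniform mid-tread quantizer with step `delta`."""
    return x - delta * np.round(x / delta)

rng = np.random.default_rng(0)
x = rng.normal(size=200_000)

# Fine step: the error behaves like independent uniform noise.
delta_fine = 0.1
e_fine = quantization_error(x, delta_fine)
var_ratio = e_fine.var() / (delta_fine ** 2 / 12)
corr_fine = np.corrcoef(x, e_fine)[0, 1]
print("fine step:   mean", e_fine.mean(), " var ratio", var_ratio, " corr", corr_fine)

# Coarse step: almost every sample rounds to zero, so the error is the signal itself.
delta_coarse = 8.0
e_coarse = quantization_error(x, delta_coarse)
corr_coarse = np.corrcoef(x, e_coarse)[0, 1]
print("coarse step: corr", corr_coarse)
```

The error itself is always known exactly to the node that quantizes, whatever the step. What degrades at low resolution is the statistical model used to choose the feedback coefficients.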

pith-pipeline@v0.9.0 · 5711 in / 1211 out tokens · 55184 ms · 2026-05-19T11:15:22.920275+00:00 · methodology


