arxiv: 2605.21915 · v1 · pith:HIKO4UNBnew · submitted 2026-05-21 · 💻 cs.CR · cs.LG

CCLab: Adversarial Testing of Learning- and Non-Learning-Based Congestion Controllers

Pith reviewed 2026-05-22 06:08 UTC · model grok-4.3

classification 💻 cs.CR cs.LG
keywords: congestion control, adversarial testing, reinforcement learning, network robustness, learning-based controllers, adversarial robustness, RL adversary

The pith

Learning-based congestion controllers prove more robust than traditional ones when facing adversarial perturbations to inputs or network conditions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CCLab, a closed-loop framework that pairs a reinforcement learning adversary with any congestion controller to generate bounded, realistic perturbations either on the controller's observed signals or on the external network environment. Testing shows that performance falls for every controller under these attacks, yet learning-based controllers retain higher throughput and lower latency than human-designed algorithms on average. The same adversarial traces can then be used to retrain controllers, yielding versions that improve on both normal and adversarial workloads. This line of evaluation matters because real networks routinely encounter corrupted measurements, changing loads, and potentially malicious interference.

Core claim

An RL-based adversarial agent, constrained to keep perturbations realistic and bounded, can be run in closed loop with a congestion-control policy to produce systematic feature-level or environment-level attacks. When this agent is applied to both learning-based and traditional controllers, the learning-based group exhibits smaller performance drops; moreover, retraining any controller on the resulting adversarial traces produces policies that outperform prior learning-based controllers under both attack and standard conditions.

What carries the argument

The RL-based adversarial agent that generates bounded perturbations on input signals or external network conditions while enforcing explicit realism constraints.
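
The closed-loop, feature-level setup can be pictured with a toy sketch. Everything in it is an assumption made for illustration and is not CCLab's implementation: the delay-threshold controller, the single-bottleneck queue model, the `budget` parameter, and the hand-written `toy_adversary` that stands in for the trained RL policy.

```python
def clip(value, low, high):
    return max(low, min(high, value))


def perturb_rtt(true_rtt, action, budget):
    """Scale the observed RTT by at most +/- budget (0.05 means 5%)."""
    return true_rtt * (1.0 + budget * clip(action, -1.0, 1.0))


def toy_adversary(cwnd, capacity):
    """Hypothetical stand-in for the RL policy: inflate the RTT while the
    link is underused, deflate it once the window reaches capacity."""
    return 1.0 if cwnd < capacity else -1.0


def toy_controller(cwnd, observed_rtt, base_rtt):
    """Hypothetical delay-based rule: halve on inflated RTT, else add one."""
    if observed_rtt > 1.25 * base_rtt:
        return max(1.0, cwnd / 2.0)
    return cwnd + 1.0


def run_episode(budget, steps=200):
    base_rtt, capacity = 0.05, 50.0  # seconds, packets per RTT (made up)
    cwnd, delivered = 10.0, 0.0
    for _ in range(steps):
        queue = max(0.0, cwnd - capacity)
        true_rtt = base_rtt * (1.0 + queue / capacity)
        action = toy_adversary(cwnd, capacity)            # adversary sees state
        observed = perturb_rtt(true_rtt, action, budget)  # bounded corruption
        delivered += min(cwnd, capacity)
        cwnd = toy_controller(cwnd, observed, base_rtt)   # controller reacts
    return delivered / steps  # mean packets delivered per RTT


if __name__ == "__main__":
    for budget in (0.0, 0.05, 0.5):
        print(budget, round(run_episode(budget), 2))
```

In this toy setting a 5% budget never triggers a spurious back-off, while a 50% budget keeps the window pinned near its floor. The perturbation bound is therefore a quantity worth reporting next to any robustness number.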

If this is right

  • Both learning-based and non-learning-based congestion controllers suffer measurable performance loss under the generated adversarial conditions.
  • Learning-based controllers, on average, degrade less than traditional human-designed algorithms.
  • Controllers retrained on the adversarial traces outperform existing learning-based controllers in both challenging and normal network settings.
  • Closed-loop adversarial evaluation can surface vulnerabilities that standard benchmarks miss.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Networks operating in environments with frequent signal corruption or load variation may benefit from preferring learning-based controllers.
  • Adversarial trace generation could be added as a routine step when certifying new congestion controllers for deployment.
  • The same closed-loop testing pattern may transfer to robustness evaluation of other feedback-based network mechanisms such as routing or traffic shaping.

Load-bearing premise

The bounded perturbations produced by the RL agent remain realistic enough to reflect plausible real-world noise or interference without violating the underlying network dynamics.
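
One way to picture an explicit realism constraint on environment-level perturbations is a projection step applied to whatever trace the adversary proposes. The sketch below is an assumption for illustration only: the paper summary does not state the constraint form, and the per-step change limit `max_step`, the function name, and the units are hypothetical.

```python
def constrain_trace(raw, low, high, max_step):
    """Project a proposed bandwidth trace onto two simple constraints:
    every value stays in [low, high], and consecutive values differ by
    at most max_step."""
    if not raw:
        return []
    trace = [max(low, min(high, raw[0]))]
    for proposed in raw[1:]:
        prev = trace[-1]
        # Limit the change relative to the previous accepted value.
        bounded = max(prev - max_step, min(prev + max_step, proposed))
        # Then keep the value inside the allowed range.
        trace.append(max(low, min(high, bounded)))
    return trace


if __name__ == "__main__":
    # Made-up proposal in Mbps that swings far outside the allowed range.
    print(constrain_trace([0, 100, 0], low=1, high=50, max_step=10))
```

Because the previous accepted value is already inside the range, the final range clamp can only move a value closer to it, so both constraints hold on the output.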

What would settle it

Replicating the exact adversarial test suite on a physical testbed with a fresh collection of learning-based and traditional controllers and observing that traditional controllers suffer smaller throughput or latency degradation would falsify the robustness ordering.

Figures

Figures reproduced from arXiv: 2605.21915 by Brighten Godfrey, Chenkai Wang, Gang Wang, Shehab Sarar Ahmed, Zhi Chen.

Figure 1: CCLab: our adversarial testing framework for the ...
Figure 2: Feature-level adversarial testing results with 50% ...
Figure 3: Feature-level adversarial testing results with 5% adver...
Figure 4: Feature-level adversarial testing results with 50% ...
Figure 5: Example of an adversarially generated bandwidth trace ...
Figure 6: Environment-level adversarial testing results. We only ...
Figure 7: A comparison example between Cubic’s egress and ...
Figure 8: A comparison example between Orca’s egress and ...
Figure 9: Visualization of one testing case of TCP LP.
Figure 10: Replaying the TCP LP adversarial trace on Orca.
Original abstract

Congestion controllers (CCs) are critical to network performance, and yet their robustness under adverse conditions remains insufficiently understood. While recent learning-based CCs have demonstrated strong performance in controlled environments, it is unclear how they compare to traditional CCs when controllers' input signals are corrupted or when environmental conditions become systematically challenging. In this paper, we introduce CCLab, an adversarial testing framework for systematically evaluating the robustness of both learning-based and non-learning-based CCs. CCLab includes a reinforcement learning (RL)-based adversarial agent that operates in a closed loop with the congestion control policy, generating bounded perturbations either on input signals (feature-level) or on external network conditions (environment-level), while preserving realism through explicit constraints. Using this framework, we compare learning-based CCs with non-learning-based CCs under both feature-level and environment-level adversarial conditions. While both types of CCs suffer from performance degradation under adversarial testing, we find that learning-based CCs, in general, are more robust than traditional human-designed algorithms. Finally, we show that our adversarial traces can be used to train more robust CCs that outperform existing learning-based CCs under both challenging and normal conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces CCLab, an adversarial testing framework for congestion controllers (CCs) that employs a reinforcement learning-based adversarial agent operating in closed loop. The agent generates bounded perturbations either on input signals (feature-level) or external network conditions (environment-level), subject to explicit constraints intended to preserve realism. Through this framework the authors compare learning-based and non-learning-based CCs, report that both families degrade under attack but that learning-based controllers are generally more robust, and demonstrate that the generated adversarial traces can be used to retrain CCs that outperform prior learning-based designs under both adversarial and normal conditions.

Significance. If the fairness and realism of the perturbations are established, the work would be significant for network protocol design by supplying a systematic, reproducible method for robustness evaluation of CCs and a concrete technique for hardening learning-based controllers. The closed-loop RL adversary and the dual feature/environment attack surfaces constitute clear technical contributions.

major comments (2)
  1. §4.2 (Environment-Level Perturbations): The explicit constraints on allowable ranges for delay, loss, and bandwidth are load-bearing for the headline robustness comparison. Without a quantitative validation (e.g., overlap statistics with real traces or a sensitivity analysis) showing that these ranges do not extend into regimes outside the support of traditional CC design assumptions, the observed advantage for learning-based controllers could arise from distributional mismatch rather than intrinsic robustness differences.
  2. §5.1 and Table 3: The performance-degradation tables report point estimates without variance across random seeds or statistical significance tests. Because the central claim is a general ordering between two families of controllers, the absence of these measures leaves open whether the reported gaps are robust or sensitive to particular adversarial-agent initializations.
minor comments (2)
  1. Abstract: the qualifier 'in general' is imprecise; the manuscript should state the precise conditions (network scenarios, attack budgets, CC implementations) under which the robustness ordering holds.
  2. Figure 2: axis labels and legend entries are too small for readability; enlarge or add a supplementary high-resolution version.
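
Major comment 1 asks for overlap statistics between the perturbed environments and real traces. A minimal sketch of two such statistics follows. It was written for this summary and is not taken from the paper. It computes the 1-D Wasserstein-1 distance between two equal-size samples and the fraction of generated values that fall inside the range of a reference sample. The function names and the bandwidth numbers are illustrative assumptions.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    With equal sizes and uniform weights, the optimal coupling matches
    sorted values, so the distance is the mean absolute difference
    between order statistics."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("expects two non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def support_overlap(generated, reference):
    """Fraction of generated values inside [min(reference), max(reference)]."""
    lo, hi = min(reference), max(reference)
    return sum(lo <= g <= hi for g in generated) / len(generated)


if __name__ == "__main__":
    # Made-up bandwidth samples in Mbps, for illustration only.
    real = [12.0, 15.0, 18.0, 20.0]
    adversarial = [2.0, 14.0, 19.0, 40.0]
    print(wasserstein_1d(real, adversarial))
    print(support_overlap(adversarial, real))
```

A small distance and an overlap near 1 would suggest the adversarial conditions stay close to the reference traces. Neither statistic alone shows that the temporal structure of a trace is realistic.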

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our robustness claims. We address each major comment below and describe the revisions we will make to strengthen the manuscript.

Point-by-point responses
  1. Referee: §4.2 (Environment-Level Perturbations): The explicit constraints on allowable ranges for delay, loss, and bandwidth are load-bearing for the headline robustness comparison. Without a quantitative validation (e.g., overlap statistics with real traces or a sensitivity analysis) showing that these ranges do not extend into regimes outside the support of traditional CC design assumptions, the observed advantage for learning-based controllers could arise from distributional mismatch rather than intrinsic robustness differences.

    Authors: We agree that quantitative validation of the perturbation ranges is necessary to support the claim that robustness differences are intrinsic. The ranges were derived from commonly cited real-network variation bounds in the CC literature, but we did not include direct distributional comparisons in the original submission. In the revision we will add overlap statistics (e.g., Wasserstein distance or support overlap) between the perturbed environments and publicly available real traces (CAIDA, M-Lab), together with a sensitivity analysis that varies the constraint bounds and checks whether the reported ordering is preserved. These additions will be placed in §4.2 and a new appendix.

    Revision: yes

  2. Referee: §5.1 and Table 3: The performance-degradation tables report point estimates without variance across random seeds or statistical significance tests. Because the central claim is a general ordering between two families of controllers, the absence of these measures leaves open whether the reported gaps are robust or sensitive to particular adversarial-agent initializations.

    Authors: We concur that variance estimates and significance tests are required to substantiate the family-level ordering. The original experiments used single runs for each controller-adversary pair. In the revised manuscript we will re-execute the evaluation suite over at least five independent random seeds for both the RL adversary and the CC policies, report means with standard deviations in Table 3, and include paired statistical tests (Wilcoxon signed-rank) with p-values to test whether the performance gaps between learning-based and non-learning-based controllers remain significant under both attack types.

    Revision: yes
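
The per-seed reporting promised in response 2 can be sketched with the standard library alone. The sketch is an illustration under assumptions, not the authors' analysis code: it assumes results are paired by seed, the helper names are hypothetical, the numbers are made up, and the exact test enumerates all sign assignments, which is only practical for a small number of pairs.

```python
from itertools import product
from statistics import mean, stdev


def summarize(values):
    """Mean and sample standard deviation across seeds."""
    return mean(values), stdev(values)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def wilcoxon_exact(xs, ys):
    """Two-sided exact Wilcoxon signed-rank p-value for paired samples."""
    diffs = [x - y for x, y in zip(xs, ys) if x != y]  # drop zero differences
    if not diffs:
        return 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    hits = 0
    # Under the null hypothesis every sign pattern is equally likely.
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** len(ranks)


if __name__ == "__main__":
    # Made-up throughput retention (%) per seed, for illustration only.
    learned = [82.0, 79.5, 85.1, 80.3, 83.7]
    classic = [70.2, 74.8, 69.9, 72.5, 71.0]
    print(summarize(learned), summarize(classic))
    print(wilcoxon_exact(learned, classic))
```

With five pairs the smallest two-sided exact p-value is 2/32 = 0.0625, reached when every difference has the same sign. Crossing the usual 0.05 threshold with this test therefore needs at least six paired runs.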

Circularity Check

0 steps flagged

No significant circularity; empirical framework is self-contained

The paper introduces CCLab as an RL-based adversarial testing framework that applies bounded perturbations under explicit realism constraints, then reports comparative robustness results between learning-based and traditional CCs from those experiments. No derivation chain reduces a claimed prediction or uniqueness result to a fitted parameter or self-citation by construction. The central finding (learning-based CCs degrade less) is presented as an observed outcome of the closed-loop tests rather than a tautological restatement of inputs. The framework relies on external network models and standard RL training, with no load-bearing self-referential definitions or ansatz smuggling visible in the abstract or described structure. This is a standard empirical evaluation paper whose results stand or fall on the experimental setup and data, not on internal redefinition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the assumption that the generated perturbations remain realistic and bounded; no free parameters or invented physical entities are described in the abstract.

axioms (1)
  • Domain assumption: Perturbations preserve realism through explicit constraints
    Stated directly in the abstract as the operating principle of the adversarial agent.
invented entities (1)
  • RL-based adversarial agent (no independent evidence)
    Purpose: To generate bounded perturbations on input signals or network conditions in closed loop with the CC policy
    New component introduced as part of the CCLab framework.

APPENDIX A. Feature-Level Manipulation: Real-world Traces

To validate the generality of our findings beyond Canopy simulated traces, we repeat the feature-level adversarial experiments on Canopy ...