
arxiv: 2605.17575 · v1 · pith:FV6D27COnew · submitted 2026-05-17 · 💻 cs.LG · cs.AI

UniAlign: A Model-Agnostic Framework for Robust Network Traffic Classification under Distribution Shifts

Pith reviewed 2026-05-20 13:27 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords: network traffic classification · distribution shifts · domain alignment · model ensembling · robustness · deep learning

The pith

UniAlign is a model-agnostic framework that makes deep learning network traffic classifiers more robust to distribution shifts through domain alignment and stable ensembling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to solve the drop in accuracy that network traffic classification models experience when network conditions change after training. It does so by pairing domain alignment fine-tuning, which pushes the model toward representations that stay consistent across different conditions, with stable model ensembling that combines checkpoints from flat regions of the loss surface. A sympathetic reader would care because prior robustness techniques either lock to one model family, fail on modern raw-byte inputs, or demand heavy extra training work. If the claim holds, existing supervised classifiers could be made more reliable in live networks without redesigning features or paying constant extra cost.

Core claim

UniAlign combines domain alignment fine-tuning, which encourages the learning of domain-invariant traffic representations across heterogeneous network conditions, with stable model ensembling, which enhances inference robustness by aggregating checkpoints within a flat loss region. The framework integrates into existing supervised NTC models without requiring specific feature modalities or introducing non-constant additional training costs and is tested on shifts arising from encryption schemes, data collection devices, and attack behaviors.
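The alignment idea in the core claim can be illustrated with a small sketch. It assumes, purely for illustration, that the alignment term is the squared L2 distance between per-domain mean feature vectors and that it is added to the task loss with a weight `lam`. The function names, the choice of distance, and the weight are hypothetical and are not taken from the paper, which may use a different discrepancy measure.

```python
from itertools import combinations


def domain_means(features, domains):
    """Average the feature vectors of each domain separately."""
    sums, counts = {}, {}
    for vec, dom in zip(features, domains):
        acc = sums.setdefault(dom, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[dom] = counts.get(dom, 0) + 1
    return {d: [s / counts[d] for s in sums[d]] for d in sums}


def alignment_loss(features, domains):
    """Mean squared L2 distance over all pairs of domain mean features.

    Hypothetical stand-in for a representation alignment loss; returns 0.0
    when fewer than two domains are present in the batch.
    """
    means = list(domain_means(features, domains).values())
    pairs = list(combinations(means, 2))
    if not pairs:
        return 0.0
    total = sum(
        sum((a - b) ** 2 for a, b in zip(m1, m2)) for m1, m2 in pairs
    )
    return total / len(pairs)


def total_loss(task_loss, features, domains, lam=0.1):
    """Task loss plus the weighted alignment term (lam is an assumed weight)."""
    return task_loss + lam * alignment_loss(features, domains)


# Toy batch: two samples from domain "a", one from domain "b".
feats = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
doms = ["a", "a", "b"]
print(alignment_loss(feats, doms))
```

In a real training loop the features would come from the classifier's penultimate layer and the loss would be differentiated by an autograd framework; the plain-Python version only shows the arithmetic.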

What carries the argument

The UniAlign framework, which pairs domain alignment fine-tuning for invariant representations with stable model ensembling for robust inference.

If this is right

  • Existing deep-learning NTC models can retain higher performance when encryption schemes, collection hardware, or attack patterns change after deployment.
  • The same training pipeline works across different model architectures without custom feature engineering.
  • Robustness gains arrive at lower total training time than methods built specifically for traffic classification.
  • No ongoing extra cost appears once the fine-tuning and ensembling steps are complete.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same two-step pattern of alignment followed by flat-region ensembling could be tested on other classification tasks that suffer distribution shift, such as malware detection or sensor-based activity recognition.
  • Live network traces with continuous drift would provide a stricter test than the static public datasets used here.

Load-bearing premise

The distribution shifts present in the three public datasets are representative of those encountered in real deployments, and the added steps impose only constant additional training cost.

What would settle it

The central claim would be refuted if UniAlign, applied to a fresh NTC model on a dataset whose distribution shifts are absent from the three public ones, showed no gain or even a loss relative to standard training.

Figures

Figures reproduced from arXiv: 2605.17575 by Chuyi Wang, Tongze Wang, Wenduo Wang, Xiaohui Xie, Yong Cui.

Figure 1: Overview of the UniAlign framework. The framework consists of two modules: (left) a domain alignment fine-tuning module that incorporates an additional representation alignment loss to minimize cross-domain feature discrepancies, and (right) a stable model ensembling module that merges model checkpoints located in an identified flat loss valley.
Figure 2: Illustration of a sharp minimum and a flat loss valley.
Figure 3: Visualization of feature representations produced by …
Figure 4: Average training time versus OOD accuracy of different …
Figure 5: The impact of different distance metrics on NTC …
Figure 7: The impact of label smoothing on training loss scales …
Figure 8: NTC performance of different ensembling variants …
Original abstract

Network traffic classification (NTC) models often suffer severe performance degradation when deployed in real-world environments due to distribution shifts caused by changing network conditions. Existing robustness-enhancing approaches are commonly coupled to specific model architectures or data settings, fail to generalize to state-of-the-art raw-byte-based NTC models, or incur significant training overhead. In this paper, we propose UniAlign, a novel model-agnostic framework that improves the robustness of deep learning-based NTC models under distribution shifts. UniAlign combines domain alignment fine-tuning, which encourages the learning of domain-invariant traffic representations across heterogeneous network conditions, with stable model ensembling, which enhances inference robustness by aggregating checkpoints within a flat loss region. The framework can be seamlessly integrated into existing supervised NTC models without requiring specific feature modalities or introducing non-constant additional training costs. We evaluate UniAlign on three public datasets covering diverse distribution shifts, including encryption schemes, data collection devices, and attack behaviors. Experimental results on two representative NTC models demonstrate that, compared with standard training, UniAlign improves average classification accuracy by 2.51% and average F1 score by 2.71%, outperforming the strongest baseline by 1.45% in accuracy and 1.69% in F1 score, while requiring only 12.4%–53.9% of the training time of all NTC-specific baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents UniAlign, a model-agnostic framework for enhancing the robustness of deep learning-based network traffic classification (NTC) models against distribution shifts. It combines domain alignment fine-tuning to learn domain-invariant representations and stable model ensembling to aggregate checkpoints in flat loss regions. The approach is designed to integrate seamlessly into existing supervised NTC models without requiring specific feature modalities or incurring non-constant additional training costs. Evaluations on three public datasets involving shifts from encryption schemes, collection devices, and attack behaviors show that UniAlign improves average classification accuracy by 2.51% and F1 score by 2.71% compared to standard training, outperforming the strongest baseline by 1.45% in accuracy and 1.69% in F1, while using only 12.4%–53.9% of the training time of NTC-specific baselines.

Significance. If the reported improvements are confirmed to be statistically significant and generalizable beyond the three datasets, UniAlign could provide a valuable, efficient method for making NTC models more robust to real-world variations in network conditions, addressing a practical challenge in deploying such models without architecture-specific modifications or excessive computational overhead.

major comments (3)
  1. [Abstract] The central claims of 2.51% average accuracy improvement and 2.71% F1 improvement (outperforming the strongest baseline by 1.45%/1.69%) are presented without error bars, standard deviations across runs, or details on the number of experimental repetitions and statistical tests. This undercuts the ability to assess whether the robustness gains are reliable or sensitive to random seeds and hyperparameter choices.
  2. [§4.1] The three public datasets are positioned as covering encryption schemes, collection devices, and attack behaviors, yet the manuscript does not provide evidence or discussion that these shifts adequately proxy real-world factors such as temporal drift, new protocols, or mixed adversarial patterns. If they do not, the reported gains may not transfer to deployments.
  3. [§3.2] The domain alignment fine-tuning component is described as incurring no non-constant additional training costs, but the integration details and any dependence on how shifts are realized during fine-tuning are not quantified across the two representative NTC models, leaving the model-agnostic claim partially unsupported.
minor comments (2)
  1. Add error bars or confidence intervals to all quantitative results in tables and figures to support the average improvement claims.
  2. Clarify the exact protocol for stable model ensembling, including how checkpoints are selected within the flat loss region.
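To make the second minor comment concrete, here is one possible checkpoint-selection protocol, given as a sketch and not as the authors' procedure. It assumes a validation loss is recorded per checkpoint, treats the "flat region" as the contiguous run of checkpoints around the best one whose loss stays within a tolerance `tol` of the minimum, and merges them by uniform weight averaging. The tolerance, the contiguity rule, and the averaging scheme are all assumptions made for illustration.

```python
def select_flat_checkpoints(val_losses, tol=0.05):
    """Indices of the contiguous run around the best checkpoint whose
    validation loss stays within `tol` of the minimum (hypothetical rule)."""
    best = min(range(len(val_losses)), key=val_losses.__getitem__)
    limit = val_losses[best] + tol
    lo = hi = best
    # Grow the window to the left, then to the right, while still "flat".
    while lo > 0 and val_losses[lo - 1] <= limit:
        lo -= 1
    while hi < len(val_losses) - 1 and val_losses[hi + 1] <= limit:
        hi += 1
    return list(range(lo, hi + 1))


def average_weights(checkpoints):
    """Uniformly average parameter dicts of the form name -> list of floats."""
    n = len(checkpoints)
    first = checkpoints[0]
    return {
        name: [
            sum(ckpt[name][i] for ckpt in checkpoints) / n
            for i in range(len(first[name]))
        ]
        for name in first
    }


# Toy run: six checkpoints with made-up validation losses and weights.
losses = [0.90, 0.52, 0.50, 0.53, 0.80, 0.51]
ckpts = [{"w": [float(i), float(i) + 1.0]} for i in range(len(losses))]
chosen = select_flat_checkpoints(losses)
merged = average_weights([ckpts[i] for i in chosen])
print(chosen, merged)
```

Note that the last checkpoint in the toy run has a low loss but is excluded, because it is separated from the best checkpoint by a high-loss step; whether the paper applies a contiguity requirement of this kind is exactly what the comment asks the authors to state.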

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their insightful comments, which have helped us improve the manuscript. We address each major comment below.

  1. Referee: [Abstract] The central claims of 2.51% average accuracy improvement and 2.71% F1 improvement (outperforming the strongest baseline by 1.45%/1.69%) are presented without error bars, standard deviations across runs, or details on the number of experimental repetitions and statistical tests. This undercuts the ability to assess whether the robustness gains are reliable or sensitive to random seeds and hyperparameter choices.

    Authors: We agree with the referee that providing statistical details strengthens the claims. Our experiments were repeated 5 times with different random seeds, and we will update the abstract and relevant sections to include standard deviations as error bars. We will also report the number of repetitions and include statistical significance tests (e.g., paired t-test results) to demonstrate that the improvements are reliable.
    Revision: yes

  2. Referee: [§4.1] The three public datasets are positioned as covering encryption schemes, collection devices, and attack behaviors, yet the manuscript does not provide evidence or discussion that these shifts adequately proxy real-world factors such as temporal drift, new protocols, or mixed adversarial patterns. If they do not, the reported gains may not transfer to deployments.

    Authors: The three datasets were selected to represent key types of distribution shifts in NTC as identified in prior literature. However, we recognize that they may not fully capture all real-world aspects such as long-term temporal drift or novel protocols. In the revision, we will add a dedicated paragraph in §4.1 discussing the scope of these shifts and their relation to real-world conditions, including potential limitations in generalizability.
    Revision: partial

  3. Referee: [§3.2] The domain alignment fine-tuning component is described as incurring no non-constant additional training costs, but the integration details and any dependence on how shifts are realized during fine-tuning are not quantified across the two representative NTC models, leaving the model-agnostic claim partially unsupported.

    Authors: We appreciate this observation. The domain alignment fine-tuning adds a constant overhead independent of the model architecture by using a domain classifier on top of the existing features. We will revise §3.2 to provide explicit integration steps for the two NTC models and quantify the additional training time, which is constant and minimal (less than 5% overhead), confirming the model-agnostic nature without dependence on specific shift realizations beyond domain labels.
    Revision: yes
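The paired t-test promised in the first response can be sketched as follows. The accuracy values are invented for illustration only and are not results from the paper; the sketch assumes five seeds, with the same seed used for standard training and for UniAlign so that the runs can be paired. The critical value 2.776 is the standard two-sided 5% threshold for a t distribution with 4 degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(baseline, treated):
    """Return (t, degrees_of_freedom) for paired per-seed scores."""
    if len(baseline) != len(treated) or len(baseline) < 2:
        raise ValueError("need two equally long samples with at least 2 runs")
    diffs = [t - b for b, t in zip(baseline, treated)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical per-seed accuracies (percent); NOT taken from the paper.
standard = [80.0, 81.0, 79.0, 80.5, 79.5]
unialign = [82.0, 83.5, 81.0, 83.0, 82.5]

T_CRIT_DF4 = 2.776  # two-sided 5% critical value, 4 degrees of freedom
t_stat, dof = paired_t_statistic(standard, unialign)
significant = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.2f}, dof = {dof}, significant at 5%: {significant}")
```

Pairing by seed matters here: it removes seed-to-seed variance shared by both methods, so a small but consistent gain can be detected with only five runs.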

Circularity Check

0 steps flagged

No circularity: empirical gains from direct dataset comparisons


The paper presents UniAlign as a model-agnostic combination of domain alignment fine-tuning and stable model ensembling, then reports measured accuracy/F1 lifts (2.51%/2.71% average) and training-time reductions on three public datasets against baselines. These are straightforward experimental outcomes, not quantities obtained by fitting parameters to the target metric or by any self-referential derivation. No equations, uniqueness theorems, or ansatzes are invoked that collapse back to the inputs; the central claims rest on observable performance deltas rather than on any load-bearing self-citation or definitional loop.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the framework rests on standard supervised learning assumptions and the representativeness of the chosen public datasets.

pith-pipeline@v0.9.0 · 5792 in / 1066 out tokens · 41587 ms · 2026-05-20T13:27:39.702255+00:00 · methodology



Reference graph

Works this paper leans on

72 extracted references · 72 canonical work pages · 6 internal anchors

  1. [1]

    Streaming video qoe modeling and prediction: A long short-term memory approach,

    N. Eswara, S. Ashique, A. Panchbhai, S. Chakraborty, H. P. Sethuram, K. Kuchi, A. Kumar, and S. S. Channappayya, “Streaming video qoe modeling and prediction: A long short-term memory approach,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 3, pp. 661–673, 2019

  2. [2]

    Ptu: Pre-trained model for network traffic understanding,

    L. Peng, X. Xie, S. Huang, Z. Wang, and Y . Cui, “Ptu: Pre-trained model for network traffic understanding,” in2024 IEEE 32nd International Conference on Network Protocols (ICNP). IEEE, 2024, pp. 1–12

  3. [3]

    Realtime robust malicious traffic detection via frequency domain analysis,

    C. Fu, Q. Li, M. Shen, and K. Xu, “Realtime robust malicious traffic detection via frequency domain analysis,” inProceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 2021, pp. 3431–3446

  4. [4]

    Flowlens: Enabling efficient flow classification for ml- based network security applications

    D. Barradas, N. Santos, L. Rodrigues, S. Signorello, F. M. Ramos, and A. Madeira, “Flowlens: Enabling efficient flow classification for ml- based network security applications.” inNDSS, 2021

  5. [5]

    On the effectiveness of machine and deep learning for cyber security,

    G. Apruzzese, M. Colajanni, L. Ferretti, A. Guido, and M. Marchetti, “On the effectiveness of machine and deep learning for cyber security,” in2018 10th international conference on cyber Conflict (CyCon). IEEE, 2018, pp. 371–390

  6. [6]

    Appsniffer: Towards robust mobile app fingerprinting against vpn,

    S. Oh, M. Lee, H. Lee, E. Bertino, and H. Kim, “Appsniffer: Towards robust mobile app fingerprinting against vpn,” inProceedings of the ACM Web Conference 2023, 2023, pp. 2318–2328

  7. [7]

    Fingerprinting obfuscated proxy traffic with encapsulated{TLS}handshakes,

    D. Xue, M. Kallitsis, A. Houmansadr, and R. Ensafi, “Fingerprinting obfuscated proxy traffic with encapsulated{TLS}handshakes,” in33rd USENIX Security Symposium (USENIX Security 24), 2024, pp. 2689– 2706

  8. [8]

    k-fingerprinting: A robust scalable web- site fingerprinting technique,

    J. Hayes and G. Danezis, “k-fingerprinting: A robust scalable web- site fingerprinting technique,” in25th USENIX Security Symposium (USENIX Security 16), 2016, pp. 1187–1203

  9. [9]

    Robust smart- phone app identification via encrypted network traffic analysis,

    V . F. Taylor, R. Spolaor, M. Conti, and I. Martinovic, “Robust smart- phone app identification via encrypted network traffic analysis,”IEEE Transactions on Information Forensics and Security, vol. 13, no. 1, pp. 63–78, 2017

  10. [10]

    Fs-net: A flow sequence network for encrypted traffic classification,

    C. Liu, L. He, G. Xiong, Z. Cao, and Z. Li, “Fs-net: A flow sequence network for encrypted traffic classification,” inIEEE INFOCOM 2019- IEEE Conference On Computer Communications. IEEE, 2019, pp. 1171–1179

  11. [11]

    Flowprint: Semi-supervised mobile-app fingerprinting on encrypted network traf- fic,

    T. Van Ede, R. Bortolameotti, A. Continella, J. Ren, D. J. Dubois, M. Lindorfer, D. Choffnes, M. Van Steen, and A. Peter, “Flowprint: Semi-supervised mobile-app fingerprinting on encrypted network traf- fic,” inNetwork and distributed system security symposium (NDSS), vol. 27, 2020

  12. [12]

    Deep packet: A novel approach for encrypted traffic classification using deep learning,

    M. Lotfollahi, M. Jafari Siavoshani, R. Shirali Hossein Zade, and M. Saberian, “Deep packet: A novel approach for encrypted traffic classification using deep learning,”Soft Computing, vol. 24, no. 3, pp. 1999–2012, 2020

  13. [13]

    Et-bert: A contextualized datagram representation with pre-training transformers for encrypted traffic classification,

    X. Lin, G. Xiong, G. Gou, Z. Li, J. Shi, and J. Yu, “Et-bert: A contextualized datagram representation with pre-training transformers for encrypted traffic classification,” inProceedings of the ACM Web Conference 2022, 2022, pp. 633–642

  14. [14]

    Yet another traffic classifier: A masked autoencoder based traffic transformer with multi-level flow representation,

    R. Zhao, M. Zhan, X. Deng, Y . Wang, Y . Wang, G. Gui, and Z. Xue, “Yet another traffic classifier: A masked autoencoder based traffic transformer with multi-level flow representation,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 4, 2023, pp. 5420–5427

  15. [15]

    Netmamba: Efficient network traffic classification via pre-training unidirectional mamba,

    T. Wang, X. Xie, W. Wang, C. Wang, Y . Zhao, and Y . Cui, “Netmamba: Efficient network traffic classification via pre-training unidirectional mamba,” in2024 IEEE 32nd International Conference on Network Protocols (ICNP). IEEE, 2024, pp. 1–11

  16. [16]

    Trafficformer: an efficient pre-trained model for traffic data,

    G. Zhou, X. Guo, Z. Liu, T. Li, Q. Li, and K. Xu, “Trafficformer: an efficient pre-trained model for traffic data,” in2025 IEEE Symposium on Security and Privacy (SP). IEEE, 2025, pp. 1844–1860

  17. [17]

    Mm4flow: A pre-trained multi-modal model for versatile network traffic analysis,

    L. Yang, L. Liu, J. Huang, Z. Liu, S. Liang, S. Fu, and Y . Wang, “Mm4flow: A pre-trained multi-modal model for versatile network traffic analysis,” inProceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, 2025, pp. 1664–1678

  18. [18]

    The sweet danger of sugar: Debunking representation learning for encrypted traffic classification,

    Y . Zhao, G. Dettori, M. Boffa, L. Vassio, and M. Mellia, “The sweet danger of sugar: Debunking representation learning for encrypted traffic classification,” inProceedings of the ACM SIGCOMM 2025 Conference, 2025, pp. 296–310

  19. [19]

    Cd-net: Robust mobile traffic classification against apps updating,

    Y . Chen, B. Hou, B. Wu, and H. Hu, “Cd-net: Robust mobile traffic classification against apps updating,”Computers & Security, vol. 150, p. 104214, 2025

  20. [20]

    Fg-sat: Efficient flow graph for encrypted traffic classification under environment shifts,

    S. Cui, X. Han, D. Han, Z. Wang, W. Wang, B. Jiang, B. Liu, and Z. Lu, “Fg-sat: Efficient flow graph for encrypted traffic classification under environment shifts,”IEEE Transactions on Information Forensics and Security, 2025

  21. [21]

    Respond to change with constancy: Instruction-tuning with llm for non-iid network traffic classification,

    X. Lin, G. Xiong, G. Gou, W. Dong, J. Yu, Z. Li, and W. Xia, “Respond to change with constancy: Instruction-tuning with llm for non-iid network traffic classification,”IEEE Transactions on Information Forensics and Security, 2025

  22. [22]

    Realistic website fingerprinting by augmenting network traces,

    A. Bahramali, A. Bozorgi, and A. Houmansadr, “Realistic website fingerprinting by augmenting network traces,” inProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023, pp. 1035–1049

  23. [23]

    Rosetta: Enabling robust tls encrypted traffic classification in diverse network environments with tcp-aware traffic augmentation,

    R. Xie, Y . Wang, J. Cao, E. Dong, M. Xu, K. Sun, Q. Li, L. Shen, and M. Zhang, “Rosetta: Enabling robust tls encrypted traffic classification in diverse network environments with tcp-aware traffic augmentation,” inProceedings of the ACM turing award celebration conference-China 2023, 2023, pp. 131–132

  24. [24]

    Training robust classifiers for classifying encrypted traffic under dynamic network conditions,

    Y . Qing, Q. Yin, X. Deng, X. Zhang, P. Li, Z. Liu, K. Sun, K. Xu, and Q. Li, “Training robust classifiers for classifying encrypted traffic under dynamic network conditions,” inProceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, 2025, pp. 3564–3578

  25. [25]

    Domain general- ization: A survey,

    K. Zhou, Z. Liu, Y . Qiao, T. Xiang, and C. C. Loy, “Domain general- ization: A survey,”IEEE transactions on pattern analysis and machine intelligence, vol. 45, no. 4, pp. 4396–4415, 2022

  26. [26]

    Deep fingerprinting: Undermining website fingerprinting defenses with deep learning,

    P. Sirinam, M. Imani, M. Juarez, and M. Wright, “Deep fingerprinting: Undermining website fingerprinting defenses with deep learning,” in Proceedings of the 2018 ACM SIGSAC conference on computer and communications security, 2018, pp. 1928–1943

  27. [27]

    Robust multi-tab website fingerprinting attacks in the wild,

    X. Deng, Q. Yin, Z. Liu, X. Zhao, Q. Li, M. Xu, K. Xu, and J. Wu, “Robust multi-tab website fingerprinting attacks in the wild,” in2023 IEEE symposium on security and privacy (SP). IEEE, 2023, pp. 1005– 1022

  28. [28]

    Flow-mae: Leveraging masked autoencoder for accurate, efficient and robust malicious traffic classifica- tion,

    Z. Hang, Y . Lu, Y . Wang, and Y . Xie, “Flow-mae: Leveraging masked autoencoder for accurate, efficient and robust malicious traffic classifica- tion,” inProceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses, 2023, pp. 297–314

  29. [29]

    Tfe-gnn: A temporal fusion encoder using graph neural networks for fine-grained encrypted traffic classification,

    H. Zhang, L. Yu, X. Xiao, Q. Li, F. Mercaldo, X. Luo, and Q. Liu, “Tfe-gnn: A temporal fusion encoder using graph neural networks for fine-grained encrypted traffic classification,” inProceedings of the ACM Web Conference 2023, 2023, pp. 2066–2075

  30. [30]

    Deep coral: Correlation alignment for deep do- main adaptation,

    B. Sun and K. Saenko, “Deep coral: Correlation alignment for deep do- main adaptation,” inEuropean conference on computer vision. Springer, 2016, pp. 443–450

  31. [31]

    Domain generalization via conditional invariant representations,

    Y . Li, M. Gong, X. Tian, T. Liu, and D. Tao, “Domain generalization via conditional invariant representations,” inProceedings of the AAAI conference on artificial intelligence, vol. 32, no. 1, 2018

  32. [32]

    Sharpness-Aware Minimization for Efficiently Improving Generalization

    P. Foret, A. Kleiner, H. Mobahi, and B. Neyshabur, “Sharpness-aware minimization for efficiently improving generalization,”arXiv preprint arXiv:2010.01412, 2020

  33. [33]

    Swad: Domain generalization by seeking flat minima,

    J. Cha, S. Chun, K. Lee, H.-C. Cho, S. Park, Y . Lee, and S. Park, “Swad: Domain generalization by seeking flat minima,”Advances in Neural Information Processing Systems, vol. 34, pp. 22 405–22 418, 2021

  34. [34]

    Surrogate gap minimization improves sharpness-aware training,

    J. Zhuang, B. Gong, L. Yuan, Y . Cui, H. Adam, N. Dvornek, S. Tatikonda, J. Duncan, and T. Liu, “Surrogate gap minimization improves sharpness-aware training,”arXiv preprint arXiv:2203.08065, 2022

  35. [35]

    Model-agnostic meta-learning for fast adaptation of deep networks,

    C. Finn, P. Abbeel, and S. Levine, “Model-agnostic meta-learning for fast adaptation of deep networks,” inInternational conference on machine learning. PMLR, 2017, pp. 1126–1135

  36. [36]

    Learning to generalize: Meta-learning for domain generalization,

    D. Li, Y . Yang, Y .-Z. Song, and T. Hospedales, “Learning to generalize: Meta-learning for domain generalization,” inProceedings of the AAAI conference on artificial intelligence, vol. 32, no. 1, 2018

  37. [37]

    Domain-adversarial training of neural networks,

    Y . Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Lavi- olette, M. March, and V . Lempitsky, “Domain-adversarial training of neural networks,”Journal of machine learning research, vol. 17, no. 59, pp. 1–35, 2016

  38. [38]

    Reducing domain gap by reducing style bias,

    H. Nam, H. Lee, J. Park, W. Yoon, and D. Yoo, “Reducing domain gap by reducing style bias,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 8690–8699

  39. [39]

    Domain generalization by learning and removing domain-specific features,

    Y . Ding, L. Wang, B. Liang, S. Liang, Y . Wang, and F. Chen, “Domain generalization by learning and removing domain-specific features,”Ad- vances in Neural Information Processing Systems, vol. 35, pp. 24 226– 24 239, 2022

  40. [40]

    Domain generalization with adversarial feature learning,

    H. Li, S. J. Pan, S. Wang, and A. C. Kot, “Domain generalization with adversarial feature learning,” inProceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 5400–5409

  41. [41]

    Multi-task learning using uncer- tainty to weigh losses for scene geometry and semantics,

    A. Kendall, Y . Gal, and R. Cipolla, “Multi-task learning using uncer- tainty to weigh losses for scene geometry and semantics,” inProceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 7482–7491

  42. [42]

    Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks,

    Z. Chen, V . Badrinarayanan, C.-Y . Lee, and A. Rabinovich, “Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks,” inInternational conference on machine learning. PMLR, 2018, pp. 794–803

  43. [43]

    Rethinking the inception architecture for computer vision,

    C. Szegedy, V . Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” inProceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2818–2826

  44. [44]

    Regularizing Neural Networks by Penalizing Confident Output Distributions

    G. Pereyra, G. Tucker, J. Chorowski, Ł. Kaiser, and G. Hinton, “Reg- ularizing neural networks by penalizing confident output distributions,” arXiv preprint arXiv:1701.06548, 2017

  45. [45]

    When does label smoothing help?

    R. M ¨uller, S. Kornblith, and G. E. Hinton, “When does label smoothing help?”Advances in neural information processing systems, vol. 32, 2019

  46. [46]

    Averaging Weights Leads to Wider Optima and Better Generalization

    P. Izmailov, D. Podoprikhin, T. Garipov, D. Vetrov, and A. G. Wilson, “Averaging weights leads to wider optima and better generalization,” arXiv preprint arXiv:1803.05407, 2018

  47. [47]

    Ensemble of averages: Improving model selection and boosting performance in domain gener- alization,

    D. Arpit, H. Wang, Y . Zhou, and C. Xiong, “Ensemble of averages: Improving model selection and boosting performance in domain gener- alization,”Advances in Neural Information Processing Systems, vol. 35, pp. 8265–8277, 2022

  48. [48]

    Sok: Decoding the enigma of encrypted network traffic classifiers,

    N. Wickramasinghe, A. Shaghaghi, G. Tsudik, and S. Jha, “Sok: Decoding the enigma of encrypted network traffic classifiers,” in2025 IEEE Symposium on Security and Privacy (SP). IEEE, 2025, pp. 1825– 1843

  49. [49]

    A large- scale mobile traffic dataset for mobile application identification,

    S. Zhao, S. Chen, F. Wang, Z. Wei, J. Zhong, and J. Liang, “A large- scale mobile traffic dataset for mobile application identification,”The Computer Journal, vol. 67, no. 4, pp. 1501–1513, 2024

  50. [50]

    Toward generating a new intrusion detection dataset and intrusion traffic characterization

    I. Sharafaldin, A. H. Lashkari, A. A. Ghorbaniet al., “Toward generating a new intrusion detection dataset and intrusion traffic characterization.” ICISSp, vol. 1, no. 2018, pp. 108–116, 2018

  51. [51]

    Netmamba+: A framework of pre-trained models for efficient and accurate network traffic classification,

    T. Wang, X. Xie, W. Wang, C. Wang, J. Liu, B. Huang, Y . Hu, Y . Zhao, and Y . Cui, “Netmamba+: A framework of pre-trained models for efficient and accurate network traffic classification,”arXiv preprint arXiv:2601.21792, 2026

  52. [52]

    Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection

    Y . Mirsky, T. Doitshman, Y . Elovici, and A. Shabtai, “Kitsune: an ensemble of autoencoders for online network intrusion detection,”arXiv preprint arXiv:1802.09089, 2018

  53. [53]

    UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

    L. McInnes, J. Healy, and J. Melville, “Umap: Uniform manifold approximation and projection for dimension reduction,”arXiv preprint arXiv:1802.03426, 2018

  54. [54]

    Flowpic: Encrypted internet traffic classi- fication is as easy as image recognition,

    T. Shapira and Y . Shavitt, “Flowpic: Encrypted internet traffic classi- fication is as easy as image recognition,” inIEEE INFOCOM 2019- IEEE conference on computer communications workshops (INFOCOM WKSHPS). IEEE, 2019, pp. 680–687

  55. [55]

    A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

    D. Hendrycks and K. Gimpel, “A baseline for detecting misclassified and out-of-distribution examples in neural networks,”arXiv preprint arXiv:1610.02136, 2016

  56. [56]

    Energy-based out-of-distribution detection,

    W. Liu, X. Wang, J. Owens, and Y . Li, “Energy-based out-of-distribution detection,”Advances in neural information processing systems, vol. 33, pp. 21 464–21 475, 2020

  57. [57]

    Enhancing the reliability of out-of-distribution image detection in neural networks

    S. Liang, Y. Li, and R. Srikant, “Enhancing the reliability of out-of-distribution image detection in neural networks,” arXiv preprint arXiv:1706.02690, 2017

  58. [58]

    Gen: Pushing the limits of softmax-based out-of-distribution detection

    X. Liu, Y. Lochman, and C. Zach, “Gen: Pushing the limits of softmax-based out-of-distribution detection,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 23946–23955

  59. [59]

    Ssd: A unified framework for self-supervised outlier detection

    V. Sehwag, M. Chiang, and P. Mittal, “Ssd: A unified framework for self-supervised outlier detection,” arXiv preprint arXiv:2103.12051, 2021

  60. [60]

    Out-of-distribution detection with deep nearest neighbors

    Y. Sun, Y. Ming, X. Zhu, and Y. Li, “Out-of-distribution detection with deep nearest neighbors,” in International conference on machine learning. PMLR, 2022, pp. 20827–20840

  61. [61]

    Nearest neighbor guidance for out-of-distribution detection

    J. Park, Y. G. Jung, and A. B. J. Teoh, “Nearest neighbor guidance for out-of-distribution detection,” in Proceedings of the IEEE/CVF international conference on computer vision, 2023, pp. 1686–1695

  62. [62]

    On the importance of gradients for detecting distributional shifts in the wild

    R. Huang, A. Geng, and Y. Li, “On the importance of gradients for detecting distributional shifts in the wild,” Advances in Neural Information Processing Systems, vol. 34, pp. 677–689, 2021

  63. [63]

    Gaia: Delving into gradient-based attribution abnormality for out-of-distribution detection

    J. Chen, J. Li, X. Qu, J. Wang, J. Wan, and J. Xiao, “Gaia: Delving into gradient-based attribution abnormality for out-of-distribution detection,” Advances in Neural Information Processing Systems, vol. 36, pp. 79946–79958, 2023

  64. [64]

    Gradorth: A simple yet efficient out-of-distribution detection with orthogonal projection of gradients

    S. Behpour, T. L. Doan, X. Li, W. He, L. Gou, and L. Ren, “Gradorth: A simple yet efficient out-of-distribution detection with orthogonal projection of gradients,” Advances in Neural Information Processing Systems, vol. 36, pp. 38206–38230, 2023

  65. [65]

    A survey on deep active learning: Recent advances and new frontiers

    D. Li, Z. Wang, Y. Chen, R. Jiang, W. Ding, and M. Okumura, “A survey on deep active learning: Recent advances and new frontiers,” IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 4, pp. 5879–5899, 2024

  66. [66]

    Deep class-incremental learning: A survey

    D.-W. Zhou, Q.-W. Wang, Z.-H. Qi, H.-J. Ye, D.-C. Zhan, and Z. Liu, “Deep class-incremental learning: A survey,” arXiv preprint arXiv:2302.03648, vol. 1, no. 2, p. 6, 2023

  67. [67]

    A few shots traffic classification with mini-flowpic augmentations

    E. Horowicz, T. Shapira, and Y. Shavitt, “A few shots traffic classification with mini-flowpic augmentations,” in Proceedings of the 22nd ACM internet measurement conference, 2022, pp. 647–654

  68. [68]

    Accurate decentralized application identification via encrypted traffic analysis using graph neural networks

    M. Shen, J. Zhang, L. Zhu, K. Xu, and X. Du, “Accurate decentralized application identification via encrypted traffic analysis using graph neural networks,” IEEE Transactions on Information Forensics and Security, vol. 16, pp. 2367–2380, 2021

  69. [69]

    Pert: Payload encoding representation from transformer for encrypted traffic classification

    H. Y. He, Z. G. Yang, and X. N. Chen, “Pert: Payload encoding representation from transformer for encrypted traffic classification,” in 2020 ITU Kaleidoscope: Industry-Driven Digital Transformation (ITU K). IEEE, 2020, pp. 1–8

  70. [70]

    Mtt: an efficient model for encrypted network traffic classification using multi-task transformer

    W. Zheng, J. Zhong, Q. Zhang, and G. Zhao, “Mtt: an efficient model for encrypted network traffic classification using multi-task transformer,” Applied Intelligence, vol. 52, no. 9, pp. 10741–10756, 2022

  71. [71]

    Netgpt: Generative pretrained transformer for network traffic

    X. Meng, C. Lin, Y. Wang, and Y. Zhang, “Netgpt: Generative pretrained transformer for network traffic,” arXiv preprint arXiv:2304.09513, 2023

  72. [72]

    Lens: A foundation model for network traffic in cybersecurity

    Q. Wang, C. Qian, X. Li, Z. Yao, and H. Shao, “Lens: A foundation model for network traffic in cybersecurity,” arXiv e-prints, pp. arXiv–2402, 2024