arxiv: 2604.02379 · v1 · submitted 2026-04-01 · 💻 cs.NI

Cardinality is Not Enough: Super Host Detection via Segmented Cardinality Estimation

Yilin Zhao, Jiawei Huang, Xianshi Su, Weihe Li, Xin Li, Yan Liu, Jiacheng Xie, Qichen Su, Jin Ye, Wanchun Jiang, Jianxin Wang

Pith reviewed 2026-05-13 22:12 UTC · model grok-4.3

classification 💻 cs.NI

keywords: super host detection · cardinality estimation · sketch algorithms · network security · IP subnet analysis · flow monitoring · segmented hashing · false positive reduction

The pith

SegSketch detects super hosts by estimating distinct connections within inferred IP subnets rather than across full addresses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that existing sketch methods for spotting super hosts count distinct peers using entire IP addresses and therefore misclassify many normal hosts as suspicious when attackers or victims cluster inside subnets. SegSketch adds a halved-segment hashing step that quickly infers common prefix lengths, then measures cardinality only inside those segments. This keeps memory low while raising detection precision, which matters for real-time attack mitigation and service protection on high-speed links where only small memory budgets are available. The authors report that the method raises F1-score by as much as 8.04 times over prior sketches under tight memory constraints.

Core claim

SegSketch introduces a segmented cardinality estimation scheme that uses a halved-segment hashing strategy to infer the common prefix lengths of IP addresses and then computes flow cardinality inside each inferred subnet; the resulting per-subnet counts replace full-address counts, yielding higher detection accuracy at far lower memory cost than either flat sketches or hierarchical structures.

What carries the argument

Halved-segment hashing strategy that infers common IP prefix lengths to partition addresses into subnets for localized cardinality estimation.
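To make the idea of counting inside a shared prefix concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's method: it stores every peer address and computes the longest common prefix exactly, whereas SegSketch is described as inferring prefix lengths with a lightweight halved-segment hash in small memory. The function names `common_prefix_len` and `subnet_cardinality` are hypothetical.

```python
import ipaddress


def common_prefix_len(addrs):
    """Return the number of leading bits shared by all IPv4 addresses.

    Exact stand-in for prefix inference: it needs every address in memory,
    which a sketch such as SegSketch is designed to avoid.
    """
    ints = [int(ipaddress.IPv4Address(a)) for a in addrs]
    diff = 0
    for value in ints[1:]:
        # Collect every bit position where some address differs from the first.
        diff |= ints[0] ^ value
    # All bits above the highest differing bit are common to the whole set.
    return 32 - diff.bit_length()


def subnet_cardinality(addrs):
    """Return (inferred subnet, number of distinct peers inside it)."""
    plen = common_prefix_len(addrs)
    net = ipaddress.ip_network(f"{addrs[0]}/{plen}", strict=False)
    return net, len(set(addrs))


if __name__ == "__main__":
    clustered = ["10.1.2.3", "10.1.2.77", "10.1.2.200"]
    scattered = ["10.1.2.3", "192.168.0.1"]
    print(subnet_cardinality(clustered))  # peers share a /24
    print(common_prefix_len(scattered))   # no shared prefix bits
```

A peer set that clusters in one block yields a long common prefix, while a scattered set yields a prefix length near zero, which is the distinction the segmented estimate relies on.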

If this is right

Super-host detection becomes practical on routers with only a few megabytes of fast memory.
False-positive rates drop because normal cross-subnet traffic no longer inflates global cardinality counts.
Attack mitigation systems can act on the same memory budget that previously produced unreliable results.
The same segmented counting idea can be swapped into other sketch-based tasks that currently ignore address locality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may generalize to detecting other locality-sensitive anomalies such as distributed scanners or botnet command channels.
Router vendors could embed the halved-segment logic in hardware hash tables without increasing on-chip SRAM.
Combining SegSketch with existing heavy-hitter detectors could produce a single low-memory pipeline for multiple security signals.

Load-bearing premise

Super hosts that matter for detection usually talk to many hosts inside the same subnet rather than scattering connections across unrelated addresses.

What would settle it

A traffic trace containing super hosts whose peer sets have no common prefix longer than /32, where SegSketch shows no F1 improvement over a plain full-address sketch of equal size.
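The test described above comes down to comparing the F1-scores of two detectors on the same trace. The sketch below shows that comparison in Python using the standard F1 definition (harmonic mean of precision and recall). The host labels and detector outputs are invented toy data, not results from the paper, and `f1_score` is a hypothetical helper name.

```python
def f1_score(true_hosts, reported_hosts):
    """F1-score of a detector's reported set against the ground-truth set."""
    tp = len(true_hosts & reported_hosts)   # correctly reported super hosts
    fp = len(reported_hosts - true_hosts)   # normal hosts flagged by mistake
    fn = len(true_hosts - reported_hosts)   # super hosts that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy ground truth and toy detector outputs (assumptions for illustration).
    truth = {"a", "b", "c", "d"}
    flat_sketch = {"a", "b", "w", "x", "y", "z"}   # many false positives
    segmented = {"a", "b", "c", "x"}

    f1_flat = f1_score(truth, flat_sketch)
    f1_seg = f1_score(truth, segmented)
    print(f1_flat, f1_seg, f1_seg / f1_flat)
```

On a trace with no subnet structure, the prediction is that the ratio in the last line stays close to 1 for detectors of equal memory size.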

Figures

Figures reproduced from arXiv: 2604.02379 by Jiacheng Xie, Jianxin Wang, Jiawei Huang, Jin Ye, Qichen Su, Wanchun Jiang, Weihe Li, Xianshi Su, Xin Li, Yan Liu, Yilin Zhao.

Figure 2: Hierarchical approach for cardinality estimation.

Figure 4: Data structure.
Following text (3.2 Key Design): The subnet cardinality is an important feature to assist super-host detection. However, it is hard to get the prior knowledge of subnet information at the measurement nodes, making the subnet cardinality estimation impractical. Different from the hierarchical cardinality estimator requiring large memory usage, the core idea of SegSketch is to leverage a lightweight halved-se…

Figure 3: Super spreader detection performance.
Following text: The experimental results in …

Figure 6: Example of subnet cardinality estimation.

Figure 5: Example of common prefix length estimation using …

Figure 7: Examples of the update procedure for super spreader detection.

Figure 8: Super spreader detection performance on the …

Figure 11: Impact of varying ratios of super spreaders.

Figure 12: Common prefix length estimation error.
Following text: Varying the size of the host bitmap. To evaluate the impact of various host bitmap sizes on super host detection, we configure SegSketch with host bitmap sizes ranging from 0.2KB to 1KB and conduct experiments on the mixed CAIDA dataset.
[Plot: panel (a) F1-Score and a second panel for ARE, each against memory (32 to 512 KB), with one series per host bitmap size (0.25KB, 0.5KB, 0.75KB, 1KB).]

Figure 14: Throughput on the mixed CAIDA dataset.
Following text (6 Performance on P4): We implement the prototypes of SpreadSketch, Couper, RHHH and SegSketch using P4 [38] and deploy them on the Wedge 100BF-32X programmable switch [4]. Due to the lack of support for complex instructions on the Tofino architecture, certain data structures such as heaps cannot be efficiently implemented in the data plane, and thus we integrate the co…

Figure 15: ARE of subnet cardinality estimation through full …

Figure 16: Super spreader detection performance on the …

Figure 19: Throughput on the mixed MAWI dataset.
Following text: The efficiency of SegSketch stems from its computationally lightweight design, where the halved-segment hashing and the bitmap-based cardinality estimation enables fast updates. Conversely, SpreadSketch incurs extra cost from operations in MultiResolution Bitmaps, Couper requires maintaining dual-layer estimators, and RHHH suffers from complex multi-layer estimator…

Figure 18: Impact of varying IP segment width 𝐺.
Following text (B.4 Throughput): We further evaluate the throughput of all four methods on the mixed MAWI dataset, with results shown in …

Original abstract

Accurately detecting super host that establishes connections to a large number of distinct peers is significant for mitigating web attacks and ensuring high quality of web service. Existing sketch-based approaches estimate the number of distinct connections called flow cardinality according to full IP addresses, while ignoring the fact that a malicious or victim super host often communicates with hosts within the same subnet, resulting in high false positive rates and low accuracy. Though hierarchical-structure based approaches could capture flow cardinality in subnet, they inherently suffer from high memory usage. To address these limitations, we propose SegSketch, a segmented cardinality estimation approach that employs a lightweight halved-segment hashing strategy to infer common prefix lengths of IP addresses, and estimates cardinality within subnet to enhance detection accuracy under constrained memory size. Experiments driven by real-world traces demonstrate that, SegSketch improves F1-Score by up to 8.04x compared to state-of-the-art solutions, particularly under small memory budgets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

SegSketch adds halved-segment hashing to infer IP prefixes for subnet-level cardinality, but the 8x F1 claim sits on thin evidence in the abstract.

The new piece is the halved-segment hash that guesses common prefix lengths on the fly so cardinality can be tallied inside inferred subnets rather than across full addresses. That sidesteps both the false-positive problem of plain sketches and the memory cost of full hierarchical tables. The motivation lands: many super hosts really do cluster their connections inside /24 or similar blocks, so grouping by inferred prefix should cut noise under tight memory budgets. If the full experiments back this with proper ablations on real traces, the engineering move is useful for router-side monitors.

The soft spot is the evidence. The abstract states an 8.04x F1 lift without baselines, trace descriptions, or error analysis, so it is impossible to tell whether the gain comes from the hashing, from trace locality, or from something else. The prefix-inference step also needs direct checks; if it mis-groups on traces without strong subnet structure, the whole advantage collapses to ordinary sketch performance. No circular math shows up, and the method builds on existing sketches rather than inventing new parameters that fit the data.

This is for network-security and measurement groups that already run sketches and want a lighter way to exploit locality. A reader who needs concrete low-memory detectors will get value only if the experiments are solid and the hashing generalizes. I would send it to peer review so the data and ablations can be examined properly.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes SegSketch, a segmented cardinality estimation method for super-host detection. It introduces a halved-segment hashing strategy to infer common IP prefix lengths from traffic, then estimates flow cardinality inside the inferred subnets rather than over full addresses. The approach is positioned as addressing high false-positive rates in standard sketch methods (which ignore subnet locality) and high memory use in hierarchical methods. Experiments on real-world traces are claimed to yield up to an 8.04x F1-score improvement over state-of-the-art baselines, especially under tight memory budgets.

Significance. If the central performance claim holds after validation, the work would be significant for practical network monitoring and attack mitigation. It offers a lightweight way to exploit the common observation that super hosts (malicious or victim) often communicate inside the same subnet, achieving better accuracy than flat sketches without the memory cost of full hierarchical sketches. The method is presented as a direct extension of existing cardinality sketches rather than a parameter-heavy invention.

major comments (2)

[§3.2] Halved-segment hashing: the accuracy of prefix-length inference is not supported by any error analysis, correctness argument, or ablation that isolates the hashing step from the subnet-locality property of the evaluation traces. If the inferred groupings are incorrect on traces lacking strong /24 locality, the reported F1 gain collapses to the performance of the underlying sketch.
[§4] Experimental evaluation: the headline 8.04x F1 improvement is stated without tabulated baselines, memory budgets, error bars, or an ablation that quantifies the contribution of subnet estimation versus the hashing strategy. This makes the central claim impossible to assess from the provided evidence.

minor comments (1)

[Abstract] The phrase 'particularly under small memory budgets' is not quantified; the manuscript should state the exact memory sizes (e.g., 1 MB, 2 MB) at which the 8.04x figure is observed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and have revised the manuscript to strengthen the analysis and experimental presentation.

Referee: [§3.2] Halved-segment hashing: the accuracy of prefix-length inference is not supported by any error analysis, correctness argument, or ablation that isolates the hashing step from the subnet-locality property of the evaluation traces. If the inferred groupings are incorrect on traces lacking strong /24 locality, the reported F1 gain collapses to the performance of the underlying sketch.

Authors: We agree that an explicit error analysis and ablation isolating the halved-segment hashing would strengthen the paper. The hashing strategy is designed to probabilistically group addresses sharing common prefixes by splitting the hash space, but we acknowledge the need to separate this from trace-specific locality. In the revision, we add a probabilistic correctness argument for prefix inference accuracy (based on collision probabilities) and an ablation study evaluating the hashing step independently. We also include results on synthetic traces with controlled locality levels to show that gains diminish without strong subnet structure, as expected, rather than collapsing entirely. revision: yes
Referee: [§4] Experimental evaluation: the headline 8.04x F1 improvement is stated without tabulated baselines, memory budgets, error bars, or an ablation that quantifies the contribution of subnet estimation versus the hashing strategy. This makes the central claim impossible to assess from the provided evidence.

Authors: We apologize for the lack of detailed tabular and ablation data in the original submission. The revised manuscript includes a new table reporting F1-scores for all baselines (HyperLogLog, PCSA, and hierarchical sketches) at explicit memory budgets (0.5 MB to 8 MB). We add error bars as standard deviations over 10 independent runs. A dedicated ablation subsection quantifies the separate contributions of subnet cardinality estimation and the halved-segment hashing, confirming that both are necessary for the peak gains (e.g., the 8.04x figure occurs at 1 MB on the CAIDA trace). revision: yes

Circularity Check

0 steps flagged

No circularity: SegSketch introduces independent halved-segment hashing on top of existing sketches

The paper describes SegSketch as a new segmented cardinality estimation method that adds a lightweight halved-segment hashing strategy to infer common IP prefix lengths and then estimates cardinality within those subnets. This construction is presented as an engineering extension of prior sketch techniques rather than any self-referential equation, fitted parameter renamed as prediction, or self-citation chain that carries the central claim. No equations or derivations in the provided text reduce the reported F1 improvement to the input data by construction; the accuracy gains are asserted via empirical evaluation on real-world traces. The approach therefore remains self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The approach rests on the domain assumption that super hosts share subnet prefixes and that prefix inference via hashing is reliable; no explicit free parameters or invented entities are named in the abstract.

free parameters (1)

halved-segment hash parameters
Likely tuned for prefix inference but not specified in abstract

axioms (1)

Domain assumption: malicious or victim super hosts communicate with hosts within the same subnet
Explicitly invoked to justify subnet-level estimation

invented entities (1)

SegSketch (no independent evidence)
purpose: Segmented cardinality estimation for super host detection
New method name and strategy introduced in the abstract

Reference graph

Works this paper leans on

[1] 2026. SegSketch Repository. https://github.com/Elaine-codebase/SegSketch-Repository

[2] Qasem Abu Al-Haija, Eyad Saleh, and Mohammad Alnabhan. 2021. Detecting port scan attacks using logistic regression. In Proceedings of IEEE International Symposium on Advanced Electrical and Communication Technologies. 1–5

[3] Manos Antonakakis, Tim April, Michael Bailey, Matt Bernhard, Elie Bursztein, Jaime Cochran, Zakir Durumeric, J Alex Halderman, Luca Invernizzi, Michalis Kallitsis, et al. 2017. Understanding the mirai botnet. In Proceedings of USENIX Security Symposium. 1093–1110

[4] Barefoot Networks. 2016. Barefoot’s Tofino. https://barefootnetworks.com/products/brief-tofino

[5] Ran Ben Basat, Gil Einziger, Roy Friedman, Marcelo C Luizelli, and Erez Waisbard. Constant time updates in hierarchical heavy hitters. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication (SIGCOMM). 127–140

[6] Catalin Cimpanu. 2019. Carpet-bombing ddos attack takes down south african isp for an entire day. https://www.zdnet.com/article/carpet-bombing-ddos-attack-takes-down-south-african-isp-for-an-entire-day/

[7] Center for Applied Internet Data Analysis. 2016. The CAIDA Anonymized Internet Traces Dataset. https://catalog.caida.org/dataset/passive_2016_pcap

[8] Graham Cormode, Flip Korn, S Muthukrishnan, and Divesh Srivastava. 2004. Diamond in the rough: Finding hierarchical heavy hitters in multi-dimensional data. In Proceedings of ACM SIGMOD International Conference on Management of Data. 155–166

[9] Graham Cormode, Flip Korn, S Muthukrishnan, and Divesh Srivastava. 2008. Finding hierarchical heavy hitters in streaming data. ACM Transactions on Knowledge Discovery from Data 1, 4 (2008), 1–48

[10] Leonardo Henrique De Melo, Gustavo de Carvalho Bertoli, Michele Nogueira, Aldri Luiz Dos Santos, and Lourenço Alves Pereira. 2025. Anomaly-Flow: A Multi-domain Federated Generative Adversarial Network for Distributed Denial-of-Service Detection. IEEE Network (2025), 1–1

[11] Damu Ding, Marco Savi, Federico Pederzolli, Mauro Campanella, and Domenico Siracusa. 2021. In-network volumetric DDoS victim identification using programmable commodity switches. IEEE Transactions on Network and Service Management 18, 2 (2021), 1191–1202

[12] Yang Du, He Huang, Yu-E Sun, Kejian Li, Boyu Zhang, and Guoju Gao. 2023. A better cardinality estimator with fewer bits, constant update time, and mergeability. In Proceedings of IEEE International Conference on Computer Communications (INFOCOM). 1–10

[13] Zakir Durumeric, Michael Bailey, and J Alex Halderman. 2014. An Internet-Wide view of Internet-Wide scanning. In Proceedings of USENIX Security Symposium. 65–78

[14] Cristian Estan, George Varghese, and Mike Fisk. 2003. Bitmap algorithms for counting active flows on high speed links. In Proceedings of ACM SIGCOMM Conference on Internet Measurement. 153–166

[15] FastNetMon. 2023. Rise of carpet bombing attacks. https://fastnetmon.com/2023/10/24/rise-of-carpet-bombing-ddos-attacks-and-ways-to-detect-and-defend-against-them-using-fastnetmon-advanced/

[16] FS-ISAC. 2025. DDoS Attackers Increase Targeting of Global Financial Sector, According to FS-ISAC and Akamai Report. https://www.fsisac.com/newsroom/ddos-attackers-increase-targeting-of-global-financial-sector-according-to-fsisac-and-akamai-report

[17] Cormode Graham, Korn Flip, Muthukrishnan Shanmugavelayutham, and Srivastava Divesh. 2003. Finding hierarchical heavy hitters in data streams. In Proceedings of ACM International Conference on Very Large Data Bases (VLDB). 464–475

[18] Tiago Heinrich, Rafael R. Obelheiro, and Carlos A. Maziero. 2021. New kids on the DRDoS block: Characterizing multiprotocol and carpet bombing attacks. In International Conference on Passive and Active Network Measurement. 269–283

[19] Stefan Heule, Marc Nunkesser, and Alexander Hall. 2013. Hyperloglog in practice: Algorithmic engineering of a state of the art cardinality estimation algorithm. In Proceedings of International Conference on Extending Database Technology. 683–692

[20] Hirsi Abdinasir, Audah Lukman, Salh, Adeb. 2024. SDN-DDoS Traffic Dataset. https://data.mendeley.com/datasets/b7vw628825/1

[21] Esra Hotoğlu, Sevil Sen, and Burcu Can. 2025. A Comprehensive Analysis of Adversarial Attacks against Spam Filters. arXiv preprint arXiv:2505.03831 (2025)

[22] Zhen Huang, Shang Liu, Ke Zhao, and Yong Xiang. 2024. GMCB: An Efficient and Light Graph Analysis Model for Detecting Carpet Bombing DDoS Attacks. In Proceedings of IEEE International Conference on Computer and Communications. 1918–1922

[23] Itay Raviv. 2023. DDoS Carpet-Bombing – Coming In Fast And Brutal. https://www.radware.com/blog/ddos-protection/ddos-carpet-bombing-coming-in-fast-and-brutal/

[24] Xuyang Jing, Hui Han, Zheng Yan, and Witold Pedrycz. 2021. SuperSketch: A multi-dimensional reversible data structure for super host identification. IEEE Transactions on Dependable and Secure Computing (TDSC) 19, 4 (2021), 2741–2754

[25] Xuyang Jing, Zheng Yan, Hui Han, and Witold Pedrycz. 2021. ExtendedSketch: Fusing network traffic for super host identification with a memory efficient sketch. IEEE Transactions on Dependable and Secure Computing (TDSC) 19, 6 (2021), 3913–3924

[26] Robert J. Jenkins Jr. 1995. Hash Functions for Hash Table Lookup. http://burtleburtle.net/bob/hash/evahash.html

[27] Sian Kim, Changhun Jung, Rhongho Jang, David Mohaisen, and Dae Hun Nyang. A robust counting sketch for data plane intrusion detection. In Proceedings of Annual Network and Distributed System Security Symposium

[28] Tatyana Kulikova, Olga Svistunova, Roman Dedenok, Andrey Kovtun, Irina Shimko, and Anna Lazaricheva. 2024. Spam and phishing in 2024. https://securelist.com/spam-and-phishing-report-2024/115536/

[29] Qingyang Li, Yihang Zhang, Zhidong Jia, Yannan Hu, Lei Zhang, Jianrong Zhang, Yongming Xu, Yong Cui, Zongming Guo, and Xinggong Zhang. 2024. Dollm: How large language models understanding network flow data to detect carpet bombing ddos. arXiv preprint arXiv:2405.07638 (2024)

[30] Weijiang Liu, Wenyu Qu, Jian Gong, and Keqiu Li. 2015. Detection of superpoints using a vector bloom filter. IEEE Transactions on Information Forensics and Security (TIFS) 11, 3 (2015), 514–527

[31] Chaoyi Ma, Shigang Chen, Youlin Zhang, Qingjun Xiao, and Olufemi O Odegbile. 2021. Super spreader identification using geometric-min filter. IEEE/ACM Transactions on Networking (TON) 30, 1 (2021), 299–312

[32] Chaoyi Ma, Olufemi O Odegbile, Dimitrios Melissourgos, Haibo Wang, and Shiping Chen. 2023. From CountMin to Super kJoin Sketches for Flow Spread Estimation. IEEE Transactions on Network Science and Engineering 11, 3 (2023), 2353–2370

[33] Qingxin Mao, Daisuke Makita, Michel van Eeten, Katsunari Yoshioka, and Tsutomu Matsumoto. 2024. Characteristics Comparison between Carpet Bombing-type and Single Target DRDoS Attacks Observed by Honeypot. Journal of Information Processing 32 (2024), 731–747

[34] Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. 2005. Efficient computation of frequent and top-k elements in data streams. In Proceedings of Springer International Conference on Database Theory. 398–412

[35] Michael Mitzenmacher, Thomas Steinke, and Justin Thaler. 2012. Hierarchical heavy hitters with the space saving algorithm. In Proceedings of Workshop on Algorithm Engineering and Experiments. 160–174

[36] Netresec. 2012. Capture files from Mid-Atlantic CCDC. https://www.netresec.com/?page=MACCDC

[37] Jorge Pacheco Omer Yoachimik. 2025. Targeted by 20.5 million DDoS attacks, up 358% year-over-year: Cloudflare’s 2025 Q1 DDoS Threat Report. https://blog.cloudflare.com/ddos-threat-report-for-2025-q1/

[38] P4 Language Consortium. 2015. P4 Language. https://p4.org

[39] Xun Song, Jiaqi Zheng, Hao Qian, Shiju Zhao, Hongxuan Zhang, Xuntao Pan, and Guihai Chen. 2023. Couper: Memory-Efficient Cardinality Estimation under Unbalanced Distribution. In Proceedings of IEEE International Conference on Data Engineering. 2753–2765

[40] Lu Tang, Yao Xiao, Qun Huang, and Patrick PC Lee. 2022. A high-performance invertible sketch for network-wide superspreader detection. IEEE/ACM Transactions on Networking (TON) 31, 2 (2022), 724–737

[41] Terry Young. 2024. Carpet-bombing attacks highlight the need for intelligent and automated ddos protection. https://www.a10networks.com/blog/carpet-bombing-attacks-highlight-the-need-for-intelligent-and-automated-ddos-protection

[42] The Measurement and Analysis on the WIDE Internet (MAWI) Working Group. MAWI Working Group Traffic Archive. http://mawi.wide.ad.jp/mawi/

[43] Patrick Truong and Fabrice Guillemin. 2009. Identification of heavyweight address prefix pairs in IP traffic. In Proceedings of IEEE International Teletraffic Congress. 1–8

[44] UNSW Canberra at ADFA. 2015. The UNSW-NB15 Dataset. https://research.unsw.edu.au/projects/unsw-nb15-dataset

[45] Haibo Wang, Chaoyi Ma, Olufemi O Odegbile, Shigang Chen, and Jih-Kwon Peir. Randomized error removal for online spread estimation in data streaming. (2021), 1040–1052

[46] Jincheng Wang, Le Yu, John Lui, and Xiapu Luo. 2025. Modern DDoS Threats and Countermeasures: Insights into Emerging Attacks and Detection Strategies. arXiv preprint arXiv:2502.19996 (2025)

[47] Pinghui Wang, Xiaohong Guan, Tao Qin, and Qiuzhen Huang. 2011. A data streaming method for monitoring host connection degrees of high-speed links. IEEE Transactions on Information Forensics and Security (TIFS) 6, 3 (2011), 1086–1098

[48] Kyu-Young Whang, Brad T Vander-Zanden, and Howard M Taylor. 1990. A linear-time probabilistic counting algorithm for database applications. ACM Transactions on Database Systems 15, 2 (1990), 208–229

[49] Qingjun Xiao, Shigang Chen, Min Chen, and Yibei Ling. 2015. Hyper-compact virtual estimators for big network data based on register sharing. In Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS). 417–428

[50] You Zhou, Youlin Zhang, Chaoyi Ma, Shigang Chen, and Olufemi O Odegbile. Generalized sketch families for network traffic measurement. Proceedings of the ACM on Measurement and Analysis of Computing Systems 3, 3 (2019), 1–34

Venue, from the paper's running header: WWW ’26, April 13–17, 2026, Dubai, United Arab Emirates. Yilin Zhao et al.

A Theoretical Analysis

A.1 Cardinality Estimation Error Bound

Proof. For a flow, the probability that it is hashed to a specific bit $j \in \{1, \dots, M\}$ in a bitmap of size $M$ is $M^{-1}$. Conversely, the probabil...
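The appendix excerpt above states that a flow lands on any given bit of an M-bit bitmap with probability 1/M, and then breaks off. A standard way such a statement is used is the linear-time probabilistic counting estimator of Whang et al. (listed in the references): a bit stays zero after n distinct flows with probability (1 - 1/M)^n ≈ e^{-n/M}, so n can be estimated as -M ln(V), where V is the observed fraction of zero bits. The Python sketch below implements that textbook estimator. Whether the paper's truncated proof follows exactly this route is an assumption, and the class name `Bitmap` and the choice of BLAKE2b as the hash are illustrative.

```python
import hashlib
import math


def expected_zero_fraction(m, n):
    """Probability that a given bit is still zero after n distinct flows."""
    return (1.0 - 1.0 / m) ** n


class Bitmap:
    """Linear-counting cardinality estimator over an m-bit bitmap."""

    def __init__(self, m):
        self.m = m
        self.bits = [0] * m

    def add(self, flow_id):
        # Hash the flow identifier to one bit; the hash function is an
        # illustrative choice, not one named by the source.
        digest = hashlib.blake2b(flow_id.encode(), digest_size=8).digest()
        self.bits[int.from_bytes(digest, "big") % self.m] = 1

    def estimate(self):
        zeros = self.bits.count(0)
        if zeros == 0:
            return float("inf")  # bitmap saturated, estimator undefined
        # n_hat = -M * ln(V), with V the fraction of zero bits.
        return -self.m * math.log(zeros / self.m)


if __name__ == "__main__":
    bm = Bitmap(1024)
    for i in range(200):
        bm.add(f"10.0.{i // 256}.{i % 256}")
    print(round(bm.estimate()))  # close to 200 for a well-mixed hash
```

Re-adding a flow that was already seen sets the same bit again, so duplicates do not change the estimate, which is what makes a bitmap suitable for counting distinct peers.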