arxiv: 2602.10125 · v6 · submitted 2026-01-31 · 💻 cs.SI · cs.NI · stat.AP

How segmented is my network?

Pith reviewed 2026-05-16 08:55 UTC · model grok-4.3

classification 💻 cs.SI · cs.NI · stat.AP
keywords network segmentation · segmentedness · edge density · sampling estimator · confidence intervals · network security · lateral movement

The pith

Segmentedness of a network equals one minus its allowed-communication edge density and can be estimated to within a margin of error of ±0.1 at 95 percent confidence from 97 uniformly random node-pair samples.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines segmentedness as the fraction of node pairs whose communication is blocked by security policy, which is exactly the complement of the edge density in the allowed-communication graph. It derives a simple normalized estimator for this quantity and shows that the estimator's sampling variance yields a 95 percent confidence interval with margin of error 0.1 once at least 97 pairs have been checked. The required sample size does not grow with the total number of nodes provided the pairs are chosen uniformly at random. The same estimator is tested on Erdős–Rényi graphs, stochastic block models, and real enterprise network traces, recovering the true value within the predicted interval. This supplies the first statistically grounded scalar that security teams can use to track how effectively a network limits lateral movement.

Core claim

Segmentedness is defined as the fraction of potential node-pair communications disallowed by policy—equivalently, the complement of graph edge density—and is shown to be the first statistically principled scalar metric for this purpose. A normalized estimator is derived whose uncertainty is bounded by confidence intervals; for a 95 percent interval with margin of error ±0.1, a minimum of M=97 sampled node pairs suffices, independent of total network size under uniform random sampling. Monte Carlo evaluation on Erdős–Rényi, stochastic block, and real enterprise graphs confirms that the estimator recovers the true segmentedness within the stated error bound.
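
As a worked check of the M=97 figure, the standard sample-size formula for a proportion (the one quoted in the referee report and circularity check on this page) can be evaluated at the worst case p = 0.5 with Z = 1.96 and E = 0.1. The symbols follow that formula; rounding up to the next integer is the only extra step.

```latex
M \;\ge\; \frac{Z^{2}\,p\,(1-p)}{E^{2}}
  \;=\; \frac{1.96^{2}\times 0.5\times 0.5}{0.1^{2}}
  \;=\; \frac{0.9604}{0.01}
  \;=\; 96.04
  \quad\Longrightarrow\quad M = 97 .
```

Because V does not appear in this expression, the sample size is independent of network size under the infinite-population approximation.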

What carries the argument

The normalized estimator that computes the fraction of disallowed pairs observed in a uniform random sample of node pairs and scales it to estimate the complement of edge density.
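
A minimal Python sketch of such a sampling estimator, assuming (as the circularity check on this page describes it) that the estimate is the sample proportion of disallowed pairs. The names `sample_pair`, `estimate_segmentedness`, and the `is_allowed` policy callback are hypothetical and not taken from the paper; the interval is a plain Wald interval with pairs drawn independently.

```python
import math
import random


def sample_pair(num_nodes, rng):
    """Draw one unordered pair of distinct nodes uniformly at random."""
    u, v = rng.sample(range(num_nodes), 2)
    return (u, v) if u < v else (v, u)


def estimate_segmentedness(num_nodes, is_allowed, num_samples=97, z=1.96, seed=None):
    """Return (estimate, lower, upper) for the fraction of blocked node pairs.

    is_allowed(u, v) is a hypothetical callback that answers whether policy
    permits communication between nodes u and v.
    """
    if num_nodes < 2:
        raise ValueError("need at least two nodes to form a pair")
    rng = random.Random(seed)
    blocked = 0
    for _ in range(num_samples):
        u, v = sample_pair(num_nodes, rng)
        if not is_allowed(u, v):
            blocked += 1
    s_hat = blocked / num_samples
    # Wald half-width: z * sqrt(p_hat * (1 - p_hat) / M)
    half_width = z * math.sqrt(s_hat * (1.0 - s_hat) / num_samples)
    return s_hat, max(0.0, s_hat - half_width), min(1.0, s_hat + half_width)


# Toy policy (an assumption for illustration): 100 nodes in two segments of 50,
# with communication allowed only inside a segment.
def same_segment(u, v):
    return u // 50 == v // 50


estimate, lower, upper = estimate_segmentedness(100, same_segment, seed=1)
print(f"segmentedness ~ {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

For this toy policy the true value is 2500/4950 ≈ 0.505, since 50 × 50 of the 4950 pairs cross the segment boundary.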

If this is right

  • Security teams can track segmentation levels over time as a baseline metric without enumerating every possible connection.
  • Zero-trust initiatives can be assessed quantitatively by comparing measured segmentedness before and after policy changes.
  • During mergers, the estimator can quantify how segmented the combined network remains after integration.
  • Networks of different sizes become directly comparable because the required sample count is independent of total nodes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach implies that full topology discovery is unnecessary for policy-effectiveness audits when only an aggregate segmentation score is needed.
  • Repeated sampling at different times could detect gradual policy drift even if individual rules change.
  • The same sampling logic might be adapted to measure segmentation under directed or time-varying policies by treating each snapshot separately.

Load-bearing premise

That the network policy can be represented as a static undirected graph and that the allowed or disallowed status of any sampled node pair can be determined accurately and at reasonable cost.

What would settle it

Running the estimator on a live enterprise network and finding that the fraction of pairs whose policy status cannot be determined without mapping the entire graph exceeds the margin of error.

Figures

Figures reproduced from arXiv: 2602.10125 by Rohit Dube.

Figure 1: Monte Carlo mean and 95% CI of the edge density estimator
Figure 2: Empirical coverage probability of the 95% Wald confidence
Figure 3: Monte Carlo mean and 95% CI of the edge density estimator
Figure 4: Monte Carlo mean and 95% CI of the edge density estimator
Original abstract

Network segmentation is a popular security practice for limiting lateral movement, yet practitioners lack a metric to measure how segmented a network actually is. We define segmentedness as the fraction of potential node-pair communications disallowed by policy (equivalently, the complement of graph edge density) and show it to be the first statistically principled scalar metric for this purpose. Then, we derive a normalized estimator for segmentedness and evaluate its uncertainty using confidence intervals. For a 95% confidence interval with a margin-of-error of ±0.1, we show that a minimum of M=97 sampled node pairs is sufficient. This result is independent of the total number of nodes in the network, provided that node pairs are sampled uniformly at random. We evaluate the estimator through Monte Carlo simulations on Erdős–Rényi, stochastic block models, and real-world enterprise network datasets, demonstrating accurate estimation. Finally, we discuss applications of the estimator, such as baseline tracking, zero trust assessment, and merger integration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript defines segmentedness as the fraction of potential node-pair communications disallowed by policy (equivalently, one minus the edge density of the policy graph) and presents it as the first statistically principled scalar metric for network segmentation. It derives a sample-based estimator for this quantity and shows that M=97 uniformly random node-pair samples suffice for a 95% confidence interval with margin of error ±0.1, with the result claimed to be independent of total network size V. The estimator is evaluated via Monte Carlo simulations on Erdős–Rényi graphs, stochastic block models, and real enterprise network datasets.

Significance. A practical, statistically grounded metric for quantifying network segmentation would be valuable for security tasks such as zero-trust assessment and merger integration. The Monte Carlo validation on both synthetic models and real data provides useful empirical support for the estimator's accuracy under the tested regimes, and the parameter-free nature of the core definition is a strength.

major comments (2)
  1. [Abstract and estimator derivation] The sample-size derivation (abstract and corresponding estimator section) uses the infinite-population formula n = (Z² p (1-p))/E² with Z=1.96, p=0.5, E=0.1 to obtain M≈97, but contains no finite-population correction. The exact variance of the hypergeometric estimator is p(1-p)(K-M)/(M(K-1)) where K=binom(V,2); this correction is material for V≲20 and renders the independence claim incorrect.
  2. [Monte Carlo experiments] The Monte Carlo experiments (section on evaluation) are conducted exclusively on large-V graphs, so they do not expose the regime where the claimed independence from V fails or where M=97 exceeds the available number of distinct pairs.
minor comments (2)
  1. [Abstract] The abstract states that the estimator is 'normalized' but provides no explicit formula or pseudocode for the normalization step.
  2. [Evaluation on real-world datasets] Real-data preprocessing steps and any handling of missing or ambiguous policy edges are not described, which affects reproducibility of the enterprise-graph results.
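
To make major comment 1 concrete, the following sketch evaluates the finite-population correction factor (K-M)/(K-1) quoted there, with K = binom(V,2). The function names are hypothetical, and p = 0.5 is the worst case assumed in the sample-size formula.

```python
import math


def num_pairs(num_nodes):
    """K = C(V, 2), the number of distinct unordered node pairs."""
    return num_nodes * (num_nodes - 1) // 2


def fpc_factor(num_nodes, num_samples):
    """Finite-population correction (K - M) / (K - 1) for sampling without replacement."""
    k = num_pairs(num_nodes)
    if k < 2:
        raise ValueError("need at least two distinct pairs")
    if num_samples > k:
        raise ValueError(f"M={num_samples} exceeds the K={k} available pairs")
    return (k - num_samples) / (k - 1)


def margin_of_error(num_nodes, num_samples, p=0.5, z=1.96):
    """Interval half-width using the corrected variance p(1-p)(K-M)/(M(K-1))."""
    variance = p * (1.0 - p) / num_samples * fpc_factor(num_nodes, num_samples)
    return z * math.sqrt(variance)


# Print V, K, the correction factor, and the corrected margin for M = 97.
for v in (15, 20, 50, 1000):
    print(v, num_pairs(v), round(fpc_factor(v, 97), 4), round(margin_of_error(v, 97), 4))
```

With V = 14 there are only 91 distinct pairs, so `fpc_factor(14, 97)` raises an error; this is the second regime the referee points to, where M=97 exceeds the number of available pairs.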

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and insightful comments. We appreciate the opportunity to clarify the scope of our claims regarding the sample-size independence and to strengthen the evaluation section. Below we respond point-by-point to the major comments.

Point-by-point responses
  1. Referee: [Abstract and estimator derivation] The sample-size derivation (abstract and corresponding estimator section) uses the infinite-population formula n = (Z² p (1-p))/E² with Z=1.96, p=0.5, E=0.1 to obtain M≈97, but contains no finite-population correction. The exact variance of the hypergeometric estimator is p(1-p)(K-M)/(M(K-1)) where K=binom(V,2); this correction is material for V≲20 and renders the independence claim incorrect.

    Authors: We thank the referee for pointing out the distinction between the infinite-population approximation and the exact hypergeometric sampling variance. Our derivation intentionally uses the conservative infinite-population formula (which yields a slightly larger sample size) to guarantee the desired margin of error regardless of V. For V ≳ 100 the finite-population correction factor (K-M)/(K-1) is already above 0.98 when M=97 (at V=100, K=4950 and the factor is 4853/4949 ≈ 0.981), so the independence from V holds to good accuracy in all practical network sizes. We agree, however, that the claim should be qualified for very small networks. In the revision we will (i) state explicitly that the result is asymptotic for large V, (ii) include the exact variance formula, and (iii) add a short paragraph discussing the regime V < 50 where the correction becomes noticeable. revision: yes

  2. Referee: [Monte Carlo experiments] The Monte Carlo experiments (section on evaluation) are conducted exclusively on large-V graphs, so they do not expose the regime where the claimed independence from V fails or where M=97 exceeds the available number of distinct pairs.

    Authors: The referee is correct that our Monte Carlo study was performed on networks with V ≥ 100 to reflect realistic enterprise settings. We will augment the evaluation section with a new subsection containing Monte Carlo trials on small graphs (V = 10, 20, 50) both with and without the finite-population correction. These experiments will illustrate the point at which the approximation deviates and will also demonstrate the estimator’s behavior when the sampling fraction M/K becomes non-negligible. We believe this addition will make the limitations of the independence claim transparent. revision: yes
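
A small-graph coverage experiment of the kind described in this response could be sketched as follows. This is an illustrative assumption, not the authors' code: the function `coverage` and its parameters are hypothetical, the graph is a single Erdős–Rényi draw, pairs are sampled without replacement, and the interval is a Wald interval with or without the finite-population correction.

```python
import itertools
import math
import random


def coverage(num_nodes, edge_prob, num_samples, trials=2000, z=1.96,
             use_fpc=True, seed=0):
    """Fraction of trials whose interval contains the true segmentedness."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(num_nodes), 2))
    k = len(pairs)
    if k < 2 or not 1 <= num_samples <= k:
        raise ValueError("need 1 <= num_samples <= K and K >= 2")
    # One fixed Erdos-Renyi policy graph: a pair is blocked when no edge is drawn.
    blocked = {pair: rng.random() >= edge_prob for pair in pairs}
    truth = sum(blocked.values()) / k
    factor = (k - num_samples) / (k - 1) if use_fpc else 1.0
    hits = 0
    for _ in range(trials):
        sample = rng.sample(pairs, num_samples)  # distinct pairs, no replacement
        s_hat = sum(blocked[pair] for pair in sample) / num_samples
        half = z * math.sqrt(s_hat * (1.0 - s_hat) / num_samples * factor)
        if s_hat - half <= truth <= s_hat + half:
            hits += 1
    return hits / trials


# V = 10 is omitted here because it has only 45 pairs, fewer than M = 97.
for v in (20, 50):
    with_fpc = coverage(v, edge_prob=0.3, num_samples=97, use_fpc=True)
    without_fpc = coverage(v, edge_prob=0.3, num_samples=97, use_fpc=False)
    print(f"V={v}: coverage with FPC {with_fpc:.3f}, without FPC {without_fpc:.3f}")
```

Both calls share a seed, so they see the same graph and the same samples; the uncorrected interval is never narrower, which means its coverage can only be equal or higher (that is, more conservative).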

Circularity Check

0 steps flagged

No significant circularity; derivation uses direct definition and external statistical formulas

full rationale

The paper defines segmentedness directly as the complement of edge density (fraction of disallowed node-pair communications), a straightforward observable graph property with no self-reference or fitted parameters. The estimator is the sample proportion of disallowed pairs under uniform random sampling, and the M=97 sample size for ±0.1 margin at 95% CI is obtained from the standard formula n=(Z²p(1-p))/E² with Z=1.96 and p=0.5. This is an external statistical result, not a paper-internal reduction or self-citation. No load-bearing steps reduce to inputs by construction, no uniqueness theorems are imported from prior author work, and no ansatz is smuggled via citation. The claim of independence from the total number of nodes V holds under the large-population approximation explicitly used; finite-population effects for small V are a separate applicability limitation, not circularity. The derivation is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claim rests on representing policies as static graphs and assuming uniform random sampling is practical; these are standard domain assumptions rather than new inventions or fitted values.

axioms (2)
  • domain assumption: Network security policies can be represented as an undirected graph with edges for allowed node-pair communications.
    Directly invoked to equate segmentedness with the complement of edge density.
  • domain assumption: Node pairs can be sampled uniformly at random from the complete set of possible pairs.
    Required for the estimator to be unbiased and for the sample-size formula to guarantee the stated margin of error.


