arxiv: 2606.07450 · v1 · pith:Y7X7T25Mnew · submitted 2026-06-05 · 💻 cs.SI · q-fin.PM · q-fin.ST

Information Networks of Stock Prices

Pith reviewed 2026-06-27 20:08 UTC · model grok-4.3

classification 💻 cs.SI · q-fin.PM · q-fin.ST
keywords stock price networks · mutual information · minimum spanning tree · planar maximally filtered graph · community detection · sectoral taxonomy · Indonesian capital market · non-linear dependencies

The pith

Pearson correlation with MST and Infomap best recovers official stock sectors while mutual information with PMFG better exposes hidden cross-sector communities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares 24 configurations of dependency measures, graph filters, and community detectors applied to Indonesian stock prices over 2,328 rolling windows from 2015 to 2025. It finds that Pearson correlation combined with minimum spanning trees and Infomap community detection most reliably reconstructs the market's conventional sectoral taxonomy. Planar maximally filtered graphs instead prove better at surfacing local structures and heterogeneous communities that mix sectors. Mutual information using adaptive binning detects residual non-linear dependencies more effectively than k-nearest-neighbor versions. The work positions these methods as serving different goals rather than competing for a single best result.
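A minimal sketch of how a rolling-window design like the one described above can be indexed. The window length and step are not stated on this page, so the sizes below are hypothetical, and `rolling_windows` is an illustrative helper rather than the authors' code.

```python
from typing import Iterator, Tuple


def rolling_windows(n_obs: int, window: int, step: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) index pairs, end exclusive, for every full window."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    start = 0
    while start + window <= n_obs:
        yield start, start + window
        start += step


# Hypothetical sizes: 10 observations, windows of 4, shifted by 2 each time.
windows = list(rolling_windows(n_obs=10, window=4, step=2))
print(windows)  # [(0, 4), (2, 6), (4, 8), (6, 10)]
```

Each pair can be used to slice a return series, so that every configuration is re-estimated on the same sequence of windows.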

Core claim

Across 2,328 rolling observation windows, the Pearson-MST-Infomap configuration is the most robust at recovering the conventional sectoral taxonomy from stock price correlations. The looser topology of PMFG graphs performs better when the task is to expose local structures and heterogeneous, mixed-sector communities. For detecting residual non-linear dependence, MI with adaptive binning behaves better than MI-kNN. Combined, MI and PMFG supply an analytical lens for hidden economic sub-structures, such as cohesive commodity regimes, that cut across formal sector boundaries.
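The Pearson-plus-MST step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the paper's implementation: the distance d = sqrt(2(1 - rho)) is a common choice in correlation-network studies but is not confirmed here as the one the paper uses, the tickers and returns are invented, and the Infomap step is omitted.

```python
import math
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Sample Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def correlation_mst(returns: Dict[str, List[float]]) -> List[Tuple[str, str, float]]:
    """Kruskal's algorithm on distances d = sqrt(2 * (1 - rho)) (assumed metric)."""
    names = sorted(returns)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = pearson(returns[a], returns[b])
            edges.append((math.sqrt(max(0.0, 2.0 * (1.0 - rho))), a, b))
    edges.sort()  # shortest distance (highest correlation) first

    parent = {name: name for name in names}

    def find(u: str) -> str:
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for d, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # keep the edge only if it joins two separate components
            parent[ra] = rb
            tree.append((a, b, d))
    return tree


# Invented toy returns: AAA/BBB move together, as do CCC/DDD.
returns = {
    "AAA": [0.010, -0.020, 0.015, 0.000, -0.010, 0.020],
    "BBB": [0.012, -0.018, 0.013, 0.002, -0.011, 0.019],
    "CCC": [-0.010, 0.020, -0.005, 0.010, 0.000, -0.015],
    "DDD": [-0.012, 0.021, -0.004, 0.009, 0.001, -0.014],
}
tree = correlation_mst(returns)
for a, b, d in tree:
    print(a, b, round(d, 3))
```

With N stocks the tree keeps exactly N - 1 edges, which is why an MST yields a sparse, hierarchy-like backbone, while a PMFG retains more links.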

What carries the argument

The 24 methodological configurations that combine three dependency estimators (Pearson, MI adaptive binning, MI-kNN), two graph filtering schemes (MST and PMFG), and four community decoders applied to rolling windows of stock price data.
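The count of 24 follows from the full cross product 3 × 2 × 4. A small sketch of that grid is shown here; only Infomap is named on this page, so the other three decoder labels are placeholders, and all string identifiers are illustrative.

```python
from itertools import product

estimators = ["pearson", "mi_adaptive_binning", "mi_knn"]
filters = ["mst", "pmfg"]
# Only Infomap is named in the text; the remaining labels are placeholders.
decoders = ["infomap", "decoder_b", "decoder_c", "decoder_d"]

configurations = list(product(estimators, filters, decoders))
print(len(configurations))  # 24
```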

If this is right

  • Pearson-MST-Infomap networks will align closely with the market's official industry sectors.
  • PMFG networks will surface mixed-sector communities and local structures not captured by MST.
  • MI adaptive binning will retain more traces of non-linear price dependencies than kNN approaches.
  • Commodity regimes will appear as cohesive communities that cross formal sectoral lines.
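To illustrate why a histogram-based MI estimate can retain non-linear dependence that Pearson correlation misses, here is a minimal sketch. It uses simple equal-frequency (rank-based) bins, which is an assumption made for brevity and not necessarily the adaptive partitioning scheme used in the paper; the data are synthetic.

```python
import math
from collections import Counter
from typing import List


def quantile_bins(values: List[float], n_bins: int) -> List[int]:
    """Assign each value to one of n_bins equal-frequency bins by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def binned_mi(x: List[float], y: List[float], n_bins: int = 4) -> float:
    """Plug-in mutual information (in nats) from a 2-D histogram."""
    bx, by = quantile_bins(x, n_bins), quantile_bins(y, n_bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # sum over occupied cells of p(i,j) * log(p(i,j) / (p(i) * p(j)))
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def pearson(x: List[float], y: List[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Synthetic example: y depends on x deterministically, but not linearly.
x = [-1.0 + 2.0 * k / 199 for k in range(200)]
y = [v * v for v in x]

rho = pearson(x, y)   # close to zero, by symmetry
mi = binned_mi(x, y)  # clearly positive
print(abs(rho) < 1e-9, round(mi, 3))
```

The number of bins plays the role of the regularization knob: fewer bins suppress noise but blur fine structure, which is one reason the binning parameters appear as free parameters in any such comparison.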

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Analysts could select the method according to whether the priority is matching regulatory categories or detecting unexpected economic linkages.
  • The same configuration tests applied to other national markets might reveal analogous differences between linear and non-linear signals.
  • Non-linear dependencies preserved by MI may often trace to commodity price cycles that ignore conventional sector definitions.

Load-bearing premise

Conventional sectoral taxonomy serves as an appropriate external ground truth for judging the quality of price-based network recovery.

What would settle it

A test on the same Indonesian data or a comparable market showing that MI-PMFG recovers official sectoral labels more accurately than Pearson-MST would falsify the reported ranking of configurations.
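Such a test needs a score for how well detected communities match the official sector labels. The sketch below uses the adjusted Rand index as one possible agreement measure; this page does not say which measure the paper uses, so the choice is an assumption, and the sector names and community labels are invented.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Chance-corrected pair-counting agreement between two partitions."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:  # both partitions trivial and identical
        return 1.0
    return (pairs_joint - expected) / (max_index - expected)


# Invented labels for six stocks.
sectors = ["energy", "energy", "banking", "banking", "consumer", "consumer"]
exact = [0, 0, 1, 1, 2, 2]   # communities that reproduce the sectors
mixed = [0, 0, 0, 1, 2, 2]   # one banking stock absorbed by the energy group

print(adjusted_rand_index(sectors, exact))            # 1.0
print(round(adjusted_rand_index(sectors, mixed), 3))  # 0.444
```

Computing the same score per window for Pearson-MST and MI-PMFG partitions would give the paired comparison that the falsification test calls for.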

Original abstract

The collective movement of stock prices harbors complex interdependencies that are conventionally simplified only through a linear lens. This paper explores computed structural network representations in the Indonesian capital market by testing the limits of Pearson correlation and Mutual Information (MI) in unveiling the spectral dynamics of the market. Across 2,328 rolling observation windows from 2015 to 2025, we examine 24 methodological configurations that combine three dependency estimators (Pearson, MI adaptive binning, and MI-kNN), two graph filtering schemes (Minimum Spanning Tree/MST and Planar Maximally Filtered Graph/PMFG), and four community decoders. The empirical results unveil a fundamental reality: topological richness does not always resonate with sectoral classification precision. The Pearson, MST, and Infomap configuration is shown to remain the most robust foundation for recovering conventional sectoral taxonomy. Nevertheless, when deeper observation demands the exposition of local structures and the weave of heterogeneous communities, the architectural relaxation through PMFG demonstrates its superiority. In the realm of residual information detection, MI adaptive binning appears far more proportional than kNN; histogram-based regularization successfully tames empirical noise without sweeping away traces of non-linear dependency. Ultimately, the synergy of MI and PMFG is not positioned to dethrone the dominance of linear correlation, but rather to provide an essential analytical lens for excavating hidden economic sub-structures -- such as the cohesion of commodity regimes -- that have long transcended the rigid boundaries of the market's formal sectors.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper examines 24 methodological configurations combining three dependency estimators (Pearson correlation, MI with adaptive binning, MI-kNN), two graph filters (MST and PMFG), and four community decoders on 2,328 rolling observation windows of Indonesian stock prices from 2015 to 2025. It claims that the Pearson+MST+Infomap configuration is the most robust for recovering conventional sectoral taxonomy, while PMFG better exposes local structures and heterogeneous communities, and MI adaptive binning is superior for detecting residual non-linear dependencies such as commodity regimes.

Significance. If the comparative rankings hold under proper statistical controls, the study supplies practical guidance for choosing between linear and information-theoretic network methods in financial data, documenting explicit trade-offs between alignment with official sector labels and recovery of cross-sectoral or non-linear structures. The scale of the rolling-window design (over 2,300 windows) is a strength that could support reproducible empirical claims once quantitative support is added.

major comments (3)
  1. [Abstract] The headline claim that 'the Pearson, MST, and Infomap configuration is shown to remain the most robust foundation for recovering conventional sectoral taxonomy' is load-bearing for the paper's central recommendation, yet the same paragraph states that MI+PMFG 'better expose commodity regimes and heterogeneous communities that transcend the rigid boundaries' of sectors. This internal tension makes the chosen ground-truth metric (match to official taxonomy) potentially circular and risks ranking the methods according to how well they reproduce the very linear, tree-like structure the paper elsewhere criticizes.
  2. [Abstract] No statistical significance tests, error bars, or multiple-comparison corrections are reported for the community-recovery metrics despite 24 configurations evaluated over 2,328 windows; the superiority statements therefore rest on qualitative description rather than quantified evidence.
  3. [Abstract] The evaluation treats conventional sectoral taxonomy as an unproblematic external ground truth and assumes the 2015-2025 windows contain no unaccounted regime shifts that could systematically favor one estimator/filter combination; neither assumption is justified or tested with alternative labels (e.g., commodity groupings or earnings co-movement).
minor comments (1)
  1. [Abstract] The four community decoders are mentioned but not named, preventing readers from assessing which methods were actually compared.
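Major comment 2 asks for multiple-comparison corrections when many configurations are compared. As one possible illustration, the sketch below applies Holm's step-down procedure to a list of p-values. The choice of Holm's method and the p-values themselves are assumptions for the example; neither the report nor the paper specifies a correction.

```python
from typing import List


def holm_bonferroni(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return a reject (True) / retain (False) decision per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is tested at alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


# Hypothetical p-values from four pairwise configuration comparisons.
p_values = [0.001, 0.04, 0.03, 0.005]
print(holm_bonferroni(p_values))  # [True, False, False, True]
```

With 24 configurations the number of pairwise comparisons grows quickly, so an uncorrected 0.05 threshold would be expected to flag some differences by chance alone.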

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We appreciate the referee's insightful comments, which highlight important aspects of our abstract and methodology. We provide point-by-point responses below and will make revisions where appropriate to address the concerns.

Point-by-point responses
  1. Referee: [Abstract] The headline claim that 'the Pearson, MST, and Infomap configuration is shown to remain the most robust foundation for recovering conventional sectoral taxonomy' is load-bearing for the paper's central recommendation, yet the same paragraph states that MI+PMFG 'better expose commodity regimes and heterogeneous communities that transcend the rigid boundaries' of sectors. This internal tension makes the chosen ground-truth metric (match to official taxonomy) potentially circular and risks ranking the methods according to how well they reproduce the very linear, tree-like structure the paper elsewhere criticizes.

    Authors: The abstract is designed to present both the strength of the Pearson-MST-Infomap configuration for taxonomy recovery and the advantages of PMFG and MI for revealing additional structures. This juxtaposition illustrates the paper's main contribution regarding method trade-offs rather than creating circularity. The taxonomy is used as a standard benchmark, and the paper discusses its limitations. We will revise the abstract to more clearly articulate that the evaluation uses multiple criteria and that the taxonomy match is one dimension of performance. revision: yes

  2. Referee: [Abstract] No statistical significance tests, error bars, or multiple-comparison corrections are reported for the community-recovery metrics despite 24 configurations evaluated over 2,328 windows; the superiority statements therefore rest on qualitative description rather than quantified evidence.

    Authors: We concur that the abstract does not include statistical tests. While the manuscript body presents results averaged over the large number of windows, we did not report significance tests or error bars. In the revision, we will add these elements, including error bars on the metrics and appropriate statistical tests with corrections for multiple comparisons across the 24 configurations. revision: yes

  3. Referee: [Abstract] The evaluation treats conventional sectoral taxonomy as an unproblematic external ground truth and assumes the 2015-2025 windows contain no unaccounted regime shifts that could systematically favor one estimator/filter combination; neither assumption is justified or tested with alternative labels (e.g., commodity groupings or earnings co-movement).

    Authors: The choice of official sectoral taxonomy follows common practice in the field to allow for direct comparison with previous studies on financial networks. The rolling window design helps account for temporal changes. Nevertheless, we did not conduct explicit tests for regime shifts or evaluate against alternative labels. We will expand the methods and discussion sections to justify the assumptions more thoroughly and include, to the extent possible, analyses with alternative groupings such as commodity sectors. revision: partial

Circularity Check

0 steps flagged

No significant circularity; the empirical comparisons rest on an external sectoral benchmark.

full rationale

The paper conducts direct empirical tests across 24 configurations on 2,328 rolling windows of 2015-2025 Indonesian market data, measuring recovery of conventional sectoral taxonomy as an external ground truth. No derivation step reduces reported rankings to a fitted parameter, self-referential definition, or self-citation chain; the distinction between MST/Infomap for sectoral precision versus PMFG for heterogeneous communities is stated explicitly rather than derived tautologically. The analysis is self-contained against the chosen external labels.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The evaluation framework rests on the domain assumption that sectoral labels are meaningful benchmarks and on free parameters internal to the mutual-information estimators; no new entities are postulated.

free parameters (2)
  • adaptive binning parameters for MI
    Histogram construction choices in the adaptive-binning MI estimator are data-dependent and affect the retained non-linear signal.
  • k in MI-kNN estimator
    Neighbor count in the k-nearest-neighbor MI estimator is a tunable parameter that influences noise suppression versus signal retention.
axioms (1)
  • domain assumption: Conventional sectoral taxonomy is a valid external ground truth for network quality
    All comparative claims about robustness are judged against recovery of these labels.

pith-pipeline@v0.9.1-grok · 5794 in / 1426 out tokens · 40530 ms · 2026-06-27T20:08:39.544722+00:00 · methodology

