arXiv: 2605.27925 · v1 · pith:7G5P3ZTLnew · submitted 2026-05-27 · cond-mat.stat-mech · math.PR · physics.data-an · stat.CO · stat.ME

Finite-size occupancy scaling of apparent fractal dimensions in stochastic trajectories

Pith reviewed 2026-06-29 10:13 UTC · model grok-4.3

classification cond-mat.stat-mech · math.PR · physics.data-an · stat.CO · stat.ME
keywords fractal dimension · finite-size scaling · box-counting · stochastic trajectories · occupancy model · bias correction · random walks · Levy flights

The pith

Finite stochastic trajectories produce biased apparent fractal dimensions that an inverted balls-in-boxes occupancy model corrects across processes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the apparent box-counting dimension extracted from a finite stochastic trajectory deviates from the true dimension of the limiting process because of an occupancy crossover between resolved scales and the finite number of sampled points. This crossover is captured by a balls-in-boxes occupancy law that predicts the full box-count curve, the saturation scale, and a scaling function for the normalized local slope. Data from random walks, fractional Brownian graphs, and Levy flights collapse onto a single curve under this description. Inverting the occupancy law supplies a bias correction that lowers error on controlled trajectories and transfers to held-out model classes, with a DNA-walk example showing the workflow on measured data.
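The saturation described above can be seen in a toy setting. The following is a minimal sketch, not the paper's code: it assumes a unit-step lattice random walk in the plane and dyadic box sizes, and the names `random_walk_2d`, `box_count`, and `local_slopes` are hypothetical. Because the trace has finitely many sampled points, the box count can never exceed the number of points, so the local slope falls to zero once the boxes are smaller than the sampling resolution.

```python
import math
import random

def random_walk_2d(n_points, seed=0):
    """Return n_points positions of a unit-step lattice random walk in the plane."""
    rng = random.Random(seed)
    x, y = 0, 0
    path = [(x, y)]
    for _ in range(n_points - 1):
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

def box_count(points, eps):
    """Number of distinct boxes of side eps that contain at least one sampled point."""
    return len({(math.floor(px / eps), math.floor(py / eps)) for px, py in points})

def local_slopes(points, scales):
    """Box counts per scale and finite-difference slopes of ln(count) vs ln(1/eps)."""
    counts = [box_count(points, eps) for eps in scales]
    slopes = []
    for i in range(len(scales) - 1):
        d_log_count = math.log(counts[i + 1]) - math.log(counts[i])
        d_log_scale = math.log(1.0 / scales[i + 1]) - math.log(1.0 / scales[i])
        slopes.append(d_log_count / d_log_scale)
    return counts, slopes

path = random_walk_2d(4096, seed=1)
scales = [64, 32, 16, 8, 4, 2, 1, 0.5, 0.25]  # coarse to fine, dyadic so the grids nest
counts, slopes = local_slopes(path, scales)
for eps, count in zip(scales, counts):
    print(f"eps={eps:<5} boxes={count}")
print("local slopes:", [round(s, 3) for s in slopes])
```

At the finest scales the count equals the number of distinct visited sites, so any regression window that reaches into that range underestimates the slope.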

Core claim

Estimating a fractal dimension from a finite stochastic trajectory is a finite-size scaling problem: the apparent box-counting exponent is shaped by an occupancy crossover between the resolved range of scales and the finite number of sampled points, and need not equal the dimension of the limiting process. We model this crossover with a balls-in-boxes occupancy law, which predicts the box-count curve, the finite-size saturation scale, and a scaling function for the normalized local slope. Across random-walk traces, fractional Brownian graphs, and Levy flights, the normalized local slope collapses onto a single crossover curve, while the windowed box-counting bias collapses when the regression window is positioned relative to the saturation scale.

What carries the argument

The balls-in-boxes occupancy law, which predicts how trajectory points occupy boxes at different scales and thereby governs the crossover from the resolved regime to the saturated regime.
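For readers unfamiliar with the model, the textbook equiprobable form of the occupancy law is sketched below. This summary does not show the paper's own parametrization, so the symbols are assumptions: N is the number of sampled points, M(ε) is the number of boxes of side ε available to the limiting set, D is its dimension, and x = N/M is the mean number of points per box.

```latex
\begin{align}
  \mathbb{E}[N_{\mathrm{occ}}]
    &= M\left[1 - \left(1 - \tfrac{1}{M}\right)^{N}\right]
     \approx M\left(1 - e^{-N/M}\right),
     \qquad M = M(\varepsilon) \sim \varepsilon^{-D}, \\
  s(x)
    &\equiv \frac{d \ln \mathbb{E}[N_{\mathrm{occ}}]}{d \ln M}
     = 1 - \frac{x}{e^{x} - 1},
     \qquad x = \frac{N}{M}.
\end{align}
```

In this form s(x) tends to 1 when x is large (every available box is hit, so the count tracks M) and behaves like x/2 when x is small (the count saturates near N). The crossover sits near x of order one, which is one natural reading of the saturation scale.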

If this is right

  • The normalized local slope collapses onto a single crossover curve for random walks, fractional Brownian graphs, and Levy flights.
  • Windowed box-counting bias collapses when the regression window is positioned relative to the saturation scale.
  • The bias correction reduces error on controlled stochastic trajectories and transfers across held-out model classes.
  • Local-slope stability alone is not a reliable diagnostic of the true dimension.
  • The dominant bias is specific to point-sampled box-counting over finite scale windows.
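The collapse in the first bullet has a simple analogue in the textbook occupancy model. The sketch below assumes the equiprobable Poisson-limit form E = M(1 - exp(-N/M)), which may differ from the paper's exact law; the function names are hypothetical. It checks numerically that the slope of ln E against ln M depends only on x = N/M, so curves for different N fall on one function.

```python
import math

def poisson_occupied(n_points, n_boxes):
    """Large-M form of the occupancy law: E = M * (1 - exp(-N / M))."""
    return -n_boxes * math.expm1(-n_points / n_boxes)

def normalized_slope(x):
    """Closed form of d ln E / d ln M at x = N / M: 1 - x / (e^x - 1)."""
    if x < 1e-8:   # series limit, avoids 0/0 as x -> 0
        return x / 2.0
    if x > 700.0:  # expm1 would overflow; the slope is 1 to machine precision
        return 1.0
    return 1.0 - x / math.expm1(x)

def numerical_slope(n_points, n_boxes, h=1e-4):
    """Central difference of ln E with respect to ln M."""
    up = math.log(poisson_occupied(n_points, n_boxes * math.exp(h)))
    down = math.log(poisson_occupied(n_points, n_boxes * math.exp(-h)))
    return (up - down) / (2.0 * h)

# Two very different sample sizes give the same slope at equal points-per-box x.
for n_points in (1_000, 100_000):
    row = [round(numerical_slope(n_points, n_points / x), 4) for x in (0.1, 1.0, 5.0)]
    print(n_points, row)
```

Real trajectories do not fill boxes uniformly, so this only illustrates why a single scaling variable can organize the crossover; it does not reproduce the paper's collapse.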

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The occupancy correction could be tested on experimental trajectories beyond the DNA-walk case to check whether the same saturation positioning works on real measurements.
  • If the collapse onto a universal curve persists for additional processes, the model would set a minimum sampling density needed for reliable dimension estimates.
  • The approach might be adapted to adjust other finite-sample estimators if they share analogous occupancy crossovers.
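The second extension, a minimum sampling density, can be made concrete under the same textbook assumption. The sketch below is hypothetical and is not a result from the paper: it assumes the equiprobable slope function s(x) = 1 - x/(e^x - 1) and asks how many points per box are needed before the local slope is within a chosen tolerance of its unsaturated value.

```python
import math

def normalized_slope(x):
    """Equiprobable occupancy slope at x = N / M points per box."""
    if x < 1e-8:
        return x / 2.0
    return 1.0 - x / math.expm1(x)

def min_points_per_box(tolerance, upper=100.0, iterations=200):
    """Smallest x with 1 - s(x) <= tolerance, found by bisection (s is increasing in x)."""
    lo, hi = 1e-9, upper
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if 1.0 - normalized_slope(mid) > tolerance:
            lo = mid   # still too saturated, need more points per box
        else:
            hi = mid
    return hi

for tol in (0.10, 0.05, 0.01):
    print(f"tolerance={tol:.2f}  points per box >= {min_points_per_box(tol):.2f}")
```

Multiplying the returned x by the number of boxes at the finest scale in the regression window gives a rough point budget. Nonuniform visitation in real trajectories would shift these thresholds.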

Load-bearing premise

The balls-in-boxes occupancy law accurately captures the crossover between resolved scales and the finite number of sampled points for the stochastic processes examined.
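The premise can at least be sanity-checked in the idealized case it comes from. The following sketch assumes independent points thrown uniformly into equiprobable boxes, which is exactly the assumption a correlated trajectory may violate; `expected_occupied` and `simulate_occupied` are illustrative names. It compares the closed-form mean number of occupied boxes with a seeded Monte Carlo estimate.

```python
import random

def expected_occupied(n_points, n_boxes):
    """Exact mean number of non-empty boxes for uniform, independent placement."""
    return n_boxes * (1.0 - (1.0 - 1.0 / n_boxes) ** n_points)

def simulate_occupied(n_points, n_boxes, trials=200, seed=0):
    """Monte Carlo mean of the number of distinct boxes hit by n_points throws."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += len({rng.randrange(n_boxes) for _ in range(n_points)})
    return total / trials

for n_points, n_boxes in [(100, 1000), (500, 1000), (5000, 1000)]:
    exact = expected_occupied(n_points, n_boxes)
    sim = simulate_occupied(n_points, n_boxes)
    print(f"N={n_points:<5} M={n_boxes}  exact={exact:.1f}  simulated={sim:.1f}")
```

Agreement here says nothing about trajectories by itself. The question the premise raises is how far the same curve survives once points are correlated in time and boxes are visited with unequal probability.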

What would settle it

Applying the inverted occupancy correction to a new class of stochastic trajectories and finding that it does not reduce estimation error relative to uncorrected box-counting would falsify the transferability of the bias correction.

Figures

Figures reproduced from arXiv: 2605.27925 by Bon A. Koo (University of Pennsylvania), Edward Ju (California Institute of Technology).

Figure 1: Measured box-count scaling (markers) and the occupancy model ( …
Figure 2: Scaling collapse of the box-counting bias against …
Figure 3: Cross-family data collapse. Normalized local slope …
Figure 4: Bias correction. (a) RMSE of the windowed box-counting estimate versus the occupancy …
Figure 5: Estimator comparison. Left: walk traces (true dimension …
Figure 6: Empirical application: the E. coli DNA walk (RefSeq NC_000913.3) versus finite-size null surrogates. (A) The walk landscape. (B) Box-counting graph dimension versus analyzed length N for the real walk and the mononucleotide and dinucleotide nulls (bands: empirical 2.5–97.5% null quantiles); all drift with N. (C) Gap between the real walk and the mononucleotide null for box-counting, DFA, and the variogram …
Original abstract

Estimating a fractal dimension from a finite stochastic trajectory is a finite-size scaling problem: the apparent box-counting exponent is shaped by an occupancy crossover between the resolved range of scales and the finite number of sampled points, and need not equal the dimension of the limiting process. We model this crossover with a balls-in-boxes occupancy law, which predicts the box-count curve, the finite-size saturation scale, and a scaling function for the normalized local slope. Across random-walk traces, fractional Brownian graphs, and Levy flights, the normalized local slope collapses onto a single crossover curve, while the windowed box-counting bias collapses when the regression window is positioned relative to the saturation scale. Inverting the occupancy model gives a finite-size bias correction that reduces error on controlled stochastic trajectories and transfers across held-out model classes. Comparisons with correlation dimension, detrended fluctuation analysis, the variogram, and Higuchi's method show that the dominant bias is specific to point-sampled box-counting over finite scale windows, and that local-slope stability alone is not a reliable diagnostic. A DNA-walk example illustrates the workflow on measured data, and all figures, tables, and in-text numbers are regenerated from released single-seed code.
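The inversion step in the abstract can be illustrated on noise-free synthetic counts. This is a sketch under stated assumptions and not the released code: it uses the textbook Poisson-limit law E = M(1 - exp(-N/M)), a hypothetical power law M = eps^(-D) with D = 1.5, and illustrative helper names. Each observed count is mapped back to an effective box number M by bisection, and the dimension is then fitted to the corrected values.

```python
import math

def poisson_occupied(n_points, n_boxes):
    """Assumed occupancy law: E = M * (1 - exp(-N / M))."""
    return -n_boxes * math.expm1(-n_points / n_boxes)

def invert_occupancy(observed, n_points, iterations=200):
    """Solve observed = E(M) for M by geometric bisection (E is increasing in M)."""
    if not 0.0 < observed < n_points:
        raise ValueError("observed count must lie strictly between 0 and n_points")
    lo, hi = 0.5 * observed, float(observed)  # E(M) <= M, so the root is >= observed
    while poisson_occupied(n_points, hi) < observed:
        hi *= 2.0
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if poisson_occupied(n_points, mid) < observed:
            lo = mid
        else:
            hi = mid
    return hi

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

true_dim, n_points = 1.5, 2000                      # synthetic values for illustration
scales = [2.0 ** -k for k in range(2, 11)]
log_inv_eps = [math.log(1.0 / eps) for eps in scales]
observed = [poisson_occupied(n_points, eps ** -true_dim) for eps in scales]

raw_dim = ols_slope(log_inv_eps, [math.log(c) for c in observed])
corrected_dim = ols_slope(
    log_inv_eps, [math.log(invert_occupancy(c, n_points)) for c in observed]
)
print(f"raw={raw_dim:.3f}  corrected={corrected_dim:.3f}  true={true_dim}")
```

With exact synthetic counts the corrected fit recovers the input dimension by construction. On real data the counts are noisy and the inversion becomes ill-conditioned close to saturation, which is where an empirical test of the correction matters.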

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper claims that finite-size effects in box-counting fractal dimension estimates on stochastic trajectories arise from an occupancy crossover between resolved scales and finite sample points, which can be modeled by a balls-in-boxes occupancy law. This law predicts the box-count curve, saturation scale, and a scaling function for the normalized local slope. The normalized local slope collapses onto a single curve across random walks, fBm graphs, and Lévy flights; windowed bias collapses when the regression window is positioned relative to the saturation scale. Inverting the occupancy model yields a bias correction that reduces error on controlled trajectories and transfers to held-out model classes. Comparisons with correlation dimension, DFA, variogram, and Higuchi's method indicate the bias is specific to point-sampled box-counting, and local-slope stability is not a reliable diagnostic. A DNA-walk example is provided, with all results regenerated from released single-seed code.

Significance. If the occupancy law adequately captures the crossover for the examined processes, the work supplies a practical, invertible correction for finite-size bias in box-counting on trajectories, supported by explicit data collapse, transfer tests on independent simulation classes, and full reproducibility via released code. This addresses a common source of error in estimating dimensions from finite stochastic data in statistical mechanics, with the transferability across model classes and comparisons to other estimators strengthening its utility over ad-hoc window choices.

minor comments (2)
  1. The abstract and text refer to 'the occupancy law' without an explicit equation number or derivation sketch in the provided summary; adding a short inline statement of the balls-in-boxes formula (e.g., the expected occupancy as a function of scale and point count) would aid readers unfamiliar with the standard model.
  2. The DNA-walk example is mentioned but no figure or table reference is given in the abstract; ensure the workflow illustration is clearly labeled with the specific saturation scale used.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment and the recommendation to accept. The referee summary accurately captures the scope and results of the work.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation models the occupancy crossover with a pre-existing balls-in-boxes law (standard in the literature), inverts it to produce an explicit bias correction, and validates via data collapse plus transfer on held-out simulation classes regenerated from released code. No step reduces by construction to a fitted input, self-definition, or self-citation chain; the central result is benchmarked externally rather than forced internally.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on applying the pre-existing balls-in-boxes occupancy law as a domain model for point distributions in box-counting; no new free parameters, axioms beyond the standard law, or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: The balls-in-boxes occupancy law models the point distribution and crossover in box-counting of stochastic trajectories.
    Invoked to predict the box-count curve, saturation scale, and scaling function for the normalized local slope.



Reference graph

Works this paper leans on

28 extracted references

  1. V. Privman (Ed.), Finite Size Scaling and Numerical Simulation of Statistical Systems, World Scientific, 1990
  2. K. Falconer, Fractal Geometry: Mathematical Foundations and Applications, 2nd Edition, Wiley, 2003
  3. B. B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman, 1983
  4. J. Theiler, Estimating fractal dimension, Journal of the Optical Society of America A 7 (6) (1990) 1055–1073
  5. S. Buczkowski, P. Hildgen, L. Cartilier, Measurements of fractal dimension by box-counting: a critical analysis of data scatter, Physica A 252 (1–2) (1998) 23–34
  6. G. Gonzato, F. Mulargia, M. Ciccotti, Measuring the fractal dimensions of ideal and actual objects: implications for application in geology and geophysics, Geophysical Journal International 142 (1) (2000) 108–116
  7. P. Hall, A. Wood, On the performance of box-counting estimators of fractal dimension, Biometrika 80 (1) (1993) 246–251
  8. T. Gneiting, H. Ševčíková, D. B. Percival, Estimators of fractal dimension: Assessing the roughness of time series and spatial data, Statistical Science 27 (2) (2012) 247–277
  9. S. Tsurumi, H. Takayasu, The fractal dimension in computer-simulated random walks, Physics Letters A 113 (9) (1986) 449–450
  10. N. C. Kenkel, Sample size requirements for fractal dimension estimation, Community Ecology 14 (2) (2013) 144–152
  11. P. Grassberger, Finite sample corrections to entropy and dimension estimates, Physics Letters A 128 (6–7) (1988) 369–373
  12. S. Borgani, G. Murante, Box-counting clustering analysis: Corrections for finite sample effects, Physical Review E 49 (6) (1994) 4907–4917
  13. K. Pearson, The problem of the random walk, Nature 72 (1905) 294
  14. M. D. Donsker, An invariance principle for certain probability limit theorems, Mem. Amer. Math. Soc. 6 (1951)
  15. P. Billingsley, Convergence of Probability Measures, 2nd Edition, Wiley, 1999
  16. S. J. Taylor, The Hausdorff α-dimensional measure of Brownian paths in n-space, Proc. Cambridge Philos. Soc. 49 (1953) 31–39
  17. P. Mörters, Y. Peres, Brownian Motion, Cambridge University Press, 2010
  18. R. M. Blumenthal, R. K. Getoor, Some theorems on stable processes, Transactions of the American Mathematical Society 95 (2) (1960) 263–273
  19. W. E. Pruitt, S. J. Taylor, Sample path properties of processes with stable components, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 12 (1969) 267–289
  20. R. Metzler, J. Klafter, The random walk’s guide to anomalous diffusion: a fractional dynamics approach, Physics Reports 339 (1) (2000) 1–77
  21. J. M. Chambers, C. L. Mallows, B. W. Stuck, A method for simulating stable random variables, Journal of the American Statistical Association 71 (354) (1976) 340–344
  22. R. B. Davies, D. S. Harte, Tests for Hurst effect, Biometrika 74 (1) (1987) 95–101
  23. P. Grassberger, I. Procaccia, Measuring the strangeness of strange attractors, Physica D 9 (1–
  24. C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, A. L. Goldberger, Mosaic organization of DNA nucleotides, Physical Review E 49 (2) (1994) 1685–1689
  25. J. W. Kantelhardt, E. Koscielny-Bunde, H. H. A. Rego, S. Havlin, A. Bunde, Detecting long-range correlations with detrended fluctuation analysis, Physica A 295 (3–4) (2001) 441–454
  26. T. Higuchi, Approach to an irregular time series on the basis of the fractal theory, Physica D 31 (2) (1988) 277–283
  27. A. Eke, P. Hermán, J. B. Bassingthwaighte, G. M. Raymond, D. B. Percival, M. Cannon, I. Balla, C. Ikrényi, Physiological time series: distinguishing fractal noises from motions, Pflügers Archiv – European Journal of Physiology 439 (4) (2000) 403–415
  28. C.-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, H. E. Stanley, Long-range correlations in nucleotide sequences, Nature 356 (1992) 168–170

Supplementary Material

This appendix reports three robustness analyses referenced in the main text: an ablation separating the no-regression collapse from fitted variants (Section...