Finite-size occupancy scaling of apparent fractal dimensions in stochastic trajectories

Bon A. Koo (University of Pennsylvania); Edward Ju (California Institute of Technology)

arXiv: 2605.27925 · v1 · pith:7G5P3ZTLnew · submitted 2026-05-27 · cond-mat.stat-mech · math.PR · physics.data-an · stat.CO · stat.ME

Pith reviewed 2026-06-29 10:13 UTC · model grok-4.3

classification cond-mat.stat-mech · math.PR · physics.data-an · stat.CO · stat.ME

keywords fractal dimension · finite-size scaling · box-counting · stochastic trajectories · occupancy model · bias correction · random walks · Levy flights

The pith

Finite stochastic trajectories produce biased apparent fractal dimensions that an inverted balls-in-boxes occupancy model corrects across processes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the apparent box-counting dimension extracted from a finite stochastic trajectory deviates from the true dimension of the limiting process because of an occupancy crossover between resolved scales and the finite number of sampled points. This crossover is captured by a balls-in-boxes occupancy law that predicts the full box-count curve, the saturation scale, and a scaling function for the normalized local slope. Data from random walks, fractional Brownian graphs, and Levy flights collapse onto a single curve under this description. Inverting the occupancy law supplies a bias correction that lowers error on controlled trajectories and transfers to held-out model classes, with a DNA-walk example showing the workflow on measured data.
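
As a rough illustration of the crossover described above, the sketch below evaluates the textbook balls-in-boxes result: when N points fall independently and uniformly into M boxes, the expected number of occupied boxes is M[1 - (1 - 1/M)^N]. The uniform-occupancy form, the function name `expected_occupied`, and the choice M(eps) = eps^(-D) with D = 1.5 are assumptions made for this example; the paper's own occupancy law is not reproduced on this page and may differ.

```python
import numpy as np

def expected_occupied(n_points, n_boxes):
    """Expected number of non-empty boxes for n_points balls dropped
    independently and uniformly into n_boxes boxes (requires n_boxes > 1)."""
    n_boxes = np.asarray(n_boxes, dtype=float)
    # M * (1 - (1 - 1/M)**N), written with log1p/expm1 for numerical accuracy
    return -n_boxes * np.expm1(n_points * np.log1p(-1.0 / n_boxes))

# Hypothetical setup: a set of dimension D resolved by M(eps) = eps**(-D) boxes.
D, N = 1.5, 1000
eps = 2.0 ** -np.arange(1, 13)
boxes = eps ** (-D)
counts = expected_occupied(N, boxes)

for e, m, k in zip(eps, boxes, counts):
    # Coarse scales: counts track the number of boxes. Fine scales: counts
    # saturate near N because no points are left to occupy new boxes.
    print(f"eps={e:.5f}  boxes={m:12.1f}  expected occupied={k:8.1f}")
```

In this toy setting the printed counts follow the number of boxes at coarse scales and flatten toward N at fine scales, which is the kind of saturation that can depress an apparent box-counting slope.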

Core claim

What carries the argument

The balls-in-boxes occupancy law, which predicts how trajectory points occupy boxes at different scales and thereby governs the crossover from the resolved regime to the saturated regime.

If this is right

The normalized local slope collapses onto a single crossover curve for random walks, fractional Brownian graphs, and Levy flights.
Windowed box-counting bias collapses when the regression window is positioned relative to the saturation scale.
The bias correction reduces error on controlled stochastic trajectories and transfers across held-out model classes.
Local-slope stability alone is not a reliable diagnostic of the true dimension.
The dominant bias is specific to point-sampled box-counting over finite scale windows.
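
To make the objects in these statements concrete, here is a minimal sketch of point-sampled box-counting and its local slope on a simulated random-walk graph. It is not the authors' released code: the ±1 step walk, the dyadic scales, the grid anchored at the origin, the seed, and the helper name `box_count` are all assumptions chosen for illustration.

```python
import numpy as np

def box_count(points: np.ndarray, eps: float) -> int:
    """Number of distinct eps-sized grid boxes hit by the sampled points."""
    idx = np.floor(points / eps).astype(np.int64)
    return len(np.unique(idx, axis=0))

rng = np.random.default_rng(0)          # hypothetical seed
n = 4096                                # number of sampled points
x = np.cumsum(rng.choice([-1.0, 1.0], size=n))
t = np.arange(n) / n
x = (x - x.min()) / (x.max() - x.min())  # rescale the graph to the unit square
points = np.column_stack([t, x])

eps = 2.0 ** -np.arange(1, 15)           # dyadic box sizes from 1/2 to 2**-14
counts = np.array([box_count(points, e) for e in eps])

# Local slope of log N(eps) against log(1/eps) between neighbouring scales.
local_slope = np.diff(np.log(counts)) / np.diff(np.log(1.0 / eps))

for e, c in zip(eps, counts):
    print(f"eps=2^{int(np.log2(e)):3d}  boxes occupied={c}")
print("local slopes:", np.round(local_slope, 3))
```

Once eps drops to 1/n or below, every sample sits in its own box, so the count equals n and the local slope falls to zero. A regression window that reaches into that range will therefore report a dimension lower than the one seen at resolved scales.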

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The occupancy correction could be tested on experimental trajectories beyond the DNA-walk case to check whether the same saturation positioning works on real measurements.
If the collapse onto a universal curve persists for additional processes, the model would set a minimum sampling density needed for reliable dimension estimates.
The approach might be adapted to adjust other finite-sample estimators if they share analogous occupancy crossovers.

Load-bearing premise

The balls-in-boxes occupancy law accurately captures the crossover between resolved scales and the finite number of sampled points for the stochastic processes examined.

What would settle it

Applying the inverted occupancy correction to a new class of stochastic trajectories and finding that it does not reduce estimation error relative to uncorrected box-counting would falsify the transferability of the bias correction.
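
For readers who want a feel for what "inverting" an occupancy law involves, the sketch below inverts the textbook uniform balls-in-boxes formula numerically: given the number of points and an observed count of occupied boxes, it solves for the effective number of available boxes by bisection. This is only a stand-in under stated assumptions. The helper `invert_occupancy` is hypothetical, and the paper's actual correction for the dimension estimate is not described on this page.

```python
import math

def expected_occupied(n_points: int, n_boxes: float) -> float:
    """Uniform balls-in-boxes expectation M * (1 - (1 - 1/M)**N), for M > 1."""
    return -n_boxes * math.expm1(n_points * math.log1p(-1.0 / n_boxes))

def invert_occupancy(n_points: int, observed: float) -> float:
    """Solve expected_occupied(n_points, M) == observed for M by bisection.

    Hypothetical helper: assumes uniform occupancy and 1 < observed < n_points.
    """
    if not 1.0 < observed < n_points:
        raise ValueError("observed count must lie strictly between 1 and n_points")
    lo = observed                 # occupied boxes never exceed available boxes
    hi = 2.0 * observed
    while expected_occupied(n_points, hi) < observed:
        hi *= 2.0                 # expand until the root is bracketed
    for _ in range(200):          # the expectation is increasing in M
        mid = 0.5 * (lo + hi)
        if expected_occupied(n_points, mid) < observed:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip: simulate the forward law, then recover the box number.
observed = expected_occupied(1000, 500.0)
print(observed, invert_occupancy(1000, observed))
```

Bisection is used here because the expected occupancy increases monotonically with the number of boxes, so the root is unique once bracketed. A production implementation might prefer a library root finder.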

Figures

Figures reproduced from arXiv: 2605.27925 by Bon A. Koo (University of Pennsylvania), Edward Ju (California Institute of Technology).

**Figure 2.** Scaling collapse of the box-counting bias against …

**Figure 3.** Cross-family data collapse. Normalized local slope …

**Figure 4.** Bias correction. (a) RMSE of the windowed box-counting estimate versus the occupancy …

**Figure 5.** Estimator comparison. Left: walk traces (true dimension …

**Figure 6.** Empirical application: the E. coli DNA walk (RefSeq NC_000913.3) versus finite-size null surrogates. (A) The walk landscape. (B) Box-counting graph dimension versus analyzed length N for the real walk and the mononucleotide and dinucleotide nulls (bands: empirical 2.5–97.5% null quantiles); all drift with N. (C) Gap between the real walk and the mononucleotide null for box-counting, DFA, and the variogram …

Original abstract

Estimating a fractal dimension from a finite stochastic trajectory is a finite-size scaling problem: the apparent box-counting exponent is shaped by an occupancy crossover between the resolved range of scales and the finite number of sampled points, and need not equal the dimension of the limiting process. We model this crossover with a balls-in-boxes occupancy law, which predicts the box-count curve, the finite-size saturation scale, and a scaling function for the normalized local slope. Across random-walk traces, fractional Brownian graphs, and Levy flights, the normalized local slope collapses onto a single crossover curve, while the windowed box-counting bias collapses when the regression window is positioned relative to the saturation scale. Inverting the occupancy model gives a finite-size bias correction that reduces error on controlled stochastic trajectories and transfers across held-out model classes. Comparisons with correlation dimension, detrended fluctuation analysis, the variogram, and Higuchi's method show that the dominant bias is specific to point-sampled box-counting over finite scale windows, and that local-slope stability alone is not a reliable diagnostic. A DNA-walk example illustrates the workflow on measured data, and all figures, tables, and in-text numbers are regenerated from released single-seed code.
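
The dependence of a windowed estimate on where the regression window sits relative to the saturation scale can be sketched on a synthetic curve. The example below assumes the Poisson form of a uniform occupancy law, K = M(1 - exp(-N/M)) with M = eps^(-D), and defines the saturation scale as the eps at which M equals N. Both are assumptions for illustration, as are the values of `D_true` and `N` and the helper names; none of this is taken from the paper's code.

```python
import numpy as np

D_true, N = 1.5, 4096            # hypothetical dimension and sample size
eps_sat = N ** (-1.0 / D_true)   # scale where eps**(-D) equals N (assumed definition)

def model_counts(eps):
    """Synthetic box-count curve M * (1 - exp(-N / M)) with M = eps**(-D_true)."""
    boxes = eps ** (-D_true)
    return boxes * -np.expm1(-N / boxes)

def windowed_slope(center_ratio, width=8.0, n_scales=16):
    """Least-squares slope of log count against log(1/eps) over a window
    whose geometric centre sits at center_ratio * eps_sat."""
    lo = center_ratio * eps_sat / np.sqrt(width)
    hi = center_ratio * eps_sat * np.sqrt(width)
    eps = np.geomspace(lo, hi, n_scales)
    slope, _intercept = np.polyfit(np.log(1.0 / eps), np.log(model_counts(eps)), 1)
    return slope

for ratio in (0.25, 1.0, 4.0, 16.0):
    print(f"window centre = {ratio:5.2f} * eps_sat -> apparent D = {windowed_slope(ratio):.3f}")
```

In this toy model the apparent dimension depends on the window only through its position relative to `eps_sat`: windows well above the saturation scale recover a slope close to `D_true`, while windows at or below it report progressively smaller values.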

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper supplies a transferable bias correction for box-counting on finite trajectories by inverting the balls-in-boxes occupancy law, with supporting data collapse and cross-model tests.

The main thing to know is that this work gives a concrete bias correction for apparent fractal dimensions obtained by box-counting on finite stochastic trajectories. The authors model the occupancy crossover, invert the law to correct the bias, and show that the normalized local slope collapses onto one curve across random walks, fractional Brownian graphs, and Levy flights.

What is new is the specific use of the occupancy law to derive the correction and the explicit transfer tests on held-out model classes. The paper handles the empirical side cleanly: the collapse holds, the correction reduces error on controlled cases, comparisons to correlation dimension, DFA, variogram, and Higuchi isolate the issue to point-sampled box-counting over limited windows, and single-seed code lets anyone regenerate the figures and numbers.

Soft spots are limited. The occupancy law is a standard approximation that fits the tested processes, and the collapse plus transfer results back its adequacy here, but it is not claimed to cover every possible trajectory. Window placement relative to the saturation scale needs care in practice, though the paper positions the regression accordingly. No internal contradictions or circular fitting show up.

This is for researchers who estimate fractal dimensions from sampled paths in statistical mechanics or nearby fields and want to reduce finite-size artifacts. A reader working with trajectory data would pick up a usable method and some caution about other estimators.

I would send it for peer review. The evidence is empirical, the code is released, and the claim stays scoped to what the tests support.

Referee Report

0 major / 2 minor

Summary. The paper claims that finite-size effects in box-counting fractal dimension estimates on stochastic trajectories arise from an occupancy crossover between resolved scales and finite sample points, which can be modeled by a balls-in-boxes occupancy law. This law predicts the box-count curve, saturation scale, and a scaling function for the normalized local slope. The normalized local slope collapses onto a single curve across random walks, fBm graphs, and Lévy flights; windowed bias collapses when the regression window is positioned relative to the saturation scale. Inverting the occupancy model yields a bias correction that reduces error on controlled trajectories and transfers to held-out model classes. Comparisons with correlation dimension, DFA, variogram, and Higuchi's method indicate the bias is specific to point-sampled box-counting, and local-slope stability is not a reliable diagnostic. A DNA-walk example is provided, with all results regenerated from released single-seed code.

Significance. If the occupancy law adequately captures the crossover for the examined processes, the work supplies a practical, invertible correction for finite-size bias in box-counting on trajectories, supported by explicit data collapse, transfer tests on independent simulation classes, and full reproducibility via released code. This addresses a common source of error in estimating dimensions from finite stochastic data in statistical mechanics, with the transferability across model classes and comparisons to other estimators strengthening its utility over ad-hoc window choices.

minor comments (2)

The abstract and text refer to 'the occupancy law' without an explicit equation number or derivation sketch in the provided summary; adding a short inline statement of the balls-in-boxes formula (e.g., the expected occupancy as function of scale and point count) would aid readers unfamiliar with the standard model.
The DNA-walk example is mentioned but no figure or table reference is given in the abstract; ensure the workflow illustration is clearly labeled with the specific saturation scale used.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment and the recommendation to accept. The referee summary accurately captures the scope and results of the work.

Circularity Check

0 steps flagged

No significant circularity identified

The derivation models the occupancy crossover with a pre-existing balls-in-boxes law (standard in the literature), inverts it to produce an explicit bias correction, and validates via data collapse plus transfer on held-out simulation classes regenerated from released code. No step reduces by construction to a fitted input, self-definition, or self-citation chain; the central result is benchmarked externally rather than forced internally.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on applying the pre-existing balls-in-boxes occupancy law as a domain model for point distributions in box-counting; no new free parameters, axioms beyond the standard law, or invented entities are introduced in the abstract.

axioms (1)

domain assumption: The balls-in-boxes occupancy law models the point distribution and crossover in box-counting of stochastic trajectories.
Invoked to predict the box-count curve, saturation scale, and scaling function for the normalized local slope.
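
One way to see where such a scaling function can come from is the short derivation below. It rests on assumptions that are ours rather than the paper's: uniform occupancy, the Poisson limit $(1-1/M)^N \approx e^{-N/M}$, and a pure power law $M(\varepsilon)=\varepsilon^{-D}$ for the number of available boxes. The symbols $K$, $M$, $u$, and $g$ are introduced here for illustration only.

```latex
\begin{align}
  \mathbb{E}[K(\varepsilon)]
    &= M\left[1-\left(1-\tfrac{1}{M}\right)^{N}\right]
     \approx M\left(1-e^{-u}\right),
    \qquad u \equiv \frac{N}{M} = N\varepsilon^{D}, \\
  \frac{d\ln \mathbb{E}[K]}{d\ln(1/\varepsilon)}
    &= D - D\,\frac{u}{e^{u}-1}
    \quad\Longrightarrow\quad
  g(u) \equiv \frac{1}{D}\,\frac{d\ln \mathbb{E}[K]}{d\ln(1/\varepsilon)}
     = 1-\frac{u}{e^{u}-1}.
\end{align}
```

In this sketch $g \to 1$ for $u \gg 1$ (many points per box, scales resolved) and $g \to 0$ for $u \ll 1$ (saturation), with the crossover near $u \approx 1$, that is, $\varepsilon \approx N^{-1/D}$. Because $g$ depends on scale and sample size only through $u$, curves for different $N$ would fall on one another under these assumptions.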

pith-pipeline@v0.9.1-grok · 5762 in / 1267 out tokens · 36747 ms · 2026-06-29T10:13:31.217893+00:00 · methodology

Reference graph

Works this paper leans on

28 extracted references

[1] V. Privman (Ed.), Finite Size Scaling and Numerical Simulation of Statistical Systems, World Scientific, 1990

[2] K. Falconer, Fractal Geometry: Mathematical Foundations and Applications, 2nd Edition, Wiley, 2003

[3] B. B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman, 1983

[4] J. Theiler, Estimating fractal dimension, Journal of the Optical Society of America A 7 (6) (1990) 1055–1073

[5] S. Buczkowski, P. Hildgen, L. Cartilier, Measurements of fractal dimension by box-counting: a critical analysis of data scatter, Physica A 252 (1–2) (1998) 23–34

[6] G. Gonzato, F. Mulargia, M. Ciccotti, Measuring the fractal dimensions of ideal and actual objects: implications for application in geology and geophysics, Geophysical Journal International 142 (1) (2000) 108–116

[7] P. Hall, A. Wood, On the performance of box-counting estimators of fractal dimension, Biometrika 80 (1) (1993) 246–251

[8] T. Gneiting, H. Ševčíková, D. B. Percival, Estimators of fractal dimension: Assessing the roughness of time series and spatial data, Statistical Science 27 (2) (2012) 247–277

[9] S. Tsurumi, H. Takayasu, The fractal dimension in computer-simulated random walks, Physics Letters A 113 (9) (1986) 449–450

[10] N. C. Kenkel, Sample size requirements for fractal dimension estimation, Community Ecology 14 (2) (2013) 144–152

[11] P. Grassberger, Finite sample corrections to entropy and dimension estimates, Physics Letters A 128 (6–7) (1988) 369–373

[12] S. Borgani, G. Murante, Box-counting clustering analysis: Corrections for finite sample effects, Physical Review E 49 (6) (1994) 4907–4917

[13] K. Pearson, The problem of the random walk, Nature 72 (1905) 294

[14] M. D. Donsker, An invariance principle for certain probability limit theorems, Mem. Amer. Math. Soc. 6 (1951)

[15] P. Billingsley, Convergence of Probability Measures, 2nd Edition, Wiley, 1999

[16] S. J. Taylor, The Hausdorff α-dimensional measure of Brownian paths in n-space, Proc. Cambridge Philos. Soc. 49 (1953) 31–39

[17] P. Mörters, Y. Peres, Brownian Motion, Cambridge University Press, 2010

[18] R. M. Blumenthal, R. K. Getoor, Some theorems on stable processes, Transactions of the American Mathematical Society 95 (2) (1960) 263–273

[19] W. E. Pruitt, S. J. Taylor, Sample path properties of processes with stable components, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 12 (1969) 267–289

[20] R. Metzler, J. Klafter, The random walk’s guide to anomalous diffusion: a fractional dynamics approach, Physics Reports 339 (1) (2000) 1–77

[21] J. M. Chambers, C. L. Mallows, B. W. Stuck, A method for simulating stable random variables, Journal of the American Statistical Association 71 (354) (1976) 340–344

[22] R. B. Davies, D. S. Harte, Tests for Hurst effect, Biometrika 74 (1) (1987) 95–101

[23] P. Grassberger, I. Procaccia, Measuring the strangeness of strange attractors, Physica D 9 (1–

[24] C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, A. L. Goldberger, Mosaic organization of DNA nucleotides, Physical Review E 49 (2) (1994) 1685–1689

[25] J. W. Kantelhardt, E. Koscielny-Bunde, H. H. A. Rego, S. Havlin, A. Bunde, Detecting long-range correlations with detrended fluctuation analysis, Physica A 295 (3–4) (2001) 441–454

[26] T. Higuchi, Approach to an irregular time series on the basis of the fractal theory, Physica D 31 (2) (1988) 277–283

[27] A. Eke, P. Hermán, J. B. Bassingthwaighte, G. M. Raymond, D. B. Percival, M. Cannon, I. Balla, C. Ikrényi, Physiological time series: distinguishing fractal noises from motions, Pflügers Archiv – European Journal of Physiology 439 (4) (2000) 403–415

[28] C.-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, H. E. Stanley, Long-range correlations in nucleotide sequences, Nature 356 (1992) 168–170
