pith. machine review for the scientific record.

arxiv: 2604.18801 · v1 · submitted 2026-04-20 · 💻 cs.LG · cs.DC

Recognition: unknown

Preserving Clusters in Error-Bounded Lossy Compression of Particle Data

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 05:18 UTC · model grok-4.3

classification 💻 cs.LG cs.DC
keywords lossy compression · particle data · single-linkage clustering · error-bounded compression · cosmology simulations · molecular dynamics · clustering preservation · post-compression correction

The pith

A post-decompression correction step using projected gradient descent can preserve single-linkage clustering outcomes in particle datasets while retaining competitive compression ratios.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that standard lossy compressors introduce small position errors that can break the connectivity of clusters identified by single-linkage or Friends-of-Friends methods, which are central to analyzing cosmology and molecular dynamics data. To address this, the authors add a correction stage that first locates vulnerable particle pairs through spatial partitioning and local search, then solves an optimization problem to restore the required distance relationships without violating the original error bounds. This approach works on the output of existing compressors such as SZ3 and Draco. If successful, it lets scientists compress massive particle files aggressively while still trusting the downstream cluster statistics that drive scientific conclusions.

Core claim

The central claim is that a clustering-aware correction algorithm, which identifies vulnerable pairs via spatial partitioning and local neighborhood search and then enforces consistency through projected gradient descent on a loss that penalizes pairwise distance violations, can restore single-linkage clustering results on decompressed particle data from off-the-shelf error-bounded compressors.
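As a concrete sketch of that machinery: one projected-gradient step on a hinge-style pairwise distance-violation loss, with the projection taken onto an ℓ∞ error ball of radius ε around the original positions. This is an illustrative reconstruction from the abstract's description, not the authors' code; the function name, the hinge loss, and the choice of ℓ∞ error balls are all assumptions.

```python
import numpy as np

def correction_step(pos, orig, pairs, must_link, ell, eps, lr=0.1):
    """One projected-gradient step on a pairwise distance-violation loss.

    pos       : (N, 3) decompressed positions being corrected
    orig      : (N, 3) original positions (error balls are centered here)
    pairs     : (P, 2) indices of vulnerable particle pairs
    must_link : (P,) bool -- True if the pair must stay within the
                linking length ell, False if it must stay beyond it
    eps       : pointwise error bound of the base compressor
    """
    i, j = pairs[:, 0], pairs[:, 1]
    diff = pos[i] - pos[j]                       # (P, 3) pair displacements
    dist = np.linalg.norm(diff, axis=1)          # (P,) pair distances

    # Hinge loss: must-link pairs are penalized when dist > ell,
    # must-separate pairs when dist < ell.
    viol = np.where(must_link, dist - ell, ell - dist)
    active = viol > 0.0

    # Gradient of the active hinge terms with respect to each endpoint.
    grad = np.zeros_like(pos)
    sign = np.where(must_link, 1.0, -1.0)[:, None]
    unit = diff / np.maximum(dist, 1e-12)[:, None]
    g = sign * unit * active[:, None]            # dL/d(pos_i); dL/d(pos_j) = -g
    np.add.at(grad, i, g)
    np.add.at(grad, j, -g)

    pos = pos - lr * grad
    # Projection: clamp back into the l-inf error ball around the originals.
    return np.clip(pos, orig - eps, orig + eps)
```

A gradient step pulls a violating must-link pair together (or pushes a must-separate pair apart) along the line joining the two particles; the clamp guarantees the corrected positions never leave the compressor's advertised error bound.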

What carries the argument

The clustering-aware correction algorithm that combines spatial partitioning for vulnerable-pair detection with projected gradient descent on a pairwise-distance-violation loss.

If this is right

  • Single-linkage clustering results remain consistent with the uncompressed data even at higher compression ratios than previously usable.
  • The correction works as a post-processing step on output from SZ3, Draco, and similar compressors without requiring changes to those compressors.
  • GPU and distributed implementations make the method practical for the large particle counts typical in cosmology and molecular-dynamics simulations.
  • Compression performance stays competitive with SZ3, ZFP, Draco, LCP, and space-filling-curve schemes while adding the clustering guarantee.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same correction idea could be adapted to preserve other connectivity-based structures such as Friends-of-Friends halos that are also mentioned in the paper.
  • Embedding the correction inside the compressor loop rather than applying it afterward might reduce total I/O and storage costs further.
  • The technique may extend naturally to fluid-dynamics particle data where cluster-like features influence downstream physical analysis.

Load-bearing premise

The correction procedure can locate the pairs that affect cluster connectivity and adjust their positions without creating new errors that invalidate other analyses or adding so much computation that the compression benefit disappears.
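A minimal illustration of the detection half of that premise: grid-based spatial partitioning that flags only pairs whose separation is close enough to the linking length that the compressor's error could flip their connectivity. The 2√3·ε distance margin, the cell size, and all names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from collections import defaultdict

def vulnerable_pairs(pos, ell, eps):
    """Find pairs whose distance d could cross the linking length ell
    under a per-coordinate perturbation of at most eps.

    In 3D with an l-inf bound of eps per coordinate, each pair distance
    can change by at most 2*sqrt(3)*eps, so only pairs with
    |d - ell| <= margin can flip single-linkage connectivity.
    """
    margin = 2.0 * np.sqrt(3.0) * eps
    cell = ell + margin                  # cell edge >= max relevant distance
    grid = defaultdict(list)
    for idx, p in enumerate(pos):        # bin particles into grid cells
        grid[tuple((p // cell).astype(int))].append(idx)

    offsets = [(dx, dy, dz) for dx in (-1, 0, 1)
               for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
    pairs = []
    for key, members in grid.items():    # search own + 26 neighbor cells
        for off in offsets:
            nbr = tuple(k + o for k, o in zip(key, off))
            for a in members:
                for b in grid.get(nbr, ()):
                    if a < b:            # each unordered pair once
                        d = np.linalg.norm(pos[a] - pos[b])
                        if abs(d - ell) <= margin:
                            pairs.append((a, b))
    return pairs
```

Because the cell edge is at least ℓ plus the margin, every candidate pair lies in the same or an adjacent cell, so the search never scans the full N² pair space.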

What would settle it

On a cosmology or molecular-dynamics dataset, applying the method at a given error bound and finding that the single-linkage cluster membership of more than a small fraction of particles differs from the uncompressed reference would show the preservation claim does not hold.
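Such a check could be run against a reference implementation: at a cut distance ℓ, single-linkage clusters are exactly the connected components of the "distance ≤ ℓ" graph. The numpy-only O(N²) sketch below compares cluster co-membership between original and reconstructed data; it is an illustrative test harness, not the paper's evaluation protocol.

```python
import numpy as np

def single_linkage_labels(pos, ell):
    """Cluster labels from single-linkage cut at distance ell, computed
    as connected components of the 'distance <= ell' graph via
    union-find (O(N^2) reference implementation)."""
    n = len(pos)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(pos[i] - pos[j]) <= ell:
                parent[find(i)] = find(j)
    return np.array([find(i) for i in range(n)])

def membership_change(orig, recon, ell):
    """Fraction of particles involved in at least one pair that is
    co-clustered in one dataset but not in the other."""
    la = single_linkage_labels(orig, ell)
    lb = single_linkage_labels(recon, ell)
    n = len(la)
    changed = np.zeros(n, dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if (la[i] == la[j]) != (lb[i] == lb[j]):
                changed[i] = changed[j] = True
    return changed.mean()
```

A `membership_change` well above a small tolerance at the operating error bound would falsify the preservation claim; near zero supports it.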

Figures

Figures reproduced from arXiv: 2604.18801 by Congrong Ren, Franck Cappello, Hanqi Guo, Katrin Heitmann, Sheng Di.

Figure 1: Illustration of the FoF clustering algorithm. (a) An initial distribution …
Figure 2: Hierarchical spatial decomposition of the 256³ HACC simulation volume across 64 MPI ranks. Color-coded blocks represent individual rank assignments, with rank IDs annotated on visible faces. For each HACC (hiRes) timestep, the 256³ domain is partitioned across 64 MPI ranks using a hierarchical, interleaved decomposition. While the z and y dimensions follow standard linear partitioning (4 and 8 segments, …
Figure 3: Compression ratio vs. maximum relative error of our method and baselines before and after correction on six datasets.
Figure 4: Throughput of our correction method and baselines.
Figure 5: PSNR vs. BPP of our method and baselines before and after correction on the EXAALT dataset.
Figure 6: Tight loss L_tight vs. iteration count (limited to 100) for our method correcting ZFP-compressed EXAALT data with relative ξ = 10⁻³ (top) and ξ = 10⁻⁴ (bottom).
Figure 7: Accuracy in FoF clustering results. (a) Matthews correlation coefficient …
Figure 8: Strong scaling analysis for the GPU implementation on the HACC …
Figure 9: Weak scaling execution time breakdown and throughput for the GPU …
read the original abstract

Lossy compression is widely used to reduce storage and I/O costs for large-scale particle datasets in scientific applications such as cosmology, molecular dynamics, and fluid dynamics, where clustering structures (e.g., single-linkage or Friends-of-Friends) are critical for downstream analysis; however, existing compressors typically provide only pointwise error bounds on particle positions and offer no guarantees on preserving clustering outcomes, and even small perturbations can alter cluster connectivity and compromise scientific validity. We propose a correction-based technique to preserve single-linkage clustering under lossy compression, operating on decompressed data from off-the-shelf compressors such as SZ3 and Draco. Our key contributions are threefold: (1) a clustering-aware correction algorithm that identifies vulnerable particle pairs via spatial partitioning and local neighborhood search; (2) an optimization-based formulation that enforces clustering consistency using projected gradient descent with a loss that encodes pairwise distance violations; and (3) a scalable GPU-accelerated and distributed implementation for large-scale datasets. Experiments on cosmology and molecular dynamics datasets show that our method effectively preserves clustering results while maintaining competitive compression performance compared with SZ3, ZFP, Draco, LCP, and space-filling-curve-based schemes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a correction-based post-processing technique to preserve single-linkage clustering in lossy-compressed particle data from cosmology and molecular dynamics simulations. It decompresses data from off-the-shelf compressors (SZ3, Draco), identifies vulnerable particle pairs via spatial partitioning and local neighborhood search, and applies projected gradient descent to enforce clustering consistency through a loss penalizing pairwise distance violations. A scalable GPU-accelerated and distributed implementation is presented, with experiments claiming preserved clustering results alongside competitive compression performance versus SZ3, ZFP, Draco, LCP, and space-filling-curve baselines.

Significance. If the method preserves both clustering fidelity and the pointwise error bounds of the base compressor, it would fill an important gap for scientific workflows where downstream clustering analysis must remain valid after compression. The GPU/distributed implementation and use of relevant large-scale datasets are practical strengths. However, the correction step threatens the central claim of error-bounded compression; unless that is resolved, the overall contribution is weakened.

major comments (2)
  1. [Abstract] Abstract and the description of the optimization-based formulation: the projected gradient descent correction enforces single-linkage distance constraints on decompressed particles but includes no projection step to keep adjusted coordinates inside the original pointwise error balls guaranteed by the base compressor (SZ3/Draco). Consequently the end-to-end output can violate the advertised error bounds, directly undermining the title's and abstract's claim of an error-bounded technique while preserving clustering.
  2. [Experiments] Experimental evaluation (cosmology and molecular dynamics datasets): the claim that clustering is preserved 'while maintaining competitive compression performance' rests on the assumption that post-correction errors remain acceptable, yet no verification is provided that the final positions satisfy the base compressor's error bound or that the added optimization overhead does not negate compression benefits.
minor comments (2)
  1. [Abstract] The abstract lists multiple baselines (SZ3, ZFP, Draco, LCP, space-filling-curve schemes) but does not specify the exact error metrics or clustering similarity measures (e.g., cluster connectivity preservation rate) used for quantitative comparison; these should be stated explicitly in the results section.
  2. Notation for the loss function components in the projected gradient descent formulation could be clarified to avoid ambiguity in how pairwise distance violations are encoded.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We have carefully reviewed the major comments concerning the preservation of error bounds during the clustering correction step and provide point-by-point responses below, along with planned revisions to address these issues.

read point-by-point responses
  1. Referee: [Abstract] Abstract and the description of the optimization-based formulation: the projected gradient descent correction enforces single-linkage distance constraints on decompressed particles but includes no projection step to keep adjusted coordinates inside the original pointwise error balls guaranteed by the base compressor (SZ3/Draco). Consequently the end-to-end output can violate the advertised error bounds, directly undermining the title's and abstract's claim of an error-bounded technique while preserving clustering.

    Authors: We acknowledge this valid observation. The current projected gradient descent formulation minimizes the clustering loss but does not explicitly project the updated coordinates back onto the pointwise error balls provided by the base compressor after each step. This could indeed result in violations of the original error bounds. In the revised manuscript, we will update the optimization procedure to include an explicit projection step onto the intersection of the error balls and feasible clustering configurations. This will be described in the methods section, and we will revise the abstract to more precisely state the guarantees provided by the end-to-end pipeline. revision: yes

  2. Referee: [Experiments] Experimental evaluation (cosmology and molecular dynamics datasets): the claim that clustering is preserved 'while maintaining competitive compression performance' rests on the assumption that post-correction errors remain acceptable, yet no verification is provided that the final positions satisfy the base compressor's error bound or that the added optimization overhead does not negate compression benefits.

    Authors: We agree that explicit post-correction verification is required to support the claims. In the revised version, we will add experiments that report the maximum and average pointwise errors (relative and absolute) on the final corrected particle positions for both cosmology and molecular dynamics datasets, confirming adherence to the base compressor's bounds. We will also include timing and compression ratio results that isolate the overhead of the correction step, demonstrating that selective application to vulnerable pairs via spatial partitioning keeps the overall performance competitive with the baselines. revision: yes
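The two planned revisions, projecting onto the error balls and verifying the bound after correction, can be sketched together; for a per-coordinate (ℓ∞) bound the exact Euclidean projection is a clamp. The norm choice and all names here are assumptions, not the authors' revised method.

```python
import numpy as np

def project_and_verify(corrected, original, eps):
    """Clamp corrected positions into the l-inf error ball of radius eps
    around the originals (the exact projection for a per-coordinate
    bound), then verify the bound holds end to end.

    Returns the projected positions and the maximum pointwise error.
    """
    projected = np.clip(corrected, original - eps, original + eps)
    max_err = np.max(np.abs(projected - original))
    assert max_err <= eps + 1e-12, "error bound violated after correction"
    return projected, max_err
```

Reporting `max_err` alongside compression ratio and correction time is exactly the post hoc verification the referee's second comment asks for.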

Circularity Check

0 steps flagged

No significant circularity; independent algorithmic post-processing

full rationale

The paper presents a correction algorithm that operates on already-decompressed particle data from external compressors (SZ3, Draco). It identifies vulnerable pairs via spatial partitioning, then applies projected gradient descent to enforce single-linkage distance constraints. No equations, parameters, or claims reduce by construction to fitted inputs, self-definitions, or prior self-citations; the method is described as a novel, independent layer whose correctness is evaluated on external cosmology and molecular-dynamics datasets. The derivation chain consists of standard algorithmic steps (neighborhood search + optimization) whose validity does not presuppose the target clustering-preservation result.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The method rests on domain assumptions about particle data spatial locality and the ability of local optimization to enforce global clustering constraints without side effects; no free parameters or invented entities are explicitly introduced in the abstract.

pith-pipeline@v0.9.0 · 5517 in / 1061 out tokens · 33191 ms · 2026-05-10T05:18:22.148520+00:00 · methodology

