A Geometric Lens on Physics-Aligned Data Compression

Aleix Segui; Wesley Armour

arxiv: 2606.03279 · v1 · pith:S7MPK6NNnew · submitted 2026-06-02 · 💻 cs.LG

Pith reviewed 2026-06-28 11:39 UTC · model grok-4.3

classification 💻 cs.LG

keywords physics-informed compression · rate-distortion tradeoff · latent space geometry · anisotropic error allocation · alignment diagnostic · tangent-space rate-distortion · scientific data compression · eigenspace overlap

The pith

Misaligned latent sensitivities create a hard limit on preserving both physical observables and reconstruction fidelity at fixed bitrate.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a local geometric theory for why physics-informed losses in learned compressors improve a target observable while degrading standard distortion at fixed rate. It traces the tradeoff to three sets of preferred directions in latent space: those induced by the entropy model, by the physical observable, and by the distortion metric. These directions determine an anisotropic allocation of compression noise. When the directions fail to align, any gain in one quantity at fixed rate forces a loss in the other, establishing a fundamental limit on simultaneous preservation. The theory is expressed as a local tangent-space rate-distortion law and is accompanied by a practical diagnostic that measures overlap of the dominant eigenspaces; experiments across domains confirm that the diagnostic tracks the observed tradeoffs.

Core claim

At each operating point the entropy model, the physical observable, and the distortion metric each induce a set of latent-space sensitivities that define preferred directions for suppressing compression noise. These directions yield an anisotropic error-allocation mechanism. When the directions are misaligned, improving preservation of the observable at fixed rate necessarily worsens standard reconstruction fidelity, establishing a fundamental limit on simultaneous preservation. The limit is formalized by a local tangent-space rate-distortion law, and an alignment diagnostic based on dominant eigenspace overlap is introduced to predict the severity of the tradeoff.
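
As a minimal sketch of the anisotropic allocation described above, the snippet below evaluates the per-mode rule quoted in the Figure 6 caption: when the rate-whitened physics and fidelity metrics share an eigenbasis, the optimal noise precision in mode i is 1 + α·w_i + γ·g_i. The eigenvalues, the multiplier settings, and the function name are hypothetical values chosen for illustration, not taken from the paper.

```python
def optimal_variances(w, g, alpha, gamma):
    """Per-mode noise variance from precision_i = 1 + alpha*w_i + gamma*g_i.

    w, g: eigenvalues of the rate-whitened physics and fidelity metrics,
    assumed to share an eigenbasis. alpha, gamma: non-negative weights.
    """
    if len(w) != len(g):
        raise ValueError("w and g must have the same length")
    return [1.0 / (1.0 + alpha * wi + gamma * gi) for wi, gi in zip(w, g)]


# Hypothetical spectra: mode 0 matters only to physics, mode 2 only to fidelity.
w = [4.0, 1.0, 0.0]
g = [0.0, 1.0, 4.0]

mse_first = optimal_variances(w, g, alpha=0.0, gamma=1.0)
physics_first = optimal_variances(w, g, alpha=1.0, gamma=0.0)
balanced = optimal_variances(w, g, alpha=0.5, gamma=0.5)

for name, s in [("MSE", mse_first), ("physics", physics_first), ("balanced", balanced)]:
    print(name, [round(v, 3) for v in s])
```

Under these toy numbers the noise is pushed into the modes that the prioritised metric barely senses: the MSE-first setting leaves the physics-only mode at the baseline variance of 1, and the physics-first setting does the same to the fidelity-only mode.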

What carries the argument

Anisotropic error-allocation mechanism arising from the interaction of latent-space sensitivities induced by the entropy model, the physical observable, and the distortion metric, together with the local tangent-space rate-distortion law and the dominant-eigenspace-overlap diagnostic.

If this is right

At fixed bitrate, any improvement in the target physical observable must degrade standard reconstruction fidelity whenever the three sensitivity directions are misaligned.
The alignment diagnostic based on dominant eigenspace overlap predicts the magnitude of data-space versus physics-space tradeoffs observed in practice.
The local tangent-space rate-distortion law quantifies how the interaction of the three sensitivities governs the feasible operating points.
Anisotropic noise allocation is required to respect the distinct preferred directions when the sensitivities are not aligned.
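
The fixed-rate consequences listed above can be illustrated with a two-mode toy sweep. This is a sketch under stated assumptions, not the paper's experiment: the allocation follows the per-mode rule from the Figure 6 caption, the rate is approximated by the KL divergence between the noise distribution and a unit Gaussian in rate-whitened coordinates (an assumed proxy), and all spectra and helper names are hypothetical.

```python
import math


def variances(w, g, alpha, gamma):
    # Per-mode rule from the Figure 6 caption: precision = 1 + alpha*w + gamma*g.
    return [1.0 / (1.0 + alpha * wi + gamma * gi) for wi, gi in zip(w, g)]


def rate_proxy(s):
    # ASSUMPTION: rate is approximated by KL(N(0, diag(s)) || N(0, I)) in nats.
    return 0.5 * sum(si - math.log(si) - 1.0 for si in s)


def gamma_at_rate(w, g, alpha, target, hi=1e6, iters=200):
    # Bisection on gamma; the proxy is non-decreasing in gamma for gamma >= 0.
    lo = 0.0
    if rate_proxy(variances(w, g, alpha, lo)) > target:
        raise ValueError("alpha alone already exceeds the target rate")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rate_proxy(variances(w, g, alpha, mid)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sweep(w, g, alphas, target):
    # For each physics weight, rebalance gamma so that the rate proxy stays fixed.
    rows = []
    for a in alphas:
        s = variances(w, g, a, gamma_at_rate(w, g, a, target))
        physics_err = sum(wi * si for wi, si in zip(w, s))
        fidelity_err = sum(gi * si for gi, si in zip(g, s))
        rows.append((a, physics_err, fidelity_err, rate_proxy(s)))
    return rows


alphas = [1.0, 3.0, 5.0]
target = rate_proxy(variances([1.0, 0.0], [0.0, 1.0], 3.0, 3.0))

# Fully misaligned: physics senses only mode 0, fidelity only mode 1.
misaligned = sweep([1.0, 0.0], [0.0, 1.0], alphas, target)
# Fully aligned: both metrics sense only mode 0.
aligned = sweep([1.0, 0.0], [1.0, 0.0], alphas, target)

for label, rows in [("misaligned", misaligned), ("aligned", aligned)]:
    for a, p, f, r in rows:
        print(label, f"alpha={a:.0f} physics={p:.4f} fidelity={f:.4f} rate={r:.4f}")
```

In the misaligned case the physics error falls and the fidelity error rises as α grows at constant rate proxy; in the aligned case only the sum α + γ matters, so both errors stay put and no tradeoff appears.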

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Training procedures could be modified to encourage alignment of the three sensitivity directions rather than treating the physics loss as an independent objective.
The same geometric framing may apply to other multi-objective compression settings where one auxiliary signal competes with standard fidelity.
The diagnostic could be used at design time to decide whether a given physics-informed loss is likely to produce acceptable distortion tradeoffs before full training.
If the tangent-space approximation holds only near specific operating points, the theory may need extension to capture global rate-distortion surfaces.

Load-bearing premise

The local tangent-space approximation together with the reduction of the tradeoff to dominant eigenspace overlap are assumed to capture the essential rate-distortion behavior at operating points.
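
One way to probe this premise numerically is to compare a first-order prediction of the observable error with the actual change as the latent perturbation grows. The sketch below uses a hypothetical scalar map `h` standing in for the composition of decoder and observable; it is a toy, not a model from the paper, and the operating point and direction are arbitrary.

```python
import math


def h(z):
    # Hypothetical composite map z -> Q(g(z)); not a model from the paper.
    return math.sin(z[0]) * z[1]


def grad_h(z):
    # Analytic gradient of h, playing the role of J_Q(x) J_g(z).
    return (math.cos(z[0]) * z[1], math.sin(z[0]))


def linearization_gap(z, direction, scale):
    # Relative difference between the true change in h and its first-order estimate.
    eta = (scale * direction[0], scale * direction[1])
    actual = h((z[0] + eta[0], z[1] + eta[1])) - h(z)
    gz = grad_h(z)
    linear = gz[0] * eta[0] + gz[1] * eta[1]
    return abs(actual - linear) / abs(linear)


z0 = (0.3, 1.2)   # arbitrary operating point
d = (1.0, -0.5)   # arbitrary perturbation direction
scales = (1e-3, 1e-2, 1e-1, 1.0)
gaps = [linearization_gap(z0, d, t) for t in scales]

for t, gap in zip(scales, gaps):
    print(f"noise scale {t:g}: relative linearization gap {gap:.4f}")
```

The gap grows roughly in proportion to the noise scale here, which is the pattern to look for when deciding how far from the operating point a tangent-space description can be trusted.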

What would settle it

An experiment that measures the correlation between the alignment diagnostic and the observed tradeoff severity across a new set of physics-informed compressors and finds that high overlap does not reduce the tradeoff or that low overlap does not produce one.
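
A minimal sketch of such a check is a rank correlation between the alignment score and a tradeoff-severity measure across operating points. The numbers below are made-up placeholders used only to exercise the function; they are not results from the paper, and the choice of Spearman correlation is an assumption about how the comparison might be scored.

```python
import math


def average_ranks(values):
    # 1-based ranks; tied values share the mean of the ranks they span.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = 0.5 * (i + j) + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    # Pearson correlation of the rank vectors.
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


# PLACEHOLDER values, one pair per hypothetical compressor.
alignment = [0.15, 0.30, 0.55, 0.70, 0.90]
severity = [0.80, 0.85, 0.40, 0.35, 0.10]

rho = spearman(alignment, severity)
print(f"Spearman rho = {rho:.2f}")
```

A strongly negative value would be consistent with the claim that higher overlap goes with a milder tradeoff; a value near zero or positive on real measurements would count against it.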

Figures

Figures reproduced from arXiv: 2606.03279 by Aleix Segui, Wesley Armour.

**Figure 1.** As a running example, we use 2D velocity fields from PDEBench (Takamoto et al., 2022) with channels $(v_x, v_y)$. The physical observable is vorticity, $Q(v) = \partial_y v_x - \partial_x v_y$. The figure compares pointwise compression errors for the reconstructed velocity field and for the derived vorticity. Two models are trained at the same bitrate, 0.85 bps: $\beta = 0$ corresponds to MSE-only training, while $\beta = 0.5$ includes the phy…

**Figure 2.** The Geometry of Rate. The contours represent the negative log-prior $-\log p(z)$. The curvature $H_R$ is high (steep) along the vertical axis and low (flat) along the horizontal. Both ellipses represent a noise covariance $\Sigma$ with the same quantisation volume (same entropy, i.e. the same $\log\det\Sigma$). The red dashed ellipse pays a high bit-cost because it has high variance along the steep direction. The green solid ellipse is opt…
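
The comparison in this caption can be reproduced numerically with a small sketch. It assumes the bit-cost of zero-mean noise is captured by the second-order term ½·tr(H_R Σ) of the expansion of the expected negative log-prior; whether the paper uses exactly this expression is an assumption, and the curvature and covariance values are hypothetical.

```python
def trace_of_product(a, b):
    # tr(A B) for square matrices given as nested lists.
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def det2(m):
    # Determinant of a 2x2 matrix.
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def expected_rate_penalty(hessian, cov):
    # Second-order term of E[-log p(z + eta)] for zero-mean eta with covariance cov.
    return 0.5 * trace_of_product(hessian, cov)


# Hypothetical curvature: flat along axis 0 (horizontal), steep along axis 1 (vertical).
H_R = [[0.25, 0.0], [0.0, 4.0]]
red = [[0.25, 0.0], [0.0, 4.0]]    # large variance along the steep axis
green = [[4.0, 0.0], [0.0, 0.25]]  # large variance along the flat axis

red_penalty = expected_rate_penalty(H_R, red)
green_penalty = expected_rate_penalty(H_R, green)
print(det2(red), det2(green))      # equal volumes
print(red_penalty, green_penalty)  # 8.03125 1.0
```

Both covariances have the same determinant, yet the one stretched along the flat direction costs far less. For this toy curvature the cheaper covariance is proportional to the inverse of H_R, which is the minimiser of tr(H_R Σ) at fixed determinant.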

**Figure 3.** Illustration of the error mapping from latent space to physical observable space through $J_Q(x)\,J_g(z)\,\eta$. where $J_g(z) := \nabla_z g_\theta(z)$ is the decoder Jacobian. If deterministic reconstruction bias at the operating point is neglected, or treated separately, we may write $\delta x := \hat{x} - x \approx J_g(z)\,\eta$. Passing this perturbation through the observable map yields $Q(\hat{x}) \approx Q(x) + J_Q(x)\,\delta x \approx Q(x) + J_Q(x)\,J_g(z)\,\eta$ (Eq. 10), where JQ(…
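
The linearisation in this caption implies a quadratic form for the expected observable error. The worked step below assumes zero-mean latent noise with covariance $\Sigma$ (the symbol used in the Figure 2 caption); the label $W$ for the resulting matrix is introduced here for illustration and may differ from the paper's own definition or normalisation.

```latex
\begin{aligned}
Q(\hat{x}) - Q(x) &\approx J_Q(x)\,J_g(z)\,\eta,
  \qquad \mathbb{E}[\eta] = 0,\quad \mathbb{E}[\eta\eta^{\top}] = \Sigma,\\
\mathbb{E}\,\bigl\|Q(\hat{x}) - Q(x)\bigr\|^{2}
  &\approx \mathbb{E}\!\left[\eta^{\top} J_g(z)^{\top} J_Q(x)^{\top} J_Q(x)\,J_g(z)\,\eta\right]
  = \operatorname{tr}\!\left(W\,\Sigma\right),\\
W &:= J_g(z)^{\top} J_Q(x)^{\top} J_Q(x)\,J_g(z).
\end{aligned}
```

The same step with $J_Q$ replaced by the identity gives the signal-space counterpart for squared-error distortion, $\operatorname{tr}\!\left(J_g(z)^{\top} J_g(z)\,\Sigma\right)$.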

**Figure 4.** Data and observable space errors for different variational autoencoder models with hyperprior or factorised entropy model. Multiple repetitions are trained for varying physics weight $\beta$, tracing the Pareto frontier. In these coordinates, $\tilde{W}$ and $\tilde{G}$ measure observable and signal sensitivity per unit rate cost. Physical alignment is therefore determined by the eigendirections of these rate-normalised metrics:…

**Figure 5.** At each latent point $z = f_\phi(x)$, the red and green arrows denote the dominant fidelity-sensitive and physics-sensitive directions, respectively, in the rate-whitened geometry. Their acute angle $\theta(x)$ determines the local alignment score shown in the background. Definition 5.3 (Physical Alignment Score). Let $\tilde{W}(x)$ and $\tilde{G}(x)$ denote the rate-whitened physics and fidelity metrics at state $x$. Let $U_{W,k}(x) \in \mathbb{R}^{m \times k}$ …
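
Because Definition 5.3 is cut off in this caption, the following is only a sketch of one common way to score eigenspace overlap: the mean squared cosine of the principal angles, (1/k)·‖Uᵀ V‖²_F, which for k = 1 reduces to cos² of the angle between the dominant directions. Treating this as the paper's alignment score is an assumption, and the 2×2 metrics are hypothetical.

```python
import math


def dominant_direction_2x2(m):
    # Unit eigenvector for the largest eigenvalue of a symmetric 2x2 matrix.
    a, b, c = m[0][0], m[0][1], m[1][1]
    phi = 0.5 * math.atan2(2.0 * b, a - c)
    return (math.cos(phi), math.sin(phi))


def subspace_overlap(U, V):
    # U, V: lists of k orthonormal vectors. Returns (1/k) * ||U^T V||_F^2 in [0, 1].
    if len(U) != len(V) or not U:
        raise ValueError("U and V must hold the same positive number of vectors")
    total = 0.0
    for u in U:
        for v in V:
            dot = sum(ui * vi for ui, vi in zip(u, v))
            total += dot * dot
    return total / len(U)


# Hypothetical rate-whitened metrics (symmetric, positive semidefinite).
W = [[1.0, 0.0], [0.0, 0.1]]
G_aligned = [[2.0, 0.0], [0.0, 0.5]]
G_rotated = [[0.55, 0.45], [0.45, 0.55]]   # dominant direction at 45 degrees
G_orthogonal = [[0.1, 0.0], [0.0, 1.0]]

u = [dominant_direction_2x2(W)]
scores = {
    name: subspace_overlap(u, [dominant_direction_2x2(G)])
    for name, G in [("aligned", G_aligned), ("rotated", G_rotated), ("orthogonal", G_orthogonal)]
}
print({name: round(val, 3) for name, val in scores.items()})
```

For ranks above one, the same `subspace_overlap` function applies once orthonormal bases of the top-k eigenspaces are available from an eigensolver.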

**Figure 6.** Assuming $\tilde{W}$ and $\tilde{G}$ share a common eigenbasis, the local allocation rule decomposes by mode, with $(\tilde{\sigma}_i^{\star})^{-2} = 1 + \alpha\tilde{w}_i + \gamma\tilde{g}_i$. Each stacked bar shows the corresponding combined precision: gray is the rate baseline (the constant 1), green the physics contribution ($\alpha\tilde{w}_i$), and red the fidelity contribution ($\gamma\tilde{g}_i$). The three panels compare MSE-prioritised, physics-prioritised, and balanced allocations. …

**Figure 7.** Rate–distortion curves for a hyperprior model for the physics loss (left) and rate gain (right), computed as the average rate difference over a common PSNR range (Bjontegaard, 2001). models locally in latent space. For random validation samples $x$, we encode $z = f_\phi(x)$ and inject controlled perturbations $\eta$ with prescribed covariance (diagonal and full-rank variants), forming $\hat{z} = z + \eta$ and decoding $\hat{x} = g_\theta$…

**Figure 8.** Figure 8 shows that the second-order models remain predictive for moderate noise levels, supporting the use of $H_R$, $W_{\mathrm{eff}}$, and $G_{\mathrm{eff}}$ as local geometry descriptors. Appendix D reports additional low-bitrate experiments. Rate–distortion–physics trade-offs at fixed rate. Next we examine the empirical Pareto frontier induced by $\beta$. For each target bitrate, we train multiple models with different $\beta$ and evaluate both signal …

**Figure 9.** $\mathrm{Align}_k$ metric for $k = 8, 16, 32$, averaged over data samples, on a hyperprior model. For every $k$, the percentage of trace coverage is indicated. and fidelity metrics are not aligned. Importantly, this behavior reflects reallocation of error into directions that are weakly sensed by $Q$ rather than a uniform improvement. This is also observed in the additional physical observables included in Appendix D. Spec…
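
The trace coverage mentioned in this caption can be sketched as the share of a metric's trace carried by its k largest eigenvalues. That reading of "trace coverage" is an assumption, and the spectrum below is a hypothetical example rather than data from the paper.

```python
def trace_coverage(eigenvalues, k):
    """Fraction of the trace carried by the k largest eigenvalues."""
    if not 0 < k <= len(eigenvalues):
        raise ValueError("k must be between 1 and the number of eigenvalues")
    if any(ev < 0 for ev in eigenvalues):
        raise ValueError("expected a positive semidefinite spectrum")
    total = sum(eigenvalues)
    if total == 0:
        raise ValueError("zero trace")
    top = sorted(eigenvalues, reverse=True)[:k]
    return sum(top) / total


spectrum = [8.0, 4.0, 2.0, 1.0, 1.0]  # hypothetical eigenvalues
coverage = {k: trace_coverage(spectrum, k) for k in (1, 2, 4)}
for k, frac in coverage.items():
    print(f"k={k}: {100 * frac:.1f}% of trace")
```

A low coverage at the chosen rank would suggest that the dominant eigenspace used in the alignment score leaves out much of the metric's sensitivity.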

**Figure 10.** A sample for every data source: original (left) and two observables (middle and right). experiments, we also consider a windowed enstrophy observable, $q_{\mathrm{ens}}(x) = \frac{1}{|W|} \sum_{v \in W(x)} \omega(v)^2$, which emphasises localised rotational activity while smoothing pixel-scale fluctuations, hence carrying meaning about larger scale energy flows. Nyx cosmological simulations. We use slices from Nyx, a massively parallel cos…
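
A minimal sketch of the windowed enstrophy observable from this caption follows. It assumes a unit grid spacing, central differences on interior points, a square window clipped to the interior, and the vorticity sign convention of the Figure 1 caption; none of these discretisation choices are stated in the source, and the test field is a synthetic rigid rotation.

```python
def vorticity(vx, vy):
    """Interior vorticity on a unit grid, omega = d(vx)/dy - d(vy)/dx.

    Arrays are indexed [row][col], with rows along y and columns along x.
    """
    ny, nx = len(vx), len(vx[0])
    omega = {}
    for i in range(1, ny - 1):
        for j in range(1, nx - 1):
            dvx_dy = 0.5 * (vx[i + 1][j] - vx[i - 1][j])
            dvy_dx = 0.5 * (vy[i][j + 1] - vy[i][j - 1])
            omega[(i, j)] = dvx_dy - dvy_dx
    return omega


def windowed_enstrophy(omega, center, half_width):
    """Mean of omega^2 over the square window around center, clipped to the interior."""
    ci, cj = center
    vals = [
        val * val
        for (i, j), val in omega.items()
        if abs(i - ci) <= half_width and abs(j - cj) <= half_width
    ]
    if not vals:
        raise ValueError("window contains no interior points")
    return sum(vals) / len(vals)


n = 7
# Synthetic rigid rotation about the origin: vx = -y, vy = x.
vx = [[-float(i) for j in range(n)] for i in range(n)]
vy = [[float(j) for j in range(n)] for i in range(n)]

omega = vorticity(vx, vy)
q = windowed_enstrophy(omega, center=(3, 3), half_width=1)
print(q)  # 4.0
```

For the rigid rotation the vorticity is constant at -2 under this sign convention, so every window returns an enstrophy of 4; a uniform flow returns 0.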

**Figure 11.** Alignment across additional domains and observables. $\mathrm{Align}_{16}(W, G)$ evaluated across the added dataset–observable pairs under the same training sweep as in the main experiments. Percentages in the legend indicate trace coverage for the chosen rank. D.1. Alignment across domains and observables We compute the alignment diagnostic $\mathrm{Align}_k$ with fixed rank $k = 16$ using the rate-whitened metrics, and evaluate it…

**Figure 12.** Alignment and tradeoff magnitude for EM observables. Each point is an operating point, with horizontal position given by $\mathrm{Align}_k(W, G)$ and marker size indicating rate; circles report relative signal error and triangles report relative observable error.

Original abstract

In AI for Science, physics-informed losses are increasingly used to train learned compressors for scientific data, but their rate-distortion implications remain poorly understood. At fixed bitrate, these objectives often improve preservation of a target physical observable while degrading standard reconstruction fidelity. We develop a local geometric theory showing that this tradeoff is governed by the interaction of latent-space sensitivities induced by the entropy model, the physical observable, and the distortion metric. At each operating point, these induce preferred directions along which compression noise should be suppressed, yielding an anisotropic error-allocation mechanism. When these directions are misaligned, improving the observable at fixed rate necessarily worsens standard distortion, establishing a fundamental limit on simultaneous preservation. We formalise this through a local tangent-space rate-distortion law and introduce a practical alignment diagnostic based on dominant eigenspace overlap. Experiments across scientific domains test the theory and validate that the alignment diagnostic correlates with observed data- and physics-space trade-offs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper frames physics-distortion tradeoffs in compression via local latent-space geometry and an eigenspace diagnostic, but the local tangent approximation may not survive global entropy constraints.

The main thing here is a local geometric theory that ties the tradeoff between preserving a physical observable and standard reconstruction error to misalignment of three sensitivity directions in latent space. When those directions diverge, the paper claims you cannot improve one without hurting the other at fixed rate. They back this with a tangent-space rate-distortion law and a practical diagnostic that measures overlap of the dominant eigenspaces.

What is actually new is the reduction of the problem to these preferred directions and the overlap diagnostic itself. The experiments across scientific domains are presented as evidence that the diagnostic tracks the observed tradeoffs, which gives the framing some empirical grounding.

The soft spot is the reliance on a first-order local linearization. The stress-test concern is fair: if the entropy model imposes a global constraint, or if curvature and quantization steps matter, the predicted directions could decouple from actual noise allocation. The abstract does not show how they handle that, so the claimed fundamental limit rests on an assumption that needs checking in the derivations and results.

This is for people building or analyzing learned compressors for simulation data who want a principled handle on physics-versus-fidelity choices. A reader working on rate-distortion theory or scientific ML would find the diagnostic worth testing. It deserves peer review because the framing is original and the question it addresses is real, even if the local approximation turns out to be the limiting factor.

Referee Report

3 major / 2 minor

Summary. The paper develops a local geometric theory for rate-distortion behavior in physics-informed learned compressors. It posits that sensitivities induced by the entropy model, a target physical observable, and the distortion metric define preferred directions in latent space; misalignment of the dominant eigenspaces of these operators forces a tradeoff at fixed rate, formalized via a local tangent-space rate-distortion law. An alignment diagnostic based on eigenspace overlap is introduced and shown to correlate with observed tradeoffs in experiments across scientific domains.

Significance. If the local tangent-space reduction is valid, the work supplies a mechanistic explanation for why physics-aligned objectives degrade standard fidelity and supplies a falsifiable diagnostic that could guide compressor design. The absence of free parameters in the core geometric construction and the explicit link between eigenspace overlap and empirical tradeoffs are strengths.

major comments (3)

[§3 (local tangent-space rate-distortion law)] The central claim that misalignment necessarily forces a tradeoff rests on the reduction to a local tangent-space rate-distortion law whose preferred directions are the dominant eigenspaces of the three sensitivity operators. The manuscript should demonstrate that this first-order linearization remains predictive when the entropy model imposes a global rate constraint (e.g., via explicit comparison of local noise allocation versus end-to-end optimized allocation under the same rate).
[§4 (alignment diagnostic)] The alignment diagnostic is defined from the same sensitivity operators whose misalignment is claimed to produce the tradeoff. The paper must show that the diagnostic is not tautological (i.e., that its predictive power for observed distortion/observable tradeoffs is not an artifact of the construction).
[§5 (experiments)] Experiments are said to validate the theory, yet no quantitative assessment is given of how often the local approximation fails (e.g., cases where higher-order curvature or discrete quantization reallocates noise away from the predicted directions). Such failure cases would directly test the scope of the claimed fundamental limit.

minor comments (2)

[§2] Notation for the three sensitivity operators should be introduced with explicit definitions and dimensions before their eigenspaces are discussed.
[Figures 3-5] Figure captions should state the precise operating points (rate, dataset) at which the reported alignment scores and tradeoffs were measured.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments and positive assessment of the work's significance. We address each major comment below, proposing targeted revisions to strengthen the manuscript where the points identify areas for additional validation.

Referee: [§3 (local tangent-space rate-distortion law)] The central claim that misalignment necessarily forces a tradeoff rests on the reduction to a local tangent-space rate-distortion law whose preferred directions are the dominant eigenspaces of the three sensitivity operators. The manuscript should demonstrate that this first-order linearization remains predictive when the entropy model imposes a global rate constraint (e.g., via explicit comparison of local noise allocation versus end-to-end optimized allocation under the same rate).

Authors: The local tangent-space rate-distortion law is derived as a first-order approximation around operating points, with the global rate constraint entering through the entropy model's sensitivity operator. Our experiments already demonstrate predictive correlation with observed tradeoffs under trained models that satisfy global rate constraints. To directly address the request for explicit validation, we will revise §3 to include a comparison of the locally predicted noise allocation against the allocation realized by end-to-end optimization at matched rates, reporting quantitative agreement metrics. revision: yes
Referee: [§4 (alignment diagnostic)] The alignment diagnostic is defined from the same sensitivity operators whose misalignment is claimed to produce the tradeoff. The paper must show that the diagnostic is not tautological (i.e., that its predictive power for observed distortion/observable tradeoffs is not an artifact of the construction).

Authors: The diagnostic is constructed from the eigenspaces of the three operators, yet its value lies in its ability to predict independent experimental outcomes (distortion/observable tradeoffs measured on held-out data across domains). The experimental measurements are not generated from the diagnostic itself. In revision we will add explicit discussion clarifying this separation and include additional controls (e.g., cases where low overlap is measured but no tradeoff is observed due to other factors) to demonstrate that the correlation is not an artifact of the shared construction. revision: yes
Referee: [§5 (experiments)] Experiments are said to validate the theory, yet no quantitative assessment is given of how often the local approximation fails (e.g., cases where higher-order curvature or discrete quantization reallocates noise away from the predicted directions). Such failure cases would directly test the scope of the claimed fundamental limit.

Authors: We agree that a quantitative characterization of approximation failures would better bound the regime of validity. In the revised §5 we will add an analysis that identifies and quantifies instances where higher-order curvature or quantization effects cause deviations from the predicted directions, including metrics on the frequency and magnitude of such failures across the reported experiments. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained geometric modeling

The paper develops a local tangent-space rate-distortion law from the interaction of three sensitivity operators (entropy model, observable, distortion) and introduces an alignment diagnostic via dominant eigenspace overlap. No quoted equations or steps reduce a claimed prediction or fundamental limit back to a fitted parameter or self-citation by construction. The central tradeoff claim follows from the stated local linearization and eigenspace analysis rather than tautological redefinition of inputs. This is the normal case of an independent theoretical construction; external validation via experiments is noted but not required for the circularity check.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the validity of a local tangent-space approximation to rate-distortion behavior and on the assumption that dominant eigenspace overlap is a sufficient proxy for directional alignment; both are introduced without upstream justification visible in the abstract.

axioms (1)

domain assumption Local tangent-space approximation captures the essential interaction of entropy-model, observable, and distortion sensitivities
Invoked to derive the anisotropic error-allocation mechanism and the fundamental limit statement.

pith-pipeline@v0.9.1-grok · 5682 in / 1184 out tokens · 26002 ms · 2026-06-28T11:39:57.207473+00:00 · methodology

Reference graph

Works this paper leans on

83 extracted references · 13 canonical work pages

[1] P. Langley. Proceedings of the 17th International Conference on Machine Learning (ICML 2000). 2000.
[2] T. M. Mitchell. The Need for Biases in Learning Generalizations. 1980.
[3] M. J. Kearns.
[4] Machine Learning: An Artificial Intelligence Approach, Vol. I. 1983.
[5] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. 2000.
[6] Suppressed for Anonymity.
[7] A. Newell and P. S. Rosenbloom. Mechanisms of Skill Acquisition and the Law of Practice. Cognitive Skills and Their Acquisition. 1981.
[8] A. L. Samuel. Some Studies in Machine Learning Using the Game of Checkers. IBM Journal of Research and Development. 1959.
[9] Sheng Di, Jinyang Liu, Kai Zhao, Xin Liang, Robert Underwood, Zhaorui Zhang, Milan Shah, Yafan Huang, Jiajun Huang, Xiaodong Yu, Congrong Ren, Hanqi Guo, Grant Wilkins, Dingwen Tao, Jiannan Tian, Sian Jin, Zizhe Jian, Daoce Wang, Md Hasanur Rahman, Boyuan Zhang, Shihui Song, ... ACM Comput. 2025. doi:10.1145/3733104
[10] Zero-shot Denoising via Neural Compression: Theoretical and Algorithmic Framework. 2025.
[11] Deep Variational Information Bottleneck. 2019.
[12] Deep Learning and the Information Bottleneck Principle. 2015.
[13] Christian Jacobsen and Karthik Duraisamy. Frontiers in Physics. 2022. doi:10.3389/fphy.2022.890910
[14] Thomas M. Cover and Joy A. Thomas. Gaussian Channel. doi:10.1002/047174882X.ch9. https://onlinelibrary.wiley.com/doi/pdf/10.1002/047174882X.ch9
[15] Aleix Seguí, Arantza Ugalde, Andreas Fichtner, Sergi Ventosa, and Josep Ramon Morros. Geophysical Journal International. 2025. doi:10.1093/gji/ggaf397
[16] Variational image compression with a scale hyperprior. 2018.
[17] Neural Discrete Representation Learning. 2018.
[18] Jaemoon Lee, Qian Gong, Jong Choi, Tania Banerjee, Scott Klasky, Sanjay Ranka, and Anand Rangarajan. Applied Sciences. 2022.
[19] Fisher-Rao Metric, Geometry, and Complexity of Neural Networks. 2019.
[20] Peter Lindstrom. Fixed-Rate Compressed Floating-Point Arrays.
[21] Frequency Bias in Neural Networks for Input of Non-Uniform Density. 2020.
[22] Generative Latent Video Compression. 2025.
[23] Wenqi Jia, Zhewen Hu, Youyuan Liu, Boyuan Zhang, Jinzhen Wang, Jinyang Liu, Wei Niu, Stavros Kalafatis, Junzhou Huang, Sian Jin, Daoce Wang, Jiannan Tian, and Miao Yin. Proceedings of the 39th ACM International Conference on Supercomputing. 2025. doi:10.1145/3721145.3725763
[24] Jinyang Liu, Sheng Di, Sian Jin, Kai Zhao, Xin Liang, Zizhong Chen, and Franck Cappello. Scientific Error-bounded Lossy Compression with Super-resolution Neural Networks.
[25] Jinyang Liu, Sheng Di, Kai Zhao, Sian Jin, Dingwen Tao, Xin Liang, Zizhong Chen, and Franck Cappello. Exploring Autoencoder-based Error-bounded Compression for Scientific Data.
[26] Sheng Di and Franck Cappello. Fast Error-Bounded Lossy HPC Data Compression with SZ.
[27] Ann S. Almgren, John B. Bell, Mike J. Lijewski, Zarija Lukić, and Ethan Van Andel. The Astrophysical Journal. 2013. doi:10.1088/0004-637X/765/1/39
[28] Romain Thoreau, Laurent Risser, and V. Achard. Physics-informed variational autoencoders for improved robustness to environmental factors of variation. 2025. doi:10.1007/s10994-025-06829-7
[29] Practical Lossless Compression with Latent Variables using Bits Back Coding. 2019.
[30] Scalable Hybrid Learning Techniques for Scientific Data Compression. 2022.
[31] Pu Jiao, Sheng Di, Hanqi Guo, Kai Zhao, Jiannan Tian, Dingwen Tao, Xin Liang, and Franck Cappello. Proc. VLDB Endow. 2022. doi:10.14778/3574245.3574255
[32] Attention Based Machine Learning Methods for Data Reduction with Guaranteed Error Bounds. 2024.
[33] Vector Quantization and Signal Compression. 1992.
[34] Rafael Ballester-Ripoll, Peter Lindstrom, and Renato Pajarola. TTHRESH: Tensor Compression for Multidimensional Visual Data.
[35] Neural data compression for physics plasma simulation. Neural Compression: From Information Theory to Applications, Workshop @ ICLR 2021. 2021.
[36] Einstein Fields: A Neural Perspective To Computational General Relativity. 2025.
[37] Sobolev Training for Physics Informed Neural Networks. 2021.
[38] Mohammadreza Momenifar, Enmao Diao, Vahid Tarokh, and Andrew D. Bragg. A Physics-Informed Vector Quantized Autoencoder for Data Compression of Turbulent Flow.
[39] F. Boso and D.M. Tartakovsky. Information geometry of physics-informed statistical manifolds and its use in data assimilation. 2022. doi:10.1016/j.jcp.2022.111438
[40] The geometry and calculus of losses. Journal of Machine Learning Research.
[41] Active subspaces: Emerging ideas for dimension reduction in parameter studies. 2015.
[42] Emergence of Invariance and Disentanglement in Deep Representations. 2018.
[43] Salah Rifai, Pascal Vincent, Xavier Muller, Xavier Glorot, and Yoshua Bengio. Proceedings of the 28th International Conference on International Conference on Machine Learning. 2011.
[44] Jiakun Liu, Wenyi Zhang, and H. Vincent Poor. A Rate-Distortion Framework for Characterizing Semantic Information.
[45] A geometric perspective on variational autoencoders. Advances in Neural Information Processing Systems.
[46] Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems. Machine Learning: Science and Technology. 2024.
[47] Chart auto-encoders for manifold structured data. arXiv preprint arXiv:1912.10094.
[48] Manifold learning: What, how, and why. Annual Review of Statistics and Its Application. 2024.
[49] Pdebench: An extensive benchmark for scientific machine learning. Advances in Neural Information Processing Systems.
[50] Gisle Bjontegaard.
[51] Variational image compression with a scale hyperprior. arXiv preprint arXiv:1802.01436.
[52] Alexander A. Alemi, Ian Fischer, Joshua V. Dillon, and Kevin Murphy. arXiv preprint arXiv:1612.00410.
[53] Arantza Ugalde, Hugo Latorre, Pedro Vidal, Hugo F. Martins, Sonia Martin-Lopez, and Miguel Gonzalez-Herraez. doi:10.7914/73K1-1369
[54] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. 2011. doi:10.1561/2200000016
[55] Arantza Ugalde. 2024.
[56] Joint autoregressive and hierarchical priors for learned image compression. Advances in Neural Information Processing Systems.
[57] Michael F. Hutchinson. Communications in Statistics - Simulation and Computation.
[58] Physics-Informed Neural Compression of High-Dimensional Plasma Data. 2026.
[59] Learned image compression for machine perception. arXiv preprint arXiv:2111.02249.
[60] Learned Image Compression with Dictionary-based Entropy Model. Proceedings of the Computer Vision and Pattern Recognition Conference.
[61] Entroformer: A Transformer-based Entropy Model for Learned Image Compression. International Conference on Learning Representations (ICLR).
[62] Joint Global and Local Hierarchical Priors for Learned Image Compression. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[63] Nir Shlezinger, Yonina C. Eldar, and Miguel R. D. Rodrigues. Hardware-Limited Task-Based Quantization.
[64] Matthias Chung, Richard Archibald, Paul Atzberger, and Jack Michael Solomon. Journal of Machine Learning for Modeling and Computing. 2025.
[65] TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate. 2025.
[66] Claude E. Shannon. Bell System Technical Journal. 1948.
[67] Claude E. Shannon. IRE National Convention Record.
[68] A. Gersho. Asymptotically optimal block quantization.
[69] Rabitq: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search. Proceedings of the ACM on Management of Data. 2024.
[70] Indirect rate distortion problems. IEEE Transactions on Information Theory. 1980.
[71] J. Wolf and J. Ziv. Transmission of noisy information to a noisy receiver with minimum distortion.
[72] R. Dobrushin and B. Tsybakov. Information transmission with additional noise.
[73] The information bottleneck method. arXiv preprint physics/0004057.
[74] Rethinking lossy compression: The rate-distortion-perception tradeoff. International Conference on Machine Learning. 2019.

2019
[75]

SIAM review , volume=

Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions , author=. SIAM review , volume=. 2011 , publisher=

2011
[76]

SIAM Journal on Scientific Computing , volume=

Multilevel techniques for compression and reduction of scientific data-quantitative control of accuracy in derived quantities , author=. SIAM Journal on Scientific Computing , volume=. 2019 , publisher=

2019
[77]

Neural computation , volume=

Fast exact multiplication by the Hessian , author=. Neural computation , volume=. 1994 , publisher=

1994
[78]

Mathematics of computation , volume=

Numerical methods for computing angles between linear subspaces , author=. Mathematics of computation , volume=
[79]

Communications in Statistics-Simulation and Computation , volume=

A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines , author=. Communications in Statistics-Simulation and Computation , volume=. 1989 , publisher=

1989
[80]

2008 , isbn =

Vincent, Pascal and Larochelle, Hugo and Bengio, Yoshua and Manzagol, Pierre-Antoine , title =. 2008 , isbn =. doi:10.1145/1390156.1390294 , booktitle =

work page doi:10.1145/1390156.1390294 2008

