arxiv: 1712.00012 · v2 · pith:KBQKIDGHnew · submitted 2017-11-30 · 🌌 astro-ph.CO

Generalized massive optimal data compression

classification 🌌 astro-ph.CO
keywords data · compression · optimal · statistics · general · parameters · case · compressed

Data compression has become one of the cornerstones of modern astronomical data analysis, with the vast majority of analyses compressing large raw datasets down to a manageable number of informative summaries. In this paper we provide a general procedure for optimally compressing $N$ data down to $n$ summary statistics, where $n$ is equal to the number of parameters of interest. We show that compression to the score function -- the gradient of the log-likelihood with respect to the parameters -- yields $n$ compressed statistics that are optimal in the sense that they preserve the Fisher information content of the data. Our method generalizes earlier work on linear Karhunen-Loève compression for Gaussian data whilst recovering both lossless linear compression and quadratic estimation as special cases when they are optimal. We give a unified treatment that also includes the general non-Gaussian case as long as mild regularity conditions are satisfied, producing optimal non-linear summary statistics when appropriate. As a worked example, we derive explicitly the $n$ optimal compressed statistics for Gaussian data in the general case where both the mean and covariance depend on the parameters.
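
As a rough illustration of the Gaussian worked example mentioned in the abstract, the sketch below evaluates the score (the gradient of the Gaussian log-likelihood) at a fiducial parameter point, giving one compressed statistic per parameter. It uses the standard expression for the gradient of a multivariate Gaussian log-likelihood and is not taken from the paper's text. The function names `gaussian_score` and `toy_model`, and the toy model itself (mean proportional to a template `x`, covariance proportional to the identity), are hypothetical choices made only for this example.

```python
import numpy as np


def gaussian_score(d, mu, C, dmu, dC):
    """Score of a Gaussian log-likelihood, evaluated at a fiducial point.

    d   : (N,)      data vector
    mu  : (N,)      model mean at the fiducial parameters
    C   : (N, N)    model covariance at the fiducial parameters
    dmu : (n, N)    derivative of the mean w.r.t. each of the n parameters
    dC  : (n, N, N) derivative of the covariance w.r.t. each parameter
    Returns an (n,) vector: N data compressed to n summaries.
    """
    r = d - mu
    Cinv_r = np.linalg.solve(C, r)  # C^{-1} (d - mu)
    t = np.empty(dmu.shape[0])
    for a in range(dmu.shape[0]):
        linear = dmu[a] @ Cinv_r                      # term linear in the data
        quadratic = 0.5 * Cinv_r @ dC[a] @ Cinv_r     # term quadratic in the data
        trace = 0.5 * np.trace(np.linalg.solve(C, dC[a]))
        t[a] = linear + quadratic - trace
    return t


def toy_model(theta, x):
    """Hypothetical model: mean = theta[0] * x, covariance = theta[1] * I."""
    N = x.size
    mu = theta[0] * x
    C = theta[1] * np.eye(N)
    dmu = np.stack([x, np.zeros(N)])
    dC = np.stack([np.zeros((N, N)), np.eye(N)])
    return mu, C, dmu, dC


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 100)
    theta_fid = np.array([2.0, 0.5])  # assumed fiducial parameters
    mu, C, dmu, dC = toy_model(theta_fid, x)
    d = rng.multivariate_normal(mu, C)
    print(gaussian_score(d, mu, C, dmu, dC))  # 100 data -> 2 summaries
```

In this sketch, setting `dC` to zero leaves only the term linear in the data, while setting `dmu` to zero leaves only the quadratic and trace terms, which loosely mirrors the linear-compression and quadratic-estimation special cases named in the abstract.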

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Hidden Geometry of Astrophysical Spectra: Path-Signatures of Line Profiles

    astro-ph.IM 2026-06 unverdicted novelty 6.0

    Defines path-signature descriptors for spectral line profiles and shows they enable unsupervised clustering of MaNGA Hα spaxels into spatially coherent classes whose stacked spectra recover large-scale velocity patterns.

  2. Dark Energy Survey Year 3 results: optimized $w$CDM simulation-based inference with weak lensing map-level hybrid statistics

    astro-ph.CO 2026-06 unverdicted novelty 6.0

    DES Y3 weak lensing analysis with hybrid map-level statistics and simulation-based inference yields S8 = 0.808 ± 0.017, Ωm = 0.325 ± 0.024, and w < -0.766, improving the figure of merit by 60% over prior state-of-the-art.