The Degeneracy Distillery
Pith reviewed 2026-06-26 08:59 UTC · model grok-4.3
The pith
The degeneracy distillery finds symbolic transformations that flatten the Fisher information matrix globally from parameter-simulation pairs alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By exploring the information geometry of the likelihood, degeneracies are characterized as an intrinsic property of the physical model requiring no realised data observation; symbolic coordinate transformations derived from parameter-simulation pairs then identify independent parameter effects and flatten the Fisher information in expectation globally.
What carries the argument
Estimation and symbolic flattening of the Fisher information matrix from parameter-simulation pairs to remove degeneracies.
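A minimal sketch of what detecting and removing a degeneracy can look like, assuming a hypothetical toy simulator x ~ N(a·b, σ²) that is not taken from the paper. The function names, the uniform prior range, and the hand-picked coordinates p = a·b, q = a/b are all illustrative assumptions; the paper's own estimator and symbolic search are not reproduced here.

```python
import numpy as np

def fisher_ab(a, b, sigma=1.0):
    """Pointwise Fisher matrix of x ~ N(a*b, sigma^2) in coordinates (a, b)."""
    grad_mean = np.array([b, a])                 # d(a*b)/da, d(a*b)/db
    return np.outer(grad_mean, grad_mean) / sigma**2

def fisher_pq(a, b, sigma=1.0):
    """Same Fisher matrix pulled back to p = a*b, q = a/b."""
    # Jacobian d(a, b)/d(p, q), using a = sqrt(p*q) and b = sqrt(p/q)
    jac = 0.5 * np.array([[1.0 / b, b],
                          [1.0 / a, -b * b / a]])
    return jac.T @ fisher_ab(a, b, sigma) @ jac

rng = np.random.default_rng(0)
draws = rng.uniform(1.0, 2.0, size=(2000, 2))    # assumed prior over (a, b)

# At any single point the matrix has rank one: one direction is unconstrained.
local_eigs = np.linalg.eigvalsh(fisher_ab(1.5, 1.5))

# Averages over the prior in the original and in the transformed coordinates.
mean_ab = np.mean([fisher_ab(a, b) for a, b in draws], axis=0)
mean_pq = np.mean([fisher_pq(a, b) for a, b in draws], axis=0)
```

In this toy, `mean_pq` is constant and diagonal (information only along p), whereas `mean_ab` is generally full rank even though every pointwise matrix is rank one, so the degenerate direction is only made explicit by the coordinate change.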
If this is right
- Degeneracies become identifiable without any realised data observations.
- Neural posterior estimation requires up to 10 times fewer simulations while maintaining calibration.
- The resulting coordinates provide physical insight into which parameter combinations act independently on the data.
- Flattening holds globally in expectation rather than only at a single point.
Where Pith is reading between the lines
- Precomputed transformations could speed up repeated inference tasks on similar model families.
- The same machinery might expose structural deficiencies in a model by revealing degeneracies that cannot be removed.
- The approach may transfer to other simulation-heavy inverse problems where distinguishability of parameters limits performance.
Load-bearing premise
That symbolic coordinate transformations exist which can flatten the Fisher information matrix in expectation globally for the models of interest and can be reliably estimated from parameter-simulation pairs alone.
What would settle it
A test case in which no symbolic transformation flattens the expected Fisher information or in which the method requires at least as many simulations as standard posterior estimation at matched calibration.
Original abstract
When two or more parameters or labels produce similar data, they are degenerate, or hard to distinguish. Degeneracies render both label prediction and inverse problems difficult, since both machine learning algorithms and probabilistic samplers rely on the distinguishability of data and its gradients with respect to parameters. However, identifying degeneracies in physical models or real-world datasets can be elucidating about the choice of model or the underlying process that produces the data. We present the degeneracy distillery, a method that (1) detects and (2) resolves degenerate parameter combinations (a) automatically and (b) symbolically, from parameter-data (or parameter-simulation) pairs alone, through estimation and flattening of the Fisher information matrix. By exploring the information geometry of the likelihood, we characterize degeneracies as an intrinsic property of the physical model, requiring no realised data observation. We demonstrate our approach on a range of synthetic and real-world problems, discovering symbolic coordinate transformations that identify the combinations of parameters of a model which yield independent effects on the data. The resulting coordinates flatten the Fisher information in expectation globally, in contrast to posterior-based methods that flatten only at a single point, and substantially reduce the simulation budget required for downstream neural posterior estimation. In test cases we require up to $10\times$ fewer simulations for posterior estimation at matched validation calibration whilst simultaneously gaining physical insight on the system.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents the Degeneracy Distillery, a method that detects and resolves degenerate parameter combinations automatically and symbolically from parameter-simulation pairs alone by estimating the Fisher information matrix and finding coordinate transformations that flatten it in expectation. The central claims are that the resulting coordinates achieve global flattening (unlike posterior-based methods that act only at a single point) and reduce the simulation budget for downstream neural posterior estimation by up to 10× at matched validation calibration, while also yielding physical insight.
Significance. If the claims on global flattening and simulation reduction hold with rigorous validation, the work would provide a useful tool for reparameterizing degenerate models in simulation-based inference, combining information geometry with symbolic methods to improve efficiency and interpretability without requiring realized observations.
major comments (2)
- [Abstract] Abstract: the claim that the coordinates 'flatten the Fisher information in expectation globally' is load-bearing for the novelty relative to posterior-based methods. The stress-test correctly identifies that an estimate formed by averaging over parameter-simulation pairs recovers only an averaged metric; the manuscript must supply the explicit derivation or proof (presumably in the methods) showing that the discovered symbolic transformation renders the local metric flat at every point rather than merely its expectation, particularly when the underlying information manifold has non-vanishing curvature.
- [Results] Results (or equivalent section reporting the 10× claim): the quantitative statement 'up to 10× fewer simulations for posterior estimation at matched validation calibration' requires supporting tables or figures with the exact calibration metric (e.g., coverage or C2ST), the baseline method, and error bars across the synthetic and real-world test cases; without these, the efficiency gain cannot be assessed as load-bearing evidence.
minor comments (1)
- [Abstract] Abstract: the phrase 'synthetic and real-world problems' is vague; a parenthetical list of the specific models or datasets would improve immediate readability.
Simulated Author's Rebuttal
Thank you for the opportunity to respond to the referee's report. We address the major comments point-by-point below. Where revisions are needed to clarify or strengthen the manuscript, we indicate our plans to update the next version.
- Referee: [Abstract] Abstract: the claim that the coordinates 'flatten the Fisher information in expectation globally' is load-bearing for the novelty relative to posterior-based methods. The stress-test correctly identifies that an estimate formed by averaging over parameter-simulation pairs recovers only an averaged metric; the manuscript must supply the explicit derivation or proof (presumably in the methods) showing that the discovered symbolic transformation renders the local metric flat at every point rather than merely its expectation, particularly when the underlying information manifold has non-vanishing curvature.
Authors: We thank the referee for this insightful comment, which helps clarify the scope of our contribution. The degeneracy distillery derives a symbolic transformation by estimating and diagonalizing the expected Fisher information matrix from parameter-simulation pairs. This yields a global coordinate system in which the expected information matrix is flattened (i.e., diagonal and scaled to identity in expectation). The 'globally' qualifier refers to the fact that the transformation is not point-dependent, unlike methods that reparameterize based on a local posterior approximation. We do not claim that the local Fisher metric is exactly flat at every individual point when the manifold has curvature; rather, the expectation over the parameter distribution is flattened. The derivation of the transformation from the averaged FIM is provided in the Methods. To prevent misinterpretation, we will revise the abstract to emphasize 'flattens the expected Fisher information globally' and add a clarifying paragraph in the Methods discussing the distinction from pointwise flattening and the role of manifold curvature. revision: yes
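The distinction drawn in this exchange, between flattening the expected Fisher information and flattening it at every point, can be illustrated with a one-parameter toy. The sketch below assumes a hypothetical model x ~ N(θ², 1) and a uniform prior range, neither of which comes from the paper; the constant rescaling stands in for any transformation fitted only to the averaged matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.uniform(0.5, 1.5, size=100_000)      # assumed prior draws

# Toy model x ~ N(theta**2, 1): pointwise Fisher information is (d mean / d theta)^2.
fisher_theta = (2.0 * theta) ** 2

# (1) Constant rescaling phi = c * theta, with c chosen so the averaged Fisher is 1.
c = np.sqrt(fisher_theta.mean())
fisher_rescaled = fisher_theta / c**2

# (2) Nonlinear map phi = theta**2, for which d theta / d phi = 1 / (2 * theta).
fisher_nonlinear = fisher_theta * (1.0 / (2.0 * theta)) ** 2

spread_rescaled = fisher_rescaled.std()          # stays well above zero
spread_nonlinear = fisher_nonlinear.std()        # vanishes up to rounding
```

Both transformations give an average of one, but only the nonlinear map gives one at every θ. Whether a single global map of the second kind exists for a given model is exactly what the referee's curvature remark is probing.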
- Referee: [Results] Results (or equivalent section reporting the 10× claim): the quantitative statement 'up to 10× fewer simulations for posterior estimation at matched validation calibration' requires supporting tables or figures with the exact calibration metric (e.g., coverage or C2ST), the baseline method, and error bars across the synthetic and real-world test cases; without these, the efficiency gain cannot be assessed as load-bearing evidence.
Authors: We agree that the 10× efficiency claim requires more detailed quantitative support to be fully convincing. While the manuscript presents results demonstrating reduced simulation budgets with matched calibration in figures, we will add a dedicated table in the Results section. This table will report the exact calibration metrics used (C2ST and coverage), the baseline (standard NPE without degeneracy distillery), the achieved simulation reduction factors, and error bars from repeated experiments for all synthetic and real-world cases. This will make the evidence for the efficiency gains explicit and reproducible. revision: yes
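For readers unfamiliar with the coverage metric named in this exchange, the following is a minimal sketch of an empirical coverage check for a one-dimensional posterior. It assumes a conjugate Gaussian toy (prior N(0, 1), likelihood N(θ, 1)) so that the exact posterior is known; the function name and the deliberately overconfident comparison posterior are illustrative and do not correspond to any experiment in the paper.

```python
import numpy as np

def empirical_coverage(theta_true, posterior_samples, level):
    """Fraction of cases whose true value lies in the central credible interval."""
    tail = (1.0 - level) / 2.0
    lo = np.quantile(posterior_samples, tail, axis=1)
    hi = np.quantile(posterior_samples, 1.0 - tail, axis=1)
    return np.mean((theta_true >= lo) & (theta_true <= hi))

rng = np.random.default_rng(2)
n_cases, n_samples = 2000, 500

# Conjugate toy: theta ~ N(0, 1), x | theta ~ N(theta, 1)  =>  theta | x ~ N(x/2, 1/2).
theta_true = rng.normal(size=n_cases)
x = theta_true + rng.normal(size=n_cases)
post_mean = x[:, None] / 2.0
post_std = np.sqrt(0.5)

exact = rng.normal(post_mean, post_std, size=(n_cases, n_samples))
overconfident = rng.normal(post_mean, 0.5 * post_std, size=(n_cases, n_samples))

cov_exact = empirical_coverage(theta_true, exact, level=0.68)
cov_over = empirical_coverage(theta_true, overconfident, level=0.68)
```

A calibrated posterior should give coverage close to the nominal level, while the overconfident one falls well below it. Comparing simulation budgets "at matched calibration" would then mean comparing runs whose coverage (or C2ST) agrees within the reported error bars.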
Circularity Check
No significant circularity in the derivation chain
The paper describes a procedure that estimates the expected Fisher information matrix from parameter-simulation pairs and then applies symbolic search to discover coordinate transformations that flatten this matrix. The flattening result is the explicit output of the search step rather than a quantity presupposed by the inputs; the claim of global flattening in expectation follows directly from the optimization objective and is not equivalent to the input data by construction. No self-citation load-bearing step, uniqueness theorem imported from the same authors, or fitted parameter renamed as a prediction appears in the abstract or described method. The approach builds on the standard definition of the Fisher matrix as a measure of local distinguishability and therefore remains self-contained against external benchmarks.
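For reference, the standard definitions this rationale relies on can be written out as follows. The notation ($\theta$ for the original parameters, $\phi$ for the transformed ones, $J$ for the Jacobian, $p(\theta)$ for the distribution of parameter draws) is chosen here for illustration and may differ from the paper's; the last line is a reading of "flattened in expectation" as the simulated rebuttal describes it, not a statement taken from the manuscript.

$$F_{ij}(\theta) = \mathbb{E}_{x \sim p(x \mid \theta)}\!\left[\frac{\partial \log p(x \mid \theta)}{\partial \theta_i}\,\frac{\partial \log p(x \mid \theta)}{\partial \theta_j}\right]$$

$$\tilde F(\phi) = J^{\top} F(\theta)\, J, \qquad J_{ij} = \frac{\partial \theta_i}{\partial \phi_j}$$

$$\mathbb{E}_{\theta \sim p(\theta)}\!\left[\tilde F\big(\phi(\theta)\big)\right] \approx \mathbb{1}$$

The second line is the usual tensor transformation rule for a metric, so any candidate transformation can be checked by pulling the estimated Fisher matrix back through its Jacobian.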
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The Fisher information matrix estimated from parameter-simulation pairs characterizes degeneracies as an intrinsic property of the physical model.
Appendix excerpts
The appendices are organised as follows. Section A proves that the Fishnet objective recovers the posterior...
We can now substitute this into our earlier expression for $\sinh\beta$:
$$\sinh\beta = \frac{\beta}{\xi}\,\frac{\mu-\mu_\star}{\sigma} = \frac{\operatorname{sgn}(\xi)}{v_\star}\sqrt{1+\frac{\eta^2}{\xi^2}}\;\frac{\mu-\mu_\star}{\sigma}. \tag{73}$$
Since $\beta > 0$, we require the condition $\operatorname{sgn}(\xi) = \operatorname{sgn}(\mu-\mu_\star)$. But given that $\operatorname{sgn}(\sinh\beta/\beta) = 1$, from the earlier formula, we know that this must be true. We therefore find that we can obtain $\beta$ from
$$\sinh\beta = \frac{\sqrt{\left[\tfrac{1}{2}(\mu-\mu_\star)^2 + (\sigma-\sigma_\star)^2\right]\left[\tfrac{1}{2}(\mu-\mu_\star)^2 + (\sigma+\sigma_\star)^2\right]}}{2\sigma_\star\sigma}, \tag{74}$$
...
- If $\mu=\mu_\star$ and $\sigma=\sigma_\star$, then $\xi=\eta=0$.
- Otherwise, $\beta$ can be obtained from evaluating Eq. (74). The signs of $\xi$ and $\eta$ are given by Eq. (75). To obtain the magnitudes of $\xi$ and $\eta$, one can use the results:
  (a) If $\mu=\mu_\star$, then $|\eta| = v_\star\beta$ and $\xi=0$.
  (b) Otherwise, use Eq. (72).
D.6 Metric and Jacobian for Geodesic Normal Coordinates
Now we have functions for $u$ and $v$ as a function of $\xi$ and $\eta$, let us find the metric in these new...
To verify that this has the desired behaviour, let us expand the metric around the point $\xi=\eta=0$. In this case, we obtain
$$\tilde g_{ab} \approx \mathbb{1} + \frac{1}{6}\begin{pmatrix} \eta^2 & -\eta\xi \\ -\eta\xi & \xi^2 \end{pmatrix} + \mathcal{O}\!\left(|x|^3\right), \tag{81}$$
so indeed, in these coordinates, the metric is flat at zeroth and linear order, with the correction occurring at quadratic order. Using the series expansion of Eq. (68), this can be written a...
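As a hedged numerical cross-check of Eq. (74), the sketch below compares it against the closed-form distance on the Poincaré half-plane. Two things are assumptions here because the surrounding derivation is elided: that $\beta$ is the half-plane geodesic distance between $(\mu_\star, \sigma_\star)$ and $(\mu, \sigma)$, and that the half-plane coordinates are $u = \mu/\sqrt{2}$, $v = \sigma$. The function names are illustrative.

```python
import numpy as np

def sinh_beta_eq74(mu, sigma, mu_s, sigma_s):
    """sinh(beta) as read from Eq. (74):
    sqrt([dmu^2/2 + (sigma - sigma_s)^2] * [dmu^2/2 + (sigma + sigma_s)^2]) / (2 sigma_s sigma)."""
    half_dmu2 = 0.5 * (mu - mu_s) ** 2
    num = np.sqrt((half_dmu2 + (sigma - sigma_s) ** 2)
                  * (half_dmu2 + (sigma + sigma_s) ** 2))
    return num / (2.0 * sigma_s * sigma)

def half_plane_distance(mu, sigma, mu_s, sigma_s):
    """Poincare half-plane distance, ASSUMING u = mu / sqrt(2) and v = sigma."""
    du2 = 0.5 * (mu - mu_s) ** 2
    dv2 = (sigma - sigma_s) ** 2
    return np.arccosh(1.0 + (du2 + dv2) / (2.0 * sigma * sigma_s))

point = (0.3, 1.7, -0.4, 0.9)                    # arbitrary (mu, sigma, mu_s, sigma_s)
beta = half_plane_distance(*point)
agrees = np.isclose(np.sinh(beta), sinh_beta_eq74(*point))
```

Under these assumptions the two expressions agree, which follows from $\sinh^2\beta = (\cosh\beta - 1)(\cosh\beta + 1)$ applied to the half-plane distance formula.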