Interpretability of deep-learning methods applied to large-scale structure surveys

Alexandre Refregier; Gaspard Aymerich; Tomasz Kacprzak

arxiv: 2501.18333 · v1 · submitted 2025-01-30 · 🌌 astro-ph.CO

Pith reviewed 2026-05-23 04:41 UTC · model grok-4.3

keywords interpretability · convolutional neural networks · large-scale structure · cosmological parameters · Gaussian information · non-Gaussian information · deep learning

The pith

A convolutional neural network for large-scale structure surveys draws its predictions from a mix of Gaussian and non-Gaussian information, with emphasis on scales near the linear-to-nonlinear transition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests what information drives a convolutional neural network's cosmological parameter estimates by training and evaluating the network on survey data from which specific features have been deliberately removed. This approach reveals whether the network depends on the same statistical properties that classical summary statistics use or on additional aspects of the maps. A reader would care because it offers a direct way to inspect the otherwise hidden reasoning inside a deep-learning model applied to cosmology data. The results show the network combines both Gaussian and non-Gaussian signals and weights most heavily the structures whose sizes sit at the boundary between linear and nonlinear regimes.

Core claim

Training the network on degraded large-scale structure data shows that its parameter predictions rely on a mix of both Gaussian and non-Gaussian information, and that the network places particular emphasis on structures whose scales lie at the limit between the linear and nonlinear regimes.

What carries the argument

The technique of training and predicting with input maps from which targeted information has been removed, then measuring the resulting change in constraining power.
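As a rough illustration of how a "change in constraining power" could be quantified, the sketch below compares posterior samples of (Ωm, σ8) from a reference run and a degraded run. It assumes the common convention S8 = σ8 √(Ωm/0.3) and approximates the entropy by that of a Gaussian with the sample covariance. The function names, the entropy approximation, and the synthetic samples are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def s8_spread(samples):
    """Standard deviation of S8 = sigma8 * sqrt(Omega_m / 0.3).

    `samples` has shape (n, 2) with columns (Omega_m, sigma8).
    """
    omega_m, sigma8 = samples[:, 0], samples[:, 1]
    return np.std(sigma8 * np.sqrt(omega_m / 0.3))

def gaussian_entropy(samples):
    """Entropy of a Gaussian with the sample covariance (an approximation)."""
    cov = np.atleast_2d(np.cov(samples, rowvar=False))
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)

rng = np.random.default_rng(0)
mean = np.array([0.3, 0.8])
# Synthetic stand-ins for posterior samples (hypothetical widths).
reference = rng.multivariate_normal(mean, np.diag([0.02, 0.02]) ** 2, size=20000)
degraded = rng.multivariate_normal(mean, np.diag([0.04, 0.04]) ** 2, size=20000)

# A ratio above 1 and a positive entropy difference both indicate lost information.
ratio = s8_spread(degraded) / s8_spread(reference)
delta_h = gaussian_entropy(degraded) - gaussian_entropy(reference)
```

A spread ratio is easy to read but mostly reflects the core of the distribution, whereas an entropy also responds to the tails, so reporting both gives a fuller picture.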

If this is right

The network accesses information beyond what is captured by Gaussian statistics alone.
The emphasis on transitional scales implies sensitivity to mildly nonlinear structures.
The combination of information types may allow the network to break parameter degeneracies that affect traditional analyses.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the result holds, networks could be tested on mocks engineered to suppress non-Gaussian features to confirm the claimed dependence.
The same degradation method could be applied to other summary statistics to compare their information sources directly.
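One common way to build a mock with suppressed non-Gaussian features is Fourier phase randomisation, which keeps the power spectrum of a map but scrambles its phases. The sketch below is a minimal NumPy version on a periodic toy map. The source does not say that this technique is used, so the function name and the toy field are assumptions for illustration only.

```python
import numpy as np

def randomize_phases(kappa, rng):
    """Return a map with the same Fourier amplitudes as `kappa` but random phases."""
    f = np.fft.fft2(kappa)
    # Phases taken from the FFT of a real white-noise map are automatically
    # Hermitian-symmetric, so the output stays real.
    noise = rng.standard_normal(kappa.shape)
    phase = np.angle(np.fft.fft2(noise))
    g = np.abs(f) * np.exp(1j * phase)
    g[0, 0] = f[0, 0]  # keep the mean of the map unchanged
    return np.fft.ifft2(g).real

rng = np.random.default_rng(3)
kappa = np.exp(rng.standard_normal((32, 32)))  # skewed toy field, not a real survey
gaussianised = randomize_phases(kappa, rng)
```

Because only the phases change, any statistic built from the power spectrum alone is unaffected, while higher-order structure is largely erased.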

Load-bearing premise

Removing specific information from the survey data isolates the network's dependence on those features without the removal process itself creating new training dynamics or compensatory behaviors.

What would settle it

A measurement in which removing the claimed non-Gaussian or transitional-scale information leaves the network's error bars and bias unchanged.

Figures

Figures reproduced from arXiv: 2501.18333 by Alexandre Refregier, Gaspard Aymerich, Tomasz Kacprzak.

**Figure 1.** Redshift bins used for this work, chosen to be generally representative of a Stage III survey. Figure taken from Kacprzak & Fluri (2022).

The sum runs over the redshift shells (of thickness Δz_b), and the weight for each shell is defined as:

$$
W_b^{\mathrm{WL}} = \frac{\frac{3}{2}\,\Omega_m \int_{\Delta z_b} \frac{dz}{E(z)} \int_{z}^{z_s} dz'\, n(z')\, \frac{D(z)\,D(z,z')}{D(z')\,a(z)}}{\int_{\Delta z_b} \frac{dz}{E(z)} \int_{z_0}^{z_s} dz'\, n(z')} \qquad (2)
$$

The mean convergence of each map is subtracted: …

**Figure 2.** Example of a simulated 900 deg² survey with 4 redshift bins, obtained by creating a mosaic of 6 × 6 individual 5 × 5 degree maps, with Gaussian noise and Gaussian smoothing at scale R = 4 Mpc/h.

… emcee algorithm (Foreman-Mackey et al. 2013) with the distribution given by the MDN. 200 chains of 128k samples are run for each model (or a single 1.28m chain for plotting Fig. 5). 3. A novel approach to the inter…

**Figure 3.** Example of a map separated into 4 channels by a starlet transform. Only the first redshift bin is shown, but all four were included in the training. The top row is the initial map; the bottom row shows the 4 starlet transform channels and the corresponding scales.
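The scale separation in Figure 3 can be sketched with an isotropic undecimated wavelet ("starlet") transform built from the B3-spline à trous kernel. The version below uses periodic boundaries and returns three wavelet channels plus one coarse channel. The number of scales, the boundary handling, and the function names are assumptions, since the caption does not specify them.

```python
import numpy as np

B3_KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def _atrous_smooth(image, step):
    """Separable B3-spline smoothing with taps spaced `step` pixels apart."""
    out = image
    for axis in (0, 1):
        out = sum(w * np.roll(out, (i - 2) * step, axis=axis)
                  for i, w in enumerate(B3_KERNEL))
    return out

def starlet_transform(image, n_scales=3):
    """Return `n_scales` wavelet channels followed by one coarse channel."""
    channels = []
    current = np.asarray(image, dtype=float)
    for j in range(n_scales):
        smooth = _atrous_smooth(current, 2 ** j)
        channels.append(current - smooth)  # detail lost between two smoothings
        current = smooth
    channels.append(current)               # remaining large-scale residual
    return np.stack(channels)

rng = np.random.default_rng(0)
kappa = rng.standard_normal((64, 64))       # toy map, not survey data
channels = starlet_transform(kappa, n_scales=3)
band_limited = channels[1:3].sum(axis=0)    # keep only two intermediate scales
```

The channels sum back to the input map, so dropping a subset of them removes a chosen scale range and nothing else.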

**Figure 4.** Example of a map separated into three convergence regions. Only the first redshift bin is shown, but all four were included in the training. From left to right: the base map, the low convergence regions, the mid convergence regions, and the high convergence regions. Pixels outside of the range appear in yellow for both high and low κ and in deep blue for mid κ.
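A split into low, mid and high convergence regions like the one in Figure 4 can be sketched with simple thresholds. In the snippet below the thresholds are terciles of the map and out-of-range pixels are set to a constant fill value. Both choices, and the function name, are assumptions, since the caption does not give the thresholds or the masking value.

```python
import numpy as np

def split_convergence_regions(kappa, low_q=1 / 3, high_q=2 / 3, fill=0.0):
    """Split a map into low/mid/high kappa regions using quantile thresholds."""
    lo, hi = np.quantile(kappa, [low_q, high_q])
    masks = {
        "low": kappa < lo,
        "mid": (kappa >= lo) & (kappa <= hi),
        "high": kappa > hi,
    }
    # Pixels outside each region are replaced by the constant `fill`.
    regions = {name: np.where(mask, kappa, fill) for name, mask in masks.items()}
    return regions, masks

rng = np.random.default_rng(4)
kappa = rng.standard_normal((32, 32))   # toy map, not survey data
regions, masks = split_convergence_regions(kappa)
```

Each pixel belongs to exactly one region, so the three maps together contain the same values as the original.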

**Figure 5.** Constraints on [Ωm, σ8] obtained by the CNN. The black dots mark the true value of the parameters.

… degradations in σ_S8 and H is almost identical. The slight disagreements can be explained by one main difference between the two measurements: σ_S8 is not very affected by outliers when compared to the entropy, and probes mostly how tight the centre part of the distribution is. To better visualise the difference…

**Figure 6.** Network performance for various scale-related degradations. The left panel is σ_S8, the constraining power on S8; the right panel is H, the information entropy. The top four rows present the performance for various smoothing scales, for both the CNN and the PS-neural network. The lower rows present the performance of the CNN for various scale ranges, obtained by keeping only certain starlet transform channels.
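The smoothing degradation in Figure 6 amounts to a low-pass filter. A minimal sketch, assuming a periodic map and a Gaussian kernel whose width is given in pixels, is shown below. Converting a physical scale in Mpc/h to pixels depends on the redshift and the map resolution and is left out. The function name and the toy map are illustrative assumptions.

```python
import numpy as np

def gaussian_smooth(kappa, sigma_pix):
    """Low-pass filter a periodic 2D map with a Gaussian of width `sigma_pix` pixels."""
    ky = np.fft.fftfreq(kappa.shape[0])[:, None]    # cycles per pixel
    kx = np.fft.rfftfreq(kappa.shape[1])[None, :]
    k2 = (2.0 * np.pi) ** 2 * (kx ** 2 + ky ** 2)   # squared angular wavenumber
    window = np.exp(-0.5 * sigma_pix ** 2 * k2)     # Fourier transform of the kernel
    return np.fft.irfft2(np.fft.rfft2(kappa) * window, s=kappa.shape)

rng = np.random.default_rng(2)
kappa = rng.standard_normal((64, 64))               # toy map, not survey data
smoothed = {s: gaussian_smooth(kappa, s) for s in (0.0, 1.0, 4.0)}
```

Larger widths suppress more small-scale power while leaving the mean of the map untouched.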

**Figure 7.** CNN performance for various zero-loss transformations. The left panel is σ_S8, the constraining power on S8; the right panel is H, the information entropy. The results for 3- or 5-channel starlet transforms, as well as for a Fourier transform in the form of either real and imaginary parts or amplitude and phase, are presented.

**Figure 8.** CNN performance for various convergence region selections. The left panel is σ_S8, the constraining power on S8; the right panel is H, the information entropy. Low/mid/high κ denotes the low/mid/high convergence regions. The second row presents the performance of a network taking as input the high convergence regions in one channel and the Fourier transform amplitude in another, to mimic a widely used statistic…

**Figure 9.** CNN performance for redshift shuffling, redshift summing and shuffling all pixels. The left panel is σ_S8, the constraining power on S8; the right panel is H, the information entropy.
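The three degradations named in Figure 9 can be sketched on a stack of maps with shape (redshift bins, ny, nx). The interpretations below are assumptions: bins are permuted as whole maps, summing collapses them into one channel, and pixel shuffling permutes positions independently in each bin. The caption does not give these details and the function names are hypothetical.

```python
import numpy as np

def shuffle_redshift_bins(maps, rng):
    """Randomly reorder the redshift-bin axis, hiding which bin is which."""
    return maps[rng.permutation(maps.shape[0])]

def sum_redshift_bins(maps):
    """Collapse all redshift bins into a single channel."""
    return maps.sum(axis=0, keepdims=True)

def shuffle_pixels(maps, rng):
    """Permute pixel positions within each bin, keeping its one-point distribution."""
    flat = maps.reshape(maps.shape[0], -1)
    shuffled = np.stack([rng.permutation(row) for row in flat])
    return shuffled.reshape(maps.shape)

rng = np.random.default_rng(5)
maps = rng.standard_normal((4, 16, 16))   # toy stack, not survey data
z_shuffled = shuffle_redshift_bins(maps, rng)
z_summed = sum_redshift_bins(maps)
pix_shuffled = shuffle_pixels(maps, rng)
```

Pixel shuffling destroys all spatial correlations, so any constraining power that survives it must come from the distribution of pixel values alone.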

Abstract

Deep learning and convolutional neural networks in particular are powerful and promising tools for cosmological analysis of large-scale structure surveys. They are already providing similar performance to classical analysis methods using fixed summary statistics, are showing potential to break key degeneracies by better probe combination and will likely improve rapidly in the coming years as progress is made in the physical modelling through both software and hardware improvement. One key issue remains: unlike classical analysis, a convolutional neural network's decision process is hidden from the user as the network optimises millions of parameters with no direct physical meaning. This prevents a clear understanding of the potential limitations and biases of the analysis, making it hard to rely on as a main analysis method. In this work, we explore the behaviour of such a convolutional neural network through a novel method. Instead of trying to analyse a network a posteriori, i.e. after training has been completed, we study the impact on the constraining power of training the network and predicting parameters with degraded data where we removed part of the information. This allows us to gain an understanding of which parts and features of a large-scale structure survey are most important in the network's prediction process. We find that the network's prediction process relies on a mix of both Gaussian and non-Gaussian information, and seems to put an emphasis on structures whose scales are at the limit between linear and non-linear regimes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper tests CNN reliance in cosmology by training on degraded maps to remove Gaussian or scale-specific information, but the degradation step risks changing training dynamics in unaccounted ways.

The main takeaway is that training the network on versions of the survey data with targeted information removed lets you measure how much each component contributes to the final constraints. They report the model draws on both Gaussian and non-Gaussian content and leans on scales near the linear-to-nonlinear transition. That is the concrete result the abstract highlights.

Doing the removal at training time rather than only after the fact is the part that differs from most post-hoc attribution work. It forces the network to adapt without the excised features, which can give a more stable signal about dependence than saliency maps or gradient-based explanations that often prove brittle. The questions they ask line up with what matters for large-scale structure: how much extra information beyond the power spectrum is being used and at what physical scales. For groups already running CNNs on weak-lensing or clustering maps, seeing a direct test of this kind is useful even if the numbers are still preliminary.

The soft spot is exactly the one in the stress-test note. Removing modes or higher-order statistics can alter the variance, introduce boundary artifacts, or create new correlations in the remaining field. If those side effects are not controlled, the performance drop after retraining may reflect compensatory behavior rather than the intended information loss. The abstract supplies no power-spectrum matching, no checks on the degradation operator, and no error bars, so it is not yet possible to judge whether the attribution holds. If the full manuscript shows careful validation of the degraded fields and quantitative results with uncertainties, the method becomes more convincing.

This is for cosmologists who already use or plan to use CNNs for parameter inference and want a practical check on what the model is actually doing. A reader looking for a ready-to-apply diagnostic will find the idea worth trying, with the usual caveats about implementation details. It should go to peer review because the underlying problem is real and the proposed test is straightforward to evaluate once the controls are in place.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces a method to interpret convolutional neural networks for cosmological parameter inference from large-scale structure surveys. Rather than post-hoc analysis, the authors retrain networks on deliberately degraded data with targeted removal of Gaussian versus non-Gaussian information or specific scale ranges, then measure changes in constraining power to infer which features the network relies upon. The headline result is that the network draws on a combination of both Gaussian and non-Gaussian information while emphasizing scales near the linear-to-nonlinear transition.

Significance. If the degradation protocol can be shown to isolate feature dependence without confounding changes to training dynamics or data statistics, the approach would supply a practical, forward-modeling route to interpretability that is directly relevant to ongoing and future LSS analyses. The method is original in its emphasis on retraining rather than post-training attribution and could help address the black-box concern that currently limits adoption of DL methods as primary analysis tools.

major comments (2)

[Methods (degradation protocol)] The central claim that performance degradation after targeted information removal directly reveals the network's learned reliance on Gaussian/non-Gaussian content or specific scales rests on the untested assumption that the degradation operator leaves the remaining data statistics and optimization landscape unchanged except for the excised component. No quantitative controls (power-spectrum matching, preservation of higher-order statistics, or ablation studies on the degradation operator itself) are described that would rule out compensatory training dynamics or induced artifacts.
[Abstract and Results] The abstract and provided description supply no quantitative results, error bars, or implementation details on how data degradation is performed or on the magnitude of the reported performance changes. Without these, it is not possible to assess whether the evidence supports the stated conclusion that the network 'relies on a mix' or 'puts an emphasis' on particular scales.
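The power-spectrum control requested in the first major comment could look roughly like the following: measure an azimuthally averaged power spectrum before and after degradation and inspect the ratio per bin. This is a minimal sketch for a square periodic map. The binning, the normalisation, and the trivial amplitude rescaling used as a stand-in degradation are assumptions chosen for illustration.

```python
import numpy as np

def radial_power_spectrum(kappa, n_bins=8):
    """Azimuthally averaged power spectrum of a square, periodic 2D map.

    Returns bin centres (in units of the fundamental mode) and mean power per bin.
    """
    n = kappa.shape[0]
    power = np.abs(np.fft.fft2(kappa - kappa.mean())) ** 2 / kappa.size
    freq = np.fft.fftfreq(n) * n                       # integer mode numbers
    k = np.hypot(freq[:, None], freq[None, :])
    edges = np.linspace(0.5, n // 2 + 0.5, n_bins + 1)
    idx = np.digitize(k.ravel(), edges) - 1
    valid = (idx >= 0) & (idx < n_bins)                # drop DC and corner modes
    total = np.bincount(idx[valid], weights=power.ravel()[valid], minlength=n_bins)
    count = np.bincount(idx[valid], minlength=n_bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, total / np.maximum(count, 1)

rng = np.random.default_rng(1)
original = rng.standard_normal((64, 64))   # toy map, not survey data
degraded = 0.5 * original                  # stand-in for a real degradation
k, p_orig = radial_power_spectrum(original)
_, p_deg = radial_power_spectrum(degraded)
ratio = p_deg / p_orig
```

A ratio that stays flat across bins indicates the degradation did not alter the shape of the spectrum, whereas a scale-dependent ratio would flag a side effect worth reporting.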

minor comments (2)

[Methods] Notation for the degradation operators and the precise definition of 'Gaussian' versus 'non-Gaussian' information should be introduced explicitly with equations or pseudocode.
[Introduction / Data] The manuscript would benefit from a clear statement of the cosmological parameters being inferred and the survey specifications (volume, redshift range, noise model) used in the training sets.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive report. We address each major comment below and commit to revisions that strengthen the presentation of our degradation protocol and the quantitative support for our claims.

Referee: [Methods (degradation protocol)] The central claim that performance degradation after targeted information removal directly reveals the network's learned reliance on Gaussian/non-Gaussian content or specific scales rests on the untested assumption that the degradation operator leaves the remaining data statistics and optimization landscape unchanged except for the excised component. No quantitative controls (power-spectrum matching, preservation of higher-order statistics, or ablation studies on the degradation operator itself) are described that would rule out compensatory training dynamics or induced artifacts.

Authors: We agree that explicit validation of the degradation operator is necessary to support the interpretability conclusions. The manuscript describes the targeted removal procedures but does not present the requested quantitative controls. We will add these in the revised methods section, including direct comparisons of the power spectrum and selected higher-order statistics before and after degradation, as well as ablation tests on the degradation parameters to check for induced artifacts or changes in training dynamics. revision: yes

Referee: [Abstract and Results] The abstract and provided description supply no quantitative results, error bars, or implementation details on how data degradation is performed or on the magnitude of the reported performance changes. Without these, it is not possible to assess whether the evidence supports the stated conclusion that the network 'relies on a mix' or 'puts an emphasis' on particular scales.

Authors: We accept that the current abstract is qualitative and lacks the requested numerical support. We will revise the abstract to report the magnitude of performance changes (e.g., relative increases in parameter uncertainties) when Gaussian or non-Gaussian information is removed, together with error bars obtained from multiple independent realizations. Implementation details of the degradation steps will be summarized concisely in the abstract or moved to a prominent position in the results section. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical ablation method is independent of its inputs

The paper presents an interpretability study that trains CNNs on deliberately degraded survey data (removing Gaussian/non-Gaussian content or specific scales) and measures resulting changes in parameter constraints. This diagnostic is not derived from any fitted parameter that is then re-predicted, nor does it rely on self-definitional equations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled via self-citation. No load-bearing step reduces the central claim (reliance on mixed Gaussian/non-Gaussian information at linear-to-nonlinear scales) to a tautology or to the degradation operator itself. The approach is self-contained against external benchmarks of network performance on held-out data, yielding a normal non-finding of circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, axioms, or invented entities are described; the central claim rests on the unstated premise that the chosen degradation procedure cleanly isolates information usage.

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 2 internal anchors

[1] Abadi, M., Agarwal, A., Barham, P., et al. 2016, Tensorflow: Large-scale machine learning on heterogeneous distributed systems, arXiv.org
[2] Abbott, T. M. C., Aguena, M., Alarcon, A., et al. 2022, Physical Review D, 105
[3] Aiola, S., Calabrese, E., Maurin, L., et al. 2020, J. Cosmol. Astropart. Phys., 2020, 047
[4] Amon, A., Gruen, D., Troxel, M. A., et al. 2022, Phys. Rev. D, 105, 023514
[5] Balkenhol, L., Dutcher, D., Mancini, A. S., et al. 2023, Phys. Rev. D, 108, 023510
[6] Bernardeau, F., Waerbeke, L. V., & Mellier, Y. 1997, Astronomy and Astrophysics, 322, 1
[7] Bridle, S. & King, L. 2007, New J. Phys., 9, 444
[8] Dietrich, J. P. & Hartlap, J. 2010, Monthly Notices of the Royal Astronomical Society, 402, 1049
[9] Dvorkin, C., Mishra-Sharma, S., Nord, B., et al. 2022, Machine Learning and Cosmology, arXiv:2203.08056 [astro-ph, physics:hep-ph, stat]
[10] Fluri, J., Kacprzak, T., Lucchi, A., et al. 2019, Physical Review D, 100
[11] Fluri, J., Kacprzak, T., Lucchi, A., et al. 2022, A full wCDM analysis of KiDS-1000 weak lensing maps using Deep Learning, arXiv.org
[12] Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, Publications of the Astronomical Society of the Pacific, 125, 306
[13] Friedrich, O., Gruen, D., DeRose, J., et al. 2018, Phys. Rev. D, 98, 023508
[14] Gong, Z., Halder, A., Barreira, A., Seitz, S., & Friedrich, O. 2023, J. Cosmol. Astropart. Phys., 2023, 040
[15] Gong, Z., Halder, A., Bohrdt, A., Seitz, S., & Gebauer, D. 2024, C3NN: Cosmological Correlator Convolutional Neural Network – an interpretable machine learning tool for cosmological analyses, arXiv:2402.09526 [astro-ph]
[16] Hahn, O. & Abel, T. 2011, Monthly Notices of the Royal Astronomical Society, 415, 2101
[17] He, K., Zhang, X., Ren, S., & Sun, J. 2015, Deep residual learning for image recognition, arXiv.org
[18] Heydenreich, S., Linke, L., Burger, P., & Schneider, P. 2023, A&A, 672, A44
[19] Heymans, C., Tröster, T., Asgari, M., et al. 2021, A&A, 646, A140
[20] Hirata, C. M. & Seljak, U. 2004, Phys. Rev. D, 70, 063526
[21] Joachimi, B., Mandelbaum, R., Abdalla, F. B., & Bridle, S. L. 2011, A&A, 527, A26
[22] Kacprzak, T. & Fluri, J. 2022, Phys. Rev. X, 12, 031029
[23] Kingma, D. P. & Ba, J. 2017, Adam: A Method for Stochastic Optimization, arXiv:1412.6980 [cs]
[24] Kirk, D., Rassat, A., Host, O., & Bridle, S. 2012, Monthly Notices of the Royal Astronomical Society, 424, 1647
[25] LeCun, Y., Boser, B., Denker, J. S., et al. 1989, Neural Computation, 1, 541
[26] LeCun, Y., Bottou, L., Bengio, Y., & Ha, P. 1998
[27] Lu, T., Haiman, Z., & Li, X. 2023, Monthly Notices of the Royal Astronomical Society, 521, 2050
[28] Lucie-Smith, L., Peiris, H. V., Pontzen, A., Nord, B., & Thiyagalingam, J. 2024, Phys. Rev. D, 109, 063524
[29] Matilla, J. M. Z., Sharma, M., Hsu, D., & Haiman, Z. 2020, Phys. Rev. D, 102, 123506, arXiv:2007.06529 [astro-ph]
[30] Pan, S., Liu, M., Forero-Romero, J., et al. 2020, Sci. China Phys. Mech. Astron., 63, 110412
[31] Piras, D. & Lombriser, L. 2024, Phys. Rev. D, 110, 023514, arXiv:2310.10717 [astro-ph]
Planck Collaboration, Ade, P. A. R., Aghanim, N., et al. 2014, A&A, 571, A16
Planck Collaboration, Aghanim, N., Akrami, Y., et al. 2020, A&A, 641, A6
[32] Porredon, A., Crocce, M., Elvin-Poole, J., et al. 2022, Phys. Rev. D, 106, 103530
[33] Potter, D., Stadel, J., & Teyssier, R. 2017, Comput. Astrophys., 4, 2
[34] Ravanbakhsh, S., Oliva, J., Fromenteau, S., et al. 2016, in Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, ICML’16 (New York, NY, USA: JMLR.org), 2407–2416
[35] Refregier, A. 2003, Annu. Rev. Astron. Astrophys., 41, 645
[36] Riess, A. G., Yuan, W., Macri, L. M., et al. 2022, A Comprehensive Measurement of the Local Value of the Hubble Constant with 1 km/s/Mpc Uncertainty from the Hubble Space Telescope and the SH0ES Team, ApJL, 934, L7, arXiv:2112.04510 [astro-ph]
[37] Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J., & Müller, K.-R. 2021, Proceedings of the IEEE, 109, 247
[38] Seetharaman, P., Wichern, G., Pardo, B., & Roux, J. L. 2020, AutoClip: Adaptive gradient clipping for Source Separation Networks, arXiv.org
[39] Sgier, R., Réfrégier, A., Amara, A., & Nicola, A. 2019, J. Cosmol. Astropart. Phys., 2019, 044
[40] Shannon, C. E. 1948, Bell System Technical Journal, 27, 379
[41] Starck, J.-L., Fadili, J., & Murtagh, F. 2007, IEEE Transactions on Image Processing, 16, 297
[42] Villaescusa-Navarro, F., Wandelt, B. D., Anglés-Alcázar, D., et al. 2022, ApJ, 928, 44
[43] Villanueva-Domingo, P. & Villaescusa-Navarro, F. 2021, ApJ, 907, 44
Zürcher, D., Fluri, J., Sgier, R., et al. 2022, Monthly Notices of the Royal Astronomical Society, 511, 2075

Appendix A: Results including intrinsic alignment

In this appendix, we present the results obtained when modelling...
