arxiv: 2507.03951 · v3 · submitted 2025-07-05 · 📡 eess.SP · q-bio.QM

Structure from Noise: Confirmation Bias in Particle Picking in Structural Biology

Pith reviewed 2026-05-19 06:36 UTC · model grok-4.3

classification 📡 eess.SP q-bio.QM
keywords: cryo-EM · particle picking · template matching · confirmation bias · structure from noise · maximum likelihood estimation · Gaussian mixture model · cryo-ET

The pith

Template matching on pure noise yields maximum-likelihood estimates that converge to deterministic transforms of the chosen templates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a mathematical framework to quantify bias introduced by template matching in the particle-picking stage of cryo-EM and cryo-ET pipelines. It shows that when this step is applied to data containing only noise, the downstream maximum-likelihood estimates of class means and reconstructed volumes converge to fixed, noise-dependent transformations of the user-supplied templates. A sympathetic reader would care because particle picking occurs early and its outputs feed directly into classification and 3D reconstruction, so any systematic distortion can propagate as confirmation bias. The analysis further tracks how the size of the bias varies with noise statistics, number of samples, dimension, and detection threshold, and the theory is checked with standard software on low-SNR test cases.

Core claim

When template matching is applied to pure noise, then under broad noise models, the resulting maximum-likelihood estimates converge asymptotically to deterministic, noise-dependent transforms of the user-specified templates, yielding a structure from noise effect. We further characterize how the resulting bias depends on the noise statistics, sample size, dimension, and detection threshold.

What carries the argument

A bias analysis that treats the output of template-matching detection as the input to two downstream tasks, maximum-likelihood estimation in a Gaussian mixture model and 3D volume reconstruction, and proves asymptotic convergence of both under pure-noise conditions.

If this is right

  • In low-SNR data the extracted particle stack can contain reproducible artifacts that mimic real structures.
  • The bias in estimated class means is a fixed function of the templates and the noise covariance.
  • Detection threshold and sample size directly control the magnitude of the induced distortion.
  • Controlled experiments with common cryo-EM packages reproduce the predicted structure-from-noise artifacts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Users may need to validate template choices against noise-only simulations before applying them to real data.
  • The same convergence phenomenon could appear in deep-learning particle pickers that implicitly encode template-like priors.
  • Bias-correction steps inserted after picking might reduce downstream distortion even when real particles are mixed with noise.

Load-bearing premise

The derivation isolates the bias by assuming the input micrographs or tomograms contain only noise with no real particles present.

What would settle it

Apply standard template-matching software to simulated pure-noise micrographs using known templates, then verify whether the reconstructed volumes equal the predicted deterministic transforms of those templates rather than random fluctuations.
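
A toy version of this check can be sketched in a few lines. The sketch below is not the paper's experiment: it assumes white Gaussian noise with unit variance, a single unit-norm template, and inner-product scoring, and it averages the picked patches instead of running a full reconstruction. The function name `pick_from_noise` and the 1D sinusoidal template are hypothetical, chosen only for illustration.

```python
import numpy as np

def pick_from_noise(template, n_patches, threshold, rng):
    """Average pure-noise patches whose correlation with the template exceeds the threshold."""
    x = template / np.linalg.norm(template)            # unit-norm template (assumption)
    noise = rng.standard_normal((n_patches, x.size))   # i.i.d. N(0, 1) patches, no signal present
    scores = noise @ x                                 # inner-product template matching
    picked = noise[scores > threshold]                 # keep only patches above the threshold
    return picked.mean(axis=0), len(picked), x

rng = np.random.default_rng(0)
template = np.sin(np.linspace(0.0, 4.0 * np.pi, 32))   # hypothetical 1D "template"
threshold = 2.0

mean_picked, n_picked, x = pick_from_noise(template, 100_000, threshold, rng)

# Cosine similarity between the average of the picked noise and the template.
cosine = float(mean_picked @ x / np.linalg.norm(mean_picked))
# Estimated scale of the template component in the average.
alpha_hat = float(mean_picked @ x)

print(f"picked {n_picked} of 100000 noise patches")
print(f"cosine similarity to template: {cosine:.4f}")
print(f"estimated scale: {alpha_hat:.4f} (threshold {threshold})")
```

If the average of the picked patches were a random fluctuation, the cosine similarity would hover near zero; under the assumptions above it should instead sit close to one, with an estimated scale no smaller than the threshold.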

Figures

Figures reproduced from arXiv: 2507.03951 by Alon Zabatani, Amnon Balanov, Tamir Bendory.

Figure 1: The cryo-electron microscope (cryo-EM) computational pipeline and the structure from noise …
Figure 2: Structure from noise in the particle-picking stage of the cryo-electron to…
Figure 3: Effect of template-matching thresholds on confirmation bias during the …
Figure 4: Confirmation bias in particle picking with particles present in the micro…
Figure 5: Confirmation bias in Topaz particle picking. (a) …
Figure 6: Comparison of confirmation bias in particle picking versus 2D classification.
Figure 7: Template-matching-induced truncation of a Gaussian distribution. (a) …
Figure 8: Illustration of the relationship between the means for labeled particles …

Original abstract

The computational pipelines of single-particle cryo-electron microscopy (cryo-EM) and cryo-electron tomography (cryo-ET) include an early particle-picking stage, in which a micrograph or tomogram is scanned to extract candidate particles, typically via template matching or deep-learning-based techniques. The extracted particles are then passed to downstream tasks such as classification and 3D reconstruction. Although it is well understood empirically that particle picking can be sensitive to the choice of templates or learned priors, a quantitative theory of the bias introduced by this stage has been lacking. Here, we develop a mathematical framework for analyzing bias in template matching-based detection with concrete applications to cryo-EM and cryo-ET. We study this bias through two downstream tasks: (i) maximum-likelihood estimation of class means in a Gaussian mixture model (GMM) and (ii) 3D volume reconstruction from the extracted particle stack. We show that when template matching is applied to pure noise, then under broad noise models, the resulting maximum-likelihood estimates converge asymptotically to deterministic, noise-dependent transforms of the user-specified templates, yielding a structure from noise effect. We further characterize how the resulting bias depends on the noise statistics, sample size, dimension, and detection threshold. Finally, controlled experiments using standard cryo-EM software corroborate the theory, demonstrating reproducible structure from noise artifacts in low-SNR data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper develops a mathematical framework for bias in template-matching particle picking for cryo-EM and cryo-ET. It derives that, under pure-noise inputs and broad noise models, maximum-likelihood estimates of class means in a Gaussian mixture model converge asymptotically to deterministic, noise-dependent transforms of the supplied templates. The bias is further characterized with respect to noise statistics, sample size, dimension, and detection threshold. Controlled experiments with standard cryo-EM software are presented to corroborate the theory and demonstrate reproducible structure-from-noise artifacts in low-SNR data.

Significance. If the central asymptotic result holds, the work supplies a quantitative theory for the empirically observed template sensitivity of particle picking, which can propagate confirmation bias into downstream classification and reconstruction. The application of standard maximum-likelihood asymptotics to the GMM setting together with reproducible software experiments on low-SNR data constitute clear strengths.

major comments (1)
  1. [Theoretical derivation and experimental validation sections] The derivation and convergence claim are established under the assumption of pure-noise inputs (no particles present). The practical headline claim for cryo-EM/ET pipelines, however, requires that the deterministic mapping remains load-bearing when the input contains a mixture of noise and sparse real particles. The reported low-SNR experiments do not include an explicit ablation that isolates the pure-noise contribution versus the mixed-particle case; without this, it is unclear whether the selected particle stack still converges to the same noise-dependent transform of the templates.
minor comments (1)
  1. [Bias characterization] The precise definition of the detection threshold and its dependence on the noise covariance should be stated explicitly when the bias is characterized with respect to sample size and dimension.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and constructive comments on our manuscript. We address the major comment below and will revise the manuscript to incorporate the suggested clarification.

Point-by-point responses
  1. Referee: [Theoretical derivation and experimental validation sections] The derivation and convergence claim are established under the assumption of pure-noise inputs (no particles present). The practical headline claim for cryo-EM/ET pipelines, however, requires that the deterministic mapping remains load-bearing when the input contains a mixture of noise and sparse real particles. The reported low-SNR experiments do not include an explicit ablation that isolates the pure-noise contribution versus the mixed-particle case; without this, it is unclear whether the selected particle stack still converges to the same noise-dependent transform of the templates.

    Authors: We agree that the asymptotic convergence result is formally derived under pure-noise inputs, as this setting isolates the bias mechanism and permits a rigorous application of maximum-likelihood asymptotics for the GMM without confounding signal. In the low-SNR regime relevant to cryo-EM/ET, however, real particles are sparse and weak, so that template matching necessarily selects a substantial fraction of noise patches whose statistics are governed by the same deterministic mapping. Our controlled experiments already operate on low-SNR data that contain both noise and particles (as is unavoidable in real micrographs), and they reproduce the predicted artifacts, indicating that the bias remains operative. To make this explicit, we will add a dedicated subsection in the revised manuscript that (i) states the scope of the pure-noise theorem, (ii) provides a qualitative argument that the mapping continues to dominate when particle density is low and SNR precludes reliable detection independent of the template, and (iii) clarifies the headline claims accordingly. We view this as a useful strengthening rather than a fundamental limitation of the present analysis.

    Revision: yes

Circularity Check

0 steps flagged

Central derivation uses standard ML asymptotics on GMM under pure-noise assumption; no reduction to fitted parameters or self-referential inputs.

Full rationale

The paper derives its main result—that ML estimates converge to deterministic transforms of the templates when template matching is applied to pure noise—via asymptotic analysis of maximum-likelihood estimation in a Gaussian mixture model. This follows directly from classical statistical theory on consistency and bias of ML estimators under the stated noise models and does not involve any parameter fitted to the target data, self-citation load-bearing for uniqueness, or renaming of known results. The pure-noise setting is explicitly isolated as a controlled assumption to prove convergence, with experiments described only as corroboration rather than the source of the claim. No load-bearing step in the derivation chain reduces by construction to the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard statistical asymptotics for maximum-likelihood estimation under Gaussian mixture models and broad noise assumptions typical in cryo-EM literature; no new entities are postulated.

axioms (2)
  • standard math: Maximum-likelihood estimates converge asymptotically to deterministic transforms under the stated noise models
    Invoked to obtain the structure-from-noise convergence result for both class-mean estimation and volume reconstruction.
  • domain assumption: Input data consists of pure noise without real particles
    Used to isolate the confirmation bias effect in the theoretical analysis.

pith-pipeline@v0.9.0 · 5780 in / 1318 out tokens · 41815 ms · 2026-05-19T06:36:07.927884+00:00 · methodology

Appendix C excerpts

What is the relationship between the mean of each component $g_\ell$ in the true mixture model (B.7) and the corresponding template $x_\ell$ when the number of particles $M \to \infty$? (answered in Proposition C.1)

In the 2D classification process based on a GMM, how do the templates $\{x_\ell\}_{\ell=0}^{L-1}$ relate to the GMM maximum-likelihood estimators of the means $\{\hat{\mu}_\ell\}_{\ell=0}^{L-1}$, as specified in (B.5)? (answered in Theorem C.2)

Proposition C.1 examines the relationship between the mean of a single component $g_\ell$ in the mixture model (B.7) and the corresponding template $x_\ell$...

For $N \to \infty$,

$$ m_\ell^{(T)} = \frac{1}{|A_\ell^{(T)}|} \sum_{y_i \in A_\ell^{(T)}} y_i \xrightarrow{\text{a.s.}} \alpha \cdot x_\ell, \tag{C.4} $$

for a constant $\alpha \ge T$, independent of $\ell$.

Equation fragment (truncated in the source):

$$ \ldots \sum_{k=0}^{L-1} w_k f_k(z; \mu_k) \Big] = \arg\max_{\{\mu_k\}_{k=0}^{L-1}} \sum_{\ell=0}^{L-1} \pi_\ell \int dz \, g_\ell(z) \log \ldots $$

For $N \to \infty$ and $T \to \infty$,

$$ \lim_{T \to \infty} \lim_{N \to \infty} \frac{m_\ell^{(T)}}{T} = x_\ell, \tag{C.5} $$

where the convergence is almost sure and holds for every $\ell \in [L]$. This result proves that averaging the noise observations $\{y_i\}_{i=0}^{N-1}$ whose correlation with the template $x_\ell$ exceeds a specified threshold converges almost surely to the template $x_\ell$, scaled by a factor $\alpha$. The scaling factor $\alpha$ is ...
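
The behaviour of the scaling factor in (C.4) and (C.5) can be illustrated in a simple special case. The sketch below assumes white Gaussian noise with unit variance and a single unit-norm template, so that the correlation score is a standard normal variable z and the limiting scale is E[z | z > T], the inverse Mills ratio. This is a simplification chosen for illustration; the excerpt above is truncated before the paper gives its own expression for α, and that expression may be more general. The function name `truncated_normal_mean` is hypothetical.

```python
import math

def truncated_normal_mean(t):
    """E[z | z > t] for z ~ N(0, 1), i.e. the inverse Mills ratio phi(t) / (1 - Phi(t))."""
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)   # standard normal density phi(t)
    tail = 0.5 * math.erfc(t / math.sqrt(2.0))                 # upper tail probability 1 - Phi(t)
    return pdf / tail

thresholds = [0.5, 1.0, 2.0, 4.0, 8.0]
alphas = [truncated_normal_mean(t) for t in thresholds]
ratios = [a / t for a, t in zip(alphas, thresholds)]

for t, a, r in zip(thresholds, alphas, ratios):
    print(f"T = {t:4.1f}   alpha = {a:.4f}   alpha / T = {r:.4f}")
```

In this special case the scale never falls below the threshold, and the ratio of scale to threshold shrinks toward one as the threshold grows, which is the pattern that (C.4) and (C.5) describe.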