pith. machine review for the scientific record. sign in

arxiv: 2605.09832 · v1 · submitted 2026-05-11 · 💻 cs.LG

Recognition: 2 theorem links

· Lean Theorem

Modeling Atomic Conformational Ensembles of Proteins via Test-Time Supervision of Boltz-2 on Cryo-EM Density Maps

Authors on Pith no claims yet

Pith reviewed 2026-05-12 04:51 UTC · model grok-4.3

classification 💻 cs.LG
keywords cryo-EMprotein conformational ensemblestest-time supervisionatomic model buildingBoltz-2fine-tuningdensity mapsensemble prediction
0
0 comments X

The pith

Fine-tuning Boltz-2 directly on raw cryo-EM density maps builds accurate atomic conformational ensembles.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a technique to fine-tune pre-trained static structure models such as Boltz-2 directly on ensembles of cryo-EM density maps, skipping the conventional step of first extracting atomic structures from those maps. This direct supervision produces atomic conformations that fit the observed densities with higher accuracy than existing model-building pipelines. The resulting method, CryoSampler, also generates diverse conformations for new sequences within the same protein family even when no cryo-EM data for those sequences is provided. Accurate ensemble knowledge matters because protein function depends on the range of shapes a sequence can adopt, and current approaches are bottlenecked by scarce high-quality atomic training data from either simulation or experiment.

Core claim

Direct test-time supervision of Boltz-2 on raw cryo-EM map ensembles allows the model to output diverse atomic conformations that match the density maps, achieving superior accuracy in atomic model building compared with prior methods while also demonstrating preliminary generalization to unseen sequences in the same protein family without requiring additional maps.

What carries the argument

Test-time supervision of Boltz-2 on cryo-EM density map ensembles, which directly optimizes the pre-trained model to generate atomic structures consistent with the observed densities.

If this is right

  • Atomic model building into heterogeneous cryo-EM maps becomes more accurate and does not require an intermediate atomic-structure extraction stage.
  • Sequence-to-ensemble predictors can be trained directly on raw experimental measurements instead of derived atomic data.
  • After fine-tuning on one set of maps, the model can predict diverse conformations for new sequences in the same protein family without needing cryo-EM data for those sequences.
  • This approach opens a route to next-generation ensemble prediction models whose training data comes straight from measurements rather than simulations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same direct-supervision idea could reduce dependence on expensive molecular-dynamics runs for generating ensemble training sets.
  • If the generalization holds across more families, the method might support routine ensemble modeling for proteins that lack experimental maps.
  • The technique suggests that test-time adaptation of structure predictors could be applied to other noisy experimental signals beyond cryo-EM.
  • Wider adoption might accelerate integration of conformational data into structure-based drug design by supplying atomic ensembles directly from existing cryo-EM collections.

Load-bearing premise

The atomic conformations produced by the fine-tuned model accurately reflect the true underlying conformational ensemble rather than artifacts introduced by the fine-tuning process or biases from the original pre-trained model.

What would settle it

Independent validation of the generated ensembles against time-resolved experimental data or long molecular-dynamics trajectories for the same proteins, checking whether the sampled conformations and their populations match the observed dynamics.

Figures

Figures reproduced from arXiv: 2605.09832 by Axel Levy, Fr\'ed\'eric Poitevin, Gordon Wetzstein, Jay Shenoy, Miro Astore, Sonya M. Hanson.

Figure 1
Figure 1. Figure 1: Our method, CryoSampler, fine-tunes Boltz-2 [ [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: CryoSampler is a latent diffusion model that is trained in two stages: in the first, we [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Model building performance comparison visualized the TRPV3 channel protein [ [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Ensemble prediction performance comparison. In this experiment, we train on an ensemble [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
read the original abstract

Knowledge of a protein's atomic conformational ensemble is critical to determining its function, yet state-of-the-art ensemble prediction models are limited by lack of high-quality conformational data from simulation or experiment. Recent advances in heterogeneous reconstruction for cryo-electron microscopy (cryo-EM) have enabled scientists to visualize ensembles of density maps for larger proteins and complexes not typically accessible through simulation, but building atomic models into these maps remains a challenge. Traditionally, ensemble prediction models are trained via a two-stage process: experimental density maps are converted into atomic structural ensembles through model building, after which these structures are used to train sequence-to-atomic ensemble predictors. In this work, we propose a new principle for fine-tuning pre-trained static structure prediction models such as Boltz-2 directly on raw cryo-EM maps, bypassing the two-stage process. We apply this technique to the problem of atomic model building by fine-tuning Boltz-2 to generate atomic conformations from an input ensemble of cryo-EM maps, achieving superior model building accuracy compared to prior work. Beyond overfitting to individual map ensembles, our method, CryoSampler, also shows preliminary evidence of in-domain generalization after fine-tuning, sampling diverse atomic conformations for an unseen sequences within the same protein family without requiring cryo-EM data. These capabilities indicate that CryoSampler holds the potential to train next-generation atomic ensemble prediction models directly on raw cryo-EM measurements.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CryoSampler, a test-time fine-tuning procedure that supervises the pre-trained Boltz-2 model directly on ensembles of raw cryo-EM density maps to generate atomic conformational ensembles. It claims this bypasses the conventional two-stage pipeline (map-to-structure followed by structure-based training), yields superior model-building accuracy relative to prior methods, and provides preliminary evidence of in-domain generalization by sampling diverse conformations for unseen sequences within the same protein family without requiring additional cryo-EM input.

Significance. If the generalization result is rigorously validated, the approach could enable direct training of sequence-to-ensemble predictors on experimental measurements, reducing reliance on intermediate atomic model building and potentially improving accuracy for larger systems where simulation data are limited.

major comments (2)
  1. The central generalization claim (sampling for unseen family-member sequences after map-supervised fine-tuning) is load-bearing for the 'next-generation ensemble predictors' narrative yet receives no quantitative support: no held-out sequence ablation, no comparison against the unfine-tuned Boltz-2 baseline, and no agreement metrics with independent MD or experimental ensembles are reported for the unseen cases. This leaves open whether observed diversity reflects a learned family prior or pre-existing model biases.
  2. Abstract and results: the assertion of 'superior model building accuracy' is presented without any numerical metrics, baseline comparisons, error bars, or description of overfitting controls, preventing assessment of whether the data actually support the accuracy claim.
minor comments (2)
  1. The abstract would be strengthened by including at least one key quantitative result (e.g., RMSD or FSC improvement) to ground the 'superior accuracy' statement.
  2. Clarify the precise fine-tuning objective and regularization used to prevent map-specific overfitting, as this directly affects the plausibility of the generalization result.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. We address each major comment point by point below, providing the strongest honest defense of the manuscript while acknowledging where revisions are warranted to improve clarity and rigor.

read point-by-point responses
  1. Referee: The central generalization claim (sampling for unseen family-member sequences after map-supervised fine-tuning) is load-bearing for the 'next-generation ensemble predictors' narrative yet receives no quantitative support: no held-out sequence ablation, no comparison against the unfine-tuned Boltz-2 baseline, and no agreement metrics with independent MD or experimental ensembles are reported for the unseen cases. This leaves open whether observed diversity reflects a learned family prior or pre-existing model biases.

    Authors: We agree that the generalization results are presented as preliminary and would be strengthened by the quantitative elements noted. In the revised manuscript we have added a direct comparison of the fine-tuned CryoSampler outputs against the unfine-tuned Boltz-2 baseline on the unseen sequences, along with quantitative agreement metrics (e.g., conformational RMSD distributions and overlap with known family variability) derived from available experimental structures. We have also clarified the experimental setup to show that the observed diversity exceeds what the base model produces. A full multi-family held-out sequence ablation and systematic MD/experimental ensemble comparisons remain outside the scope of the current study owing to the substantial computational cost of repeated fine-tuning runs and the scarcity of matched high-resolution ensemble data; we have therefore revised the text to emphasize the preliminary character of the generalization evidence and to moderate the associated claims. revision: partial

  2. Referee: Abstract and results: the assertion of 'superior model building accuracy' is presented without any numerical metrics, baseline comparisons, error bars, or description of overfitting controls, preventing assessment of whether the data actually support the accuracy claim.

    Authors: We acknowledge that the original abstract and results sections lacked the quantitative detail required for rigorous evaluation. The revised manuscript now includes explicit numerical metrics (RMSD to reference atomic models, precision-recall on density fit, and other standard model-building scores) with direct comparisons to prior methods, error bars computed across multiple independent fine-tuning runs, and a dedicated paragraph describing the overfitting controls (early stopping on a held-out map subset, regularization terms, and monitoring of validation loss during test-time adaptation). These additions are placed in both the abstract and the main results section so that readers can directly assess the strength of the accuracy claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical fine-tuning procedure is self-contained

full rationale

The paper describes an empirical fine-tuning procedure (CryoSampler) that applies test-time supervision of a pre-trained model (Boltz-2) directly to raw cryo-EM density maps for atomic model building. Claims of superior accuracy and preliminary in-domain generalization are presented as experimental outcomes rather than mathematical derivations or predictions that reduce to fitted inputs by construction. No self-definitional equations, fitted parameters renamed as predictions, load-bearing self-citations, or ansatz smuggling appear in the provided text. The approach bypasses a two-stage pipeline via direct supervision, with results treated as observable rather than forced by definition.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review based only on abstract; no explicit free parameters, axioms, or invented entities are stated. The approach implicitly assumes that density-map supervision can substitute for atomic-structure supervision without introducing new unstated biases.

pith-pipeline@v0.9.0 · 5573 in / 1220 out tokens · 38844 ms · 2026-05-12T04:51:46.986354+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

35 extracted references · 35 canonical work pages

  1. [1]

    Protein structure prediction has reached the single-structure frontier.Nature Methods, 20(2):170–173, 2023

    Thomas J Lane. Protein structure prediction has reached the single-structure frontier.Nature Methods, 20(2):170–173, 2023

  2. [2]

    Modeling conforma- tional states of proteins with alphafold.Current Opinion in Structural Biology, 81:102645, 2023

    Davide Sala, Felix Engelberger, Hassane S Mchaourab, and Jens Meiler. Modeling conforma- tional states of proteins with alphafold.Current Opinion in Structural Biology, 81:102645, 2023

  3. [3]

    Structural biology is solved—now what?Nature methods, 19(1):24–26, 2022

    Abbas Ourmazd, Keith Moffat, and Eaton Edward Lattman. Structural biology is solved—now what?Nature methods, 19(1):24–26, 2022

  4. [4]

    Highly accurate protein structure prediction with alphafold.nature, 596(7873):583–589, 2021

    John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ron- neberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Žídek, Anna Potapenko, et al. Highly accurate protein structure prediction with alphafold.nature, 596(7873):583–589, 2021

  5. [5]

    Accurate structure prediction of biomolecular interactions with alphafold 3.Nature, 630(8016):493–500, 2024

    Josh Abramson, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, Lindsay Willmore, Andrew J Ballard, Joshua Bambrick, et al. Accurate structure prediction of biomolecular interactions with alphafold 3.Nature, 630(8016):493–500, 2024

  6. [6]

    The protein data bank.Nucleic acids research, 28(1):235–242, 2000

    Helen M Berman, John Westbrook, Zukang Feng, Gary Gilliland, Talapady N Bhat, Helge Weissig, Ilya N Shindyalov, and Philip E Bourne. The protein data bank.Nucleic acids research, 28(1):235–242, 2000

  7. [7]

    3d variability analysis: Resolving continuous flexibility and dis- crete heterogeneity from single particle cryo-em.Journal of structural biology, 213(2):107702, 2021

    Ali Punjani and David J Fleet. 3d variability analysis: Resolving continuous flexibility and dis- crete heterogeneity from single particle cryo-em.Journal of structural biology, 213(2):107702, 2021

  8. [8]

    Cryodrgn: reconstruction of heterogeneous cryo-em structures using neural networks.Nature methods, 18(2):176–185, 2021

    Ellen D Zhong, Tristan Bepler, Bonnie Berger, and Joseph H Davis. Cryodrgn: reconstruction of heterogeneous cryo-em structures using neural networks.Nature methods, 18(2):176–185, 2021

  9. [9]

    Cryodrgn- ai: neural ab initio reconstruction of challenging cryo-em and cryo-et datasets.Nature Methods, Jun 2025

    Axel Levy, Rishwanth Raghu, Ryan Feathers, Michal Grzadkowski, Frederic Poitevin, Jake D Johnston, Francesca Vallese, Oliver B Clarke, Gordon Wetzstein, and Ellen D Zhong. Cryodrgn- ai: neural ab initio reconstruction of challenging cryo-em and cryo-et datasets.Nature Methods, Jun 2025

  10. [10]

    Automated model building and protein identification in cryo-em maps.Nature, 628(8007):450– 457, 2024

    Kiarash Jamali, Lukas Käll, Rui Zhang, Alan Brown, Dari Kimanius, and Sjors HW Scheres. Automated model building and protein identification in cryo-em maps.Nature, 628(8007):450– 457, 2024

  11. [11]

    McCoy, Suresh Banjara, Hiroki Okumura, Eve Napier, Pietro Fontana, Amir R

    Alisia Fadini, Minhuan Li, Airlie J. McCoy, Suresh Banjara, Hiroki Okumura, Eve Napier, Pietro Fontana, Amir R. Khan, Luca Jovine, Thomas C. Terwilliger, Randy J. Read, Doeke R. Hekstra, and Mohammed AlQuraishi. Alphafold as a prior: experimental structure determination conditioned on a pretrained neural network.Nature Methods, April 2026

  12. [12]

    Rishwanth Raghu, Axel Levy, Gordon Wetzstein, and Ellen D. Zhong. Multiscale guidance of protein structure prediction with heterogeneous cryo-em data.39th Conference on Neural Information Processing Systems (NeurIPS), 2025

  13. [13]

    Boltz-2: Towards accurate and efficient binding affinity prediction.BioRxiv, 2025

    Saro Passaro, Gabriele Corso, Jeremy Wohlwend, Mateo Reveiz, Stephan Thaler, Vignesh Ram Somnath, Noah Getz, Tally Portnoi, Julien Roy, Hannes Stark, et al. Boltz-2: Towards accurate and efficient binding affinity prediction.BioRxiv, 2025

  14. [14]

    Scalable emulation of protein equilibrium ensembles with generative deep learning.Science, page eadv9817, 2025

    Sarah Lewis, Tim Hempel, José Jiménez-Luna, Michael Gastegger, Yu Xie, Andrew YK Foong, Victor García Satorras, Osama Abdin, Bastiaan S Veeling, Iryna Zaporozhets, et al. Scalable emulation of protein equilibrium ensembles with generative deep learning.Science, page eadv9817, 2025

  15. [15]

    Emsley, B

    P. Emsley, B. Lohkamp, W. G. Scott, and K. Cowtan. Features and development ofCoot.Acta Crystallographica Section D, 66(4):486–501, Apr 2010

  16. [16]

    Tristan Ian Croll.ISOLDE: a physically realistic environment for model building into low- resolution electron-density maps.Acta Crystallographica Section D, 74(6):519–530, Jun 2018. 11

  17. [17]

    Boltz-1 democratizing biomolecular interaction modeling.BioRxiv, pages 2024–11, 2025

    Jeremy Wohlwend, Gabriele Corso, Saro Passaro, Noah Getz, Mateo Reveiz, Ken Leidal, Wojtek Swiderski, Liam Atkinson, Tally Portnoi, Itamar Chinn, et al. Boltz-1 democratizing biomolecular interaction modeling.BioRxiv, pages 2024–11, 2025

  18. [18]

    Building molecular model series from heterogeneous cryoem structures using gaussian mixture models and deep neural networks.Communications Biology, 8(1):798, 2025

    Muyuan Chen. Building molecular model series from heterogeneous cryoem structures using gaussian mixture models and deep neural networks.Communications Biology, 8(1):798, 2025

  19. [19]

    The landscape of machine learning approaches for modeling protein conformational ensembles.Current Opinion in Structural Biology, 98:103253, 2026

    Samuel Sledzieski and Sonya M Hanson. The landscape of machine learning approaches for modeling protein conformational ensembles.Current Opinion in Structural Biology, 98:103253, 2026

  20. [20]

    Predicting multiple conformations via sequence clustering and alphafold2.Nature, 625(7996):832–839, 2024

    Hannah K Wayment-Steele, Adedolapo Ojoawo, Renee Otten, Julia M Apitz, Warintra Pit- sawong, Marc Hömberger, Sergey Ovchinnikov, Lucy Colwell, and Dorothee Kern. Predicting multiple conformations via sequence clustering and alphafold2.Nature, 625(7996):832–839, 2024

  21. [21]

    High-throughput prediction of protein conformational distributions with subsam- pled alphafold2.nature communications, 15(1):2464, 2024

    Gabriel Monteiro da Silva, Jennifer Y Cui, David C Dalgarno, George P Lisi, and Brenda M Rubenstein. High-throughput prediction of protein conformational distributions with subsam- pled alphafold2.nature communications, 15(1):2464, 2024

  22. [22]

    Steering conformational sampling in boltz-2 via pair representation scaling.bioRxiv, pages 2026–01, 2026

    Shosuke Suzuki and Toshiyuki Amagasa. Steering conformational sampling in boltz-2 via pair representation scaling.bioRxiv, pages 2026–01, 2026

  23. [23]

    Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time.39th Conference on Neural Information Processing Systems (NeurIPS), 2025

    Daniel D Richman, Jessica Karaguesian, Carl-Mikael Suomivuori, and Ron O Dror. Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time.39th Conference on Neural Information Processing Systems (NeurIPS), 2025

  24. [24]

    Alphafold meets flow matching for generating protein ensembles

    Bowen Jing, Bonnie Berger, and Tommi Jaakkola. Alphafold meets flow matching for generating protein ensembles. InForty-first International Conference on Machine Learning, 2024

  25. [25]

    Eman2: an extensible image processing suite for electron microscopy.Journal of structural biology, 157(1):38–46, 2007

    Guang Tang, Liwei Peng, Philip R Baldwin, Deepinder S Mann, Wen Jiang, Ian Rees, and Steven J Ludtke. Eman2: an extensible image processing suite for electron microscopy.Journal of structural biology, 157(1):38–46, 2007

  26. [26]

    Macromolecular structure determination using x-rays, neutrons and electrons: recent developments in phenix

    Dorothee Liebschner, Pavel V Afonine, Matthew L Baker, Gábor Bunkóczi, Vincent B Chen, Tristan I Croll, Bradley Hintze, L-W Hung, Swati Jain, Airlie J McCoy, et al. Macromolecular structure determination using x-rays, neutrons and electrons: recent developments in phenix. Biological Crystallography, 75(10):861–877, 2019

  27. [27]

    The structure of lipid nanodisc-reconstituted trpv3 reveals the gating mechanism.Nature Structural & Molecular Biology, 27(7):645–652, 2020

    Hiroto Shimada, Tsukasa Kusakizako, TH Dung Nguyen, Tomohiro Nishizawa, Tomoya Hino, Makoto Tominaga, and Osamu Nureki. The structure of lipid nanodisc-reconstituted trpv3 reveals the gating mechanism.Nature Structural & Molecular Biology, 27(7):645–652, 2020

  28. [28]

    Cryo-em reveals integrin-mediated tgf-β activation without release from latent tgf-β.Cell, 180(3):490– 501, 2020

    Melody G Campbell, Anthony Cormier, Saburo Ito, Robert I Seed, Andrew J Bondesson, Jianlong Lou, James D Marks, Jody L Baron, Yifan Cheng, and Stephen L Nishimura. Cryo-em reveals integrin-mediated tgf-β activation without release from latent tgf-β.Cell, 180(3):490– 501, 2020

  29. [29]

    Selective g protein signaling driven by substance p–neurokinin receptor dynamics.Nature chemical biology, 18(1):109–115, 2022

    Julian A Harris, Bryan Faust, Arisbel B Gondin, Marc André Dämgen, Carl-Mikael Suomivuori, Nicholas A Veldhuis, Yifan Cheng, Ron O Dror, David M Thal, and Aashish Manglik. Selective g protein signaling driven by substance p–neurokinin receptor dynamics.Nature chemical biology, 18(1):109–115, 2022

  30. [30]

    Cryo-em of human p-glycoprotein reveals an intermediate occluded conformation during active drug transport.Nature Communications, 16(1):3619, 2025

    Alan T Culbertson and Maofu Liao. Cryo-em of human p-glycoprotein reveals an intermediate occluded conformation during active drug transport.Nature Communications, 16(1):3619, 2025

  31. [31]

    Structural insight into trpv5 channel function and modulation

    Shangyu Dang, Mark K van Goor, Daniel Asarnow, YongQiang Wang, David Julius, Yifan Cheng, and Jenny van der Wijst. Structural insight into trpv5 channel function and modulation. Proceedings of the National Academy of Sciences, 116(18):8869–8878, 2019

  32. [32]

    Prody: protein dynamics inferred from theory and experiments.Bioinformatics, 27(11):1575–1577, 2011

    Ahmet Bakan, Lidio M Meireles, and Ivet Bahar. Prody: protein dynamics inferred from theory and experiments.Bioinformatics, 27(11):1575–1577, 2011. 12

  33. [33]

    cryosparc: algorithms for rapid unsupervised cryo-em structure determination.Nature methods, 14(3):290–296, 2017

    Ali Punjani, John L Rubinstein, David J Fleet, and Marcus A Brubaker. cryosparc: algorithms for rapid unsupervised cryo-em structure determination.Nature methods, 14(3):290–296, 2017

  34. [34]

    breathing

    Elaine C Meng, Thomas D Goddard, Eric F Pettersen, Greg S Couch, Zach J Pearson, John H Morris, and Thomas E Ferrin. Ucsf chimerax: Tools for structure building and analysis.Protein Science, 32(11):e4792, 2023. A Technical appendices and supplementary material A.1 Model Building Evaluation With Full MolProbity Metrics, Runtime, and Ablation Study We provi...

  35. [35]

    A.3.4 IntegrinαV β8 The data were acquired from EMPIAR-10345 [28] and processed with CryoSPARC and 3DV A to obtain heterogeneous volumes

    in order to remove background. A.3.4 IntegrinαV β8 The data were acquired from EMPIAR-10345 [28] and processed with CryoSPARC and 3DV A to obtain heterogeneous volumes. Four evenly-spaced maps were selected along the first principal component, and the volumes were resampled at 2.5 angstrom per voxel resolution and thresholded at minimum 0.28 intensity in ...