arxiv: 2605.15564 · v1 · pith:RIZCPFH5new · submitted 2026-05-15 · 💻 cs.LG · cs.CE · eess.IV

CrystalBoltz: End-to-End Protein Structure Determination via Experiment-Guided Diffusion for X-Ray Crystallography

Pith reviewed 2026-05-20 20:11 UTC · model grok-4.3

classification 💻 cs.LG · cs.CE · eess.IV
keywords protein structure determination · X-ray crystallography · diffusion models · generative models · Bayesian inference · structure refinement · structure-factor amplitudes

The pith

CrystalBoltz conditions a pre-trained diffusion model on X-ray structure-factor amplitudes to sample and refine protein structures directly from diffraction data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

X-ray crystallography leaves the phases of diffracted beams unmeasured, turning structure determination into an inverse problem: candidate models must be both physically plausible and consistent with the observed amplitudes. CrystalBoltz starts with a generative model pre-trained on many known protein structures and adapts it to draw samples from the posterior distribution given new experimental measurements. These samples then undergo atomic coordinate and B-factor refinement. Across multiple test datasets the resulting models show lower coordinate RMSD and lower R-factors than the strongest baselines considered, while cutting runtime by a factor of 33 relative to existing experimentally guided refinement. A reader would care because the method reduces reliance on slow, expert-driven manual refinement and shortens the path from raw diffraction data to usable atomic models.
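The phase problem mentioned above can be illustrated with a toy one-dimensional "density". The sketch below is not from the paper: it uses NumPy's FFT as a stand-in for diffraction, and the function names and the example density are hypothetical. It builds a second density that has exactly the same Fourier amplitudes as the first but different phases, which shows why amplitudes alone do not fix the structure.

```python
import numpy as np

def amplitudes(density):
    """Fourier amplitudes of a 1-D density (toy analogue of |F|)."""
    return np.abs(np.fft.fft(density))

def rephase(density, seed=0):
    """Return a different real density with the same Fourier amplitudes.

    Phases are borrowed from the FFT of a random real signal, which keeps
    the Hermitian symmetry needed for a real-valued result.
    """
    rng = np.random.default_rng(seed)
    donor = np.fft.fft(rng.standard_normal(len(density)))
    new_f = amplitudes(density) * np.exp(1j * np.angle(donor))
    return np.fft.ifft(new_f).real

rho = np.array([0.0, 1.0, 3.0, 0.0, 0.0, 2.0, 0.0, 0.0])  # hypothetical density
rho_alt = rephase(rho)

print("same amplitudes:", np.allclose(amplitudes(rho), amplitudes(rho_alt)))
print("same density:   ", np.allclose(rho, rho_alt))
```

The rephased density is generally not non-negative, which hints at why a prior over plausible structures helps narrow down the candidates.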

Core claim

CrystalBoltz casts crystallographic refinement as Bayesian inference over atomic structures and operates directly on structure-factor amplitudes. It moves from unguided generation with a pre-trained prior over protein structures to experiment-guided posterior sampling, followed by atomic coordinate and B-factor refinement.
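As a reading aid, the Bayesian framing can be written in standard crystallographic notation. This is a sketch under stated assumptions, not the paper's own formulation: x denotes the atomic coordinates (x_j in fractional units), B_j an isotropic B-factor, f_j an atomic scattering factor, h a Miller index, and s_h the scattering vector with |s_h| = 1/d_h. All of these symbols are introduced here for illustration.

```latex
p\bigl(x \mid |F_o|\bigr) \;\propto\; p\bigl(|F_o| \mid x\bigr)\, p(x),
\qquad
F_c(\mathbf{h}; x, B) \;=\; \sum_{j} f_j(\mathbf{h})\,
  \exp\!\Bigl(-\tfrac{1}{4}\, B_j\, |\mathbf{s}_{\mathbf{h}}|^{2}\Bigr)\,
  \exp\!\bigl(2\pi i\, \mathbf{h}\cdot\mathbf{x}_j\bigr)
```

Here p(x) plays the role of the pre-trained prior, and the likelihood can only compare |F_c| with |F_o| because the phase of F_o is not measured.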

What carries the argument

Experiment-guided posterior sampling that conditions a pre-trained diffusion prior over protein structures on measured structure-factor amplitudes.
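A toy version of this idea, guidance that is switched on partway through reverse diffusion, might look like the sketch below. It is a generic guidance scheme on a 2-D Gaussian prior, not the paper's algorithm: the noise schedule, the measurement model y = ||x||, the noise level sigma_y, and every name in the code are assumptions made for illustration. Only T = 200 and a guidance start step of 50 echo the values quoted from the paper's appendix further down this page.

```python
import numpy as np

def sample(y, z, guided, T=200, t_g=50, sigma_y=0.05):
    """Toy 2-D reverse diffusion with optional measurement guidance.

    Prior: x ~ N(mu, s2 * I). Measurement: y = ||x||, so the direction of x
    is lost (loosely analogous to losing phases). Illustrative only.
    """
    mu, s2 = np.array([1.0, 0.0]), 1.0
    sig = np.geomspace(0.01, 10.0, T + 1)       # assumed noise schedule sig[t]
    x = mu + np.sqrt(s2 + sig[T] ** 2) * z      # start from the noisy prior at t = T
    for t in range(T, 0, -1):
        var = s2 + sig[t] ** 2
        x0_hat = (s2 * x + sig[t] ** 2 * mu) / var   # exact denoiser for a Gaussian prior
        score = (x0_hat - x) / sig[t] ** 2           # prior score at noise level sig[t]
        if guided and t <= t_g:
            r = np.linalg.norm(x0_hat)
            # Gradient of the Gaussian log-likelihood of y, evaluated at the
            # denoised estimate and chained back to x (factor s2 / var).
            score = score - (r - y) / sigma_y ** 2 * (x0_hat / r) * (s2 / var)
        # Deterministic Euler step from noise level sig[t] to sig[t - 1].
        x = x + 0.5 * (sig[t] ** 2 - sig[t - 1] ** 2) * score
    return x

z = np.array([0.6, 0.8])            # fixed starting noise so both runs are comparable
x_prior = sample(3.0, z, guided=False)
x_post = sample(3.0, z, guided=True)
print("unguided |x|:", np.linalg.norm(x_prior))
print("guided   |x|:", np.linalg.norm(x_post))
```

Both runs follow the same trajectory until step t_g, after which the guided run is pulled toward samples whose norm matches the measurement. Starting guidance late keeps the early steps governed by the prior alone.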

If this is right

  • Atomic models can be obtained with lower coordinate errors than the strongest baselines considered.
  • R-factors improve relative to the same baselines.
  • Runtime drops by a factor of 33 compared with existing experimentally guided refinement.
  • The workflow applies across multiple protein crystallography datasets without per-target retraining.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar conditioning of generative priors could be tested on other experimental modalities that also produce incomplete data, such as electron microscopy.
  • If the prior continues to generalize, the method may reduce the amount of manual intervention needed for novel or low-resolution targets.
  • Faster end-to-end pipelines could increase throughput in structural biology projects that rely on repeated structure determination.

Load-bearing premise

A generative model pre-trained on existing protein structures can be conditioned on fresh experimental diffraction data without major loss of physical consistency.

What would settle it

New crystallographic datasets in which CrystalBoltz produces higher coordinate RMSD or higher R-factors than conventional experimentally guided refinement would falsify the reported performance gains.
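The two quantities such a test would compare can be computed as in the sketch below. It follows the description quoted in the Figure 2 text on this page (R-factors as normalized L1 agreement between calculated and experimental amplitudes on working and held-out sets); the single linear scale factor k fitted on the working set, the absence of superposition in the RMSD, and all function names are assumptions made here, not details taken from the paper.

```python
import numpy as np

def coordinate_rmsd(xyz_a, xyz_b):
    """RMSD between two (n_atoms, 3) arrays already in the same frame."""
    diff = np.asarray(xyz_a, float) - np.asarray(xyz_b, float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def r_factor(f_obs, f_calc, k=None):
    """sum| |Fo| - k|Fc| | / sum|Fo|, with k fitted linearly if not given."""
    f_obs = np.asarray(f_obs, float)
    f_calc = np.asarray(f_calc, float)
    if k is None:
        k = f_obs.sum() / f_calc.sum()
    return float(np.abs(f_obs - k * f_calc).sum() / f_obs.sum())

def r_work_free(f_obs, f_calc, free_mask):
    """R_work and R_free, using a scale factor fitted on the working set only."""
    f_obs = np.asarray(f_obs, float)
    f_calc = np.asarray(f_calc, float)
    free = np.asarray(free_mask, bool)
    k = f_obs[~free].sum() / f_calc[~free].sum()
    return (r_factor(f_obs[~free], f_calc[~free], k),
            r_factor(f_obs[free], f_calc[free], k))

# Hypothetical numbers, for illustration only.
print(coordinate_rmsd([[0, 0, 0], [1, 0, 0]], [[0, 0, 1], [1, 0, 1]]))
print(r_work_free([10, 20, 30, 40], [12, 18, 30, 44], [False, False, False, True]))
```

Production refinement packages use more elaborate scaling (for example bulk-solvent and anisotropic terms), so values from this sketch would not match theirs exactly.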

Figures

Figures reproduced from arXiv: 2605.15564 by Alec Follmer, Frederic Poitevin, Gordon Wetzstein, Huanghao Mai, Jay Shenoy, Minseo Kim.

Figure 1: Algorithm Overview. CrystalBoltz has two phases. Phase 1 runs Boltz diffusion conditioned on the protein sequence and known crystallographic parameters (unit cell and space group). Sampling begins unguided; once a coarse structure has formed, experimental guidance is switched on at step tg and continues to t = 0. Phase 2 takes the resulting structure and refines atomic coordinates and B-factors against th…
Figure 2: Qualitative results on PDB 4NTZ. CrystalBoltz can correct large conformation change while other baselines cannot. Not only is there a significant improvement on the global RMSD, the R-factor is also reduced.
Adjacent body text captured with the caption: … the deposited PDB coordinates. The crystallographic R-factors Rwork and Rfree (lower is better) [7] measure normalized L1 agreement between calculated and experimental amplitudes on the working and hel…
Figure 3: Solvent ablation on PDB 8DWN. Bulk-solvent term consistently improves CC.
Panel labels captured with the caption: w/ B-factor, w/o B-factor, PDB.
Figure 5: Qualitative results on PDB 8DWN. Another example showing that CrystalBoltz can correct large conformation change.
Adjacent body text captured with the caption: A.4 Hyperparameter choices. With the total number of diffusion sampling steps being T = 200 for phase 1 of CrystalBoltz, we chose the guidance start time as tg = 50 which is when the backbone structure is roughly recovered by the Boltz-2 prior. We recommend visual assessment of the structure at …
Original abstract

Generative models trained on public databases of protein structures, most of which have been determined by X-ray crystallography, now provide powerful priors for structure prediction. However, they are not readily conditioned on the measurements from a new crystallographic experiment, limiting their use for X-ray structure determination. In crystallography, the measured structure-factor amplitudes do not by themselves determine an electron density map or atomic structure because the associated phases are unobserved and must be inferred. Structure determination therefore remains an inverse problem in which candidate models must be both structurally plausible and consistent with measured diffraction data, often requiring substantial manual refinement by human experts. Emerging methods aim to incorporate experimental information more directly into predictive and refinement workflows. We present CrystalBoltz, a generative framework that casts crystallographic refinement as Bayesian inference over atomic structures and operates directly on structure-factor amplitudes. CrystalBoltz moves from unguided generation with a pre-trained prior over protein structures to experiment-guided posterior sampling, followed by atomic coordinate and B-factor refinement. Across multiple protein crystallography datasets, CrystalBoltz attains lower coordinate RMSD and lower R-factors than the strongest baselines considered, while reducing runtime by a factor of 33 relative to existing experimentally guided refinement.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces CrystalBoltz, a generative framework that treats X-ray crystallographic structure determination as Bayesian inference. It starts from a pre-trained diffusion prior over protein structures (trained on PDB entries) and conditions this prior on measured structure-factor amplitudes to produce posterior samples of atomic coordinates and B-factors; these samples are then subjected to conventional atomic refinement. The central empirical claim is that the resulting models achieve lower coordinate RMSD and lower R-factors than the strongest baselines while delivering a 33-fold runtime reduction relative to existing experimentally guided refinement pipelines.

Significance. If the conditioning step can be shown to produce physically consistent structures without substantial domain shift from the PDB training distribution, the method would constitute a meaningful advance in automated structure solution. It would demonstrate that diffusion-based generative priors can be effectively fused with experimental likelihoods in a manner that both accelerates refinement and improves final model quality, a result with direct implications for high-throughput crystallography.

major comments (3)
  1. [§3.2] §3.2 (Posterior sampling): the manuscript does not specify whether structure-factor amplitudes enter the reverse SDE through a differentiable forward model, classifier-free guidance, or an auxiliary likelihood term. Without this detail it is impossible to assess whether the sampled structures remain physically consistent or whether the final refinement step is still required to achieve the reported metrics.
  2. [Table 2, §4.3] Table 2 and §4.3: the reported 33× runtime reduction and RMSD/R-factor gains are presented without error bars, without the number of independent runs, and without explicit listing of the baseline methods’ hyper-parameters or convergence criteria. These omissions prevent evaluation of whether the improvements are statistically robust or sensitive to implementation details.
  3. [§4.1] §4.1 (Datasets): the claim that the pre-trained prior generalizes to new experimental amplitudes rests on an unverified assumption of limited domain shift. No ablation is shown that isolates the effect of the conditioning mechanism from the subsequent refinement stage.
minor comments (2)
  1. [Figure 3] Figure 3 caption: the electron-density isosurface threshold is not stated, making visual comparison of the maps difficult to reproduce.
  2. [Eq. (7)] Notation: the symbol for the structure-factor amplitude likelihood is introduced without an explicit definition linking it to the standard crystallographic |F| term.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript describing CrystalBoltz. We have carefully considered each major comment and provide point-by-point responses below, indicating where revisions will be made to address the concerns.

Point-by-point responses
  1. Referee: [§3.2] §3.2 (Posterior sampling): the manuscript does not specify whether structure-factor amplitudes enter the reverse SDE through a differentiable forward model, classifier-free guidance, or an auxiliary likelihood term. Without this detail it is impossible to assess whether the sampled structures remain physically consistent or whether the final refinement step is still required to achieve the reported metrics.

    Authors: We agree that the description of the posterior sampling procedure in §3.2 lacks sufficient technical detail. The structure-factor amplitudes are incorporated using an auxiliary likelihood term that is added to the reverse SDE, computed via a differentiable forward model that simulates the diffraction process from atomic coordinates. This approach ensures that the generated samples are guided towards physical consistency with the experimental data. The subsequent atomic refinement step is retained to fine-tune B-factors and resolve any minor inconsistencies, as is standard in crystallographic workflows. In the revised manuscript, we will expand §3.2 with a detailed description of this mechanism, including the mathematical formulation of the likelihood term and a note on the role of refinement. revision: yes

  2. Referee: [Table 2, §4.3] Table 2 and §4.3: the reported 33× runtime reduction and RMSD/R-factor gains are presented without error bars, without the number of independent runs, and without explicit listing of the baseline methods’ hyper-parameters or convergence criteria. These omissions prevent evaluation of whether the improvements are statistically robust or sensitive to implementation details.

    Authors: The referee correctly identifies that additional statistical details would strengthen the empirical claims. We will revise Table 2 to include error bars representing standard deviations over 5 independent runs for each metric. We will also add a new subsection or appendix that lists the hyper-parameters used for CrystalBoltz and all baseline methods, along with their convergence criteria. This will enable readers to better evaluate the robustness of the reported 33× speedup and quality improvements. revision: yes

  3. Referee: [§4.1] §4.1 (Datasets): the claim that the pre-trained prior generalizes to new experimental amplitudes rests on an unverified assumption of limited domain shift. No ablation is shown that isolates the effect of the conditioning mechanism from the subsequent refinement stage.

    Authors: We appreciate this point regarding the need for more rigorous validation of generalization. While the test proteins were chosen to be distinct from the PDB training set, we acknowledge that an explicit analysis of domain shift and an ablation study would be beneficial. In the revised manuscript, we will add an ablation experiment that compares the performance of the full CrystalBoltz pipeline against a version without the experiment-guided conditioning (i.e., using only the prior followed by refinement). Additionally, we will include a quantitative assessment of structural similarity between the training distribution and the test cases to address the domain shift concern. revision: yes
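The mechanism described in the first response, a likelihood term evaluated through a differentiable forward model from atomic coordinates to amplitudes, can be sketched in a stripped-down form. The code below is an assumption-laden illustration, not the authors' implementation: it uses unit point scatterers in space group P1, no B-factors, no bulk solvent, a plain least-squares amplitude misfit in place of the paper's likelihood, and a hand-derived gradient. All names are hypothetical.

```python
import numpy as np

def calc_amplitudes(frac_xyz, hkl):
    """|F_c(h)| for unit point atoms at fractional coordinates (P1, no B-factors)."""
    phase = 2.0 * np.pi * np.asarray(hkl, float) @ np.asarray(frac_xyz, float).T
    return np.abs(np.exp(1j * phase).sum(axis=1))

def amplitude_loss_and_grad(frac_xyz, hkl, f_obs):
    """0.5 * sum (|F_c| - |F_o|)^2 and its gradient w.r.t. fractional coordinates."""
    frac_xyz = np.asarray(frac_xyz, float)          # (n_atoms, 3)
    hkl = np.asarray(hkl, float)                    # (n_refl, 3)
    e = np.exp(2j * np.pi * hkl @ frac_xyz.T)       # (n_refl, n_atoms)
    f_calc = e.sum(axis=1)
    amp = np.abs(f_calc)
    resid = amp - f_obs
    loss = 0.5 * np.sum(resid ** 2)
    # d|F|/dx_j = Re(conj(F)/|F| * 2*pi*i*h * e_hj) = -2*pi*h * Im(conj(F)/|F| * e_hj)
    unit = np.conj(f_calc) / np.maximum(amp, 1e-12)
    weights = -2.0 * np.pi * np.imag(unit[:, None] * e) * resid[:, None]
    grad = weights.T @ hkl                          # (n_atoms, 3)
    return loss, grad

rng = np.random.default_rng(0)
hkl = np.array([(h, k, l) for h in range(-2, 3)
                for k in range(-2, 3) for l in range(-2, 3)])
true_xyz = rng.random((5, 3))                       # hypothetical 5-atom structure
f_obs = calc_amplitudes(true_xyz, hkl)              # noise-free synthetic data
xyz = true_xyz + 0.02 * rng.standard_normal((5, 3))

loss0, grad = amplitude_loss_and_grad(xyz, hkl, f_obs)
loss1, _ = amplitude_loss_and_grad(xyz - 1e-6 * grad, hkl, f_obs)
print("loss before / after one small gradient step:", loss0, loss1)
```

In a guided sampler, a gradient of this kind would be added to the prior score at each guided step. A real forward model would also include scattering factors, symmetry, B-factors and solvent terms.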

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

Full rationale

The paper presents CrystalBoltz as a new generative framework that shifts from a pre-trained diffusion prior over protein structures to experiment-guided posterior sampling conditioned on structure-factor amplitudes, followed by refinement. No equations or steps in the provided abstract or description reduce claimed outputs (lower RMSD, R-factors, 33x speedup) to inputs by construction, nor do they rely on self-citations for uniqueness theorems or ansatzes that would make the central Bayesian inference claim tautological. The method introduces novel conditioning and sampling procedures that are evaluated empirically on external datasets, rendering the derivation self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the central claim rests on the assumption that pre-trained diffusion priors transfer effectively to new experimental conditioning.

pith-pipeline@v0.9.0 · 5767 in / 944 out tokens · 47238 ms · 2026-05-20T20:11:33.156037+00:00 · methodology



Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · 2 internal anchors

  1. [1]

    Josh Abramson, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, Lindsay Willmore, Andrew J. Ballard, Joshua Bambrick, Sebastian W. Bodenstein, et al. Accurate structure prediction of biomolecular interactions with alphafold 3. Nature, 2024. doi: 10.1038/s41586-024-07487-w

  2. [2]

    Pavel V Afonine, Ralf W Grosse-Kunstleve, Nathaniel Echols, Jeffrey J Headd, Nigel W Moriarty, Marat Mustyakimov, Thomas C Terwilliger, Alexandre Urzhumtsev, Peter H Zwart, and Paul D Adams. Towards automated crystallographic structure refinement with phenix.refine. Biological Crystallography, 68(4):352–367, 2012

  3. [3]

    PV Afonine, RW Grosse-Kunstleve, PD Adams, and A Urzhumtsev. Bulk-solvent and overall scaling revisited: faster calculations, improved results. Biological Crystallography, 69(4):625–634, 2013

  4. [4]

    Gustaf Ahdritz, Nazim Bouatta, Christina Floristean, Sachin Kadyan, Qinghui Xia, William Gerecke, Timothy J O’Donnell, Daniel Berenberg, Ian Fisk, Niccolò Zanichelli, et al. OpenFold: retraining alphafold2 yields new insights into its learning mechanisms and capacity for generalization. Nature Methods, 21(8):1514–1524, 2024

  5. [5]

    Minkyung Baek, Frank DiMaio, Ivan Anishchenko, Justas Dauparas, Sergey Ovchinnikov, Gyu Rie Lee, Jue Wang, Qian Cong, Lisa N. Kinch, R. Dustin Schaeffer, Claudia Millán, Hahnbeom Park, Carson Adams, Caleb R. Glassman, Andy DeGiovanni, Jose H. Pereira, Andria V. Rodrigues, Alberdina A. van Dijk, Ana C. Ebrecht, Diederik J. Opperman, Theo Sagmeister, Chris...

  6. [6]

    Helen M. Berman, John Westbrook, Zukang Feng, Gary Gilliland, T. N. Bhat, Helge Weissig, Ilya N. Shindyalov, and Philip E. Bourne. The Protein Data Bank. Nucleic Acids Research, 28(1):235–242, 2000. doi: 10.1093/nar/28.1.235

  7. [7]

    Axel T Brünger. Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature, 355(6359):472–475, 1992

  8. [8]

    Jasper Butcher, Rohith Krishna, Raktim Mitra, Rafael I Brent, Yanjing Li, Nathaniel Corley, Paul T Kim, Jonathan Funk, Simon Mathis, Saman Salike, et al. De novo design of all-atom biomolecular interactions with rfdiffusion3. bioRxiv, 2025

  9. [9]

    Hyungjin Chung, Jeongsol Kim, Michael Thompson Mccann, Marc Louis Klasky, and Jong Chul Ye. Diffusion posterior sampling for general noisy inverse problems. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=OnD9zGAGT0k

  10. [10]

    Giannis Daras, Hyungjin Chung, Chieh-Hsin Lai, Yuki Mitsufuji, Jong Chul Ye, Peyman Milanfar, Alexandros G. Dimakis, and Mauricio Delbracio. A survey on diffusion models for inverse problems, 2024. URL https://arxiv.org/abs/2410.00083

  11. [11]

    Diego Del Alamo, Davide Sala, Hassane S Mchaourab, and Jens Meiler. Sampling alternative conformational states of transporters and receptors with alphafold2. eLife, 11:e75751, 2022

  12. [12]

    Alisia Fadini, Minhuan Li, Airlie J. McCoy, Thomas C. Terwilliger, Randy J. Read, Doeke R. Hekstra, and Mohammed AlQuraishi. Alphafold as a prior: experimental structure determination conditioned on a pretrained neural network. Nature Methods, 23(7):785–795, 2026. doi: 10.1038/s41592-026-03047-4

  13. [13]

    S. French and K. Wilson. On the treatment of negative intensity observations. Acta Crystallographica Section A, 34(4):517–525, 1978. doi: 10.1107/S0567739478001114

  14. [14]

    Wayne A Hendrickson. Facing the phase problem. IUCrJ, 10(5):521–543, 2023. doi: 10.1107/S2052252523006449

  15. [15]

    John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Žídek, Anna Potapenko, Alex Bridgland, Clemens Meyer, Simon A. A. Kohl, Andrew J. Ballard, Andrew Cowie, Bernardino Romera-Paredes, Stanislav Nikolov, Rishub Jain, Jonas Adler, Trevor Back, Stig Petersen, David Reim...

  16. [16]

    Wolfgang Kabsch. XDS. Biological Crystallography, 66(2):125–132, 2010

  17. [17]

    Yogesh Kalakoti and Björn Wallner. Afsample2 predicts multiple conformations and ensembles with alphafold2. Communications Biology, 8(1):373, 2025

  18. [18]

    Ronan M Keegan, Adam J Simpkin, and Daniel J Rigden. The success rate of processed predicted models in molecular replacement: implications for experimental phasing in the alphafold era. Biological Crystallography, 80(11), 2024

  19. [19]

    Minseo Kim, Axel Levy, and Gordon Wetzstein. Dual ascent diffusion for inverse problems. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026

  20. [20]

    Axel Levy, Eric R. Chan, Sara Fridovich-Keil, Frederic Poitevin, Ellen D. Zhong, and Gordon Wetzstein. Solving inverse problems in protein space using diffusion-based priors, 2024. URL https://arxiv.org/abs/2406.04239

  21. [21]

    Minhuan Li, Kevin M Dalton, and Doeke Romke Hekstra. Sfcalculator: connecting deep generative models and crystallography. bioRxiv, pages 2025–01, 2025

  22. [22]

    Minhuan Li, Jiequn Han, Pilar Cossio, and Luhuan Wu. Robust inference-time steering of protein diffusion models via embedding optimization. arXiv preprint arXiv:2602.05285, 2026

  23. [23]

    Dorothee Liebschner, Pavel V. Afonine, Matthew L. Baker, Gábor Bunkóczi, Vincent B. Chen, Tristan I. Croll, Bradley Hintze, Li-Wei Hung, Swati Jain, Airlie J. McCoy, Nigel W. Moriarty, Robert D. Oeffner, Billy K. Poon, Mikhail G. Prisant, Randy J. Read, Jane S. Richardson, David C. Richardson, Michael D. Sammito, Oleg V. Sobolev, Daniel H. Stockwell, Th...

  24. [24]

    doi: 10.1107/S2059798319011471

  25. [25]

    Sai Advaith Maddipatla, Nadav Bojan, Meital Bojan, Sanketh Vedula, Ailie Marx, Paul Schanda, and Alexander Bronstein. Inverse problems with experiment-guided alphafold. In ICLR 2025 Workshop on Generative and Experimental Perspectives for Biomolecular Design, 2025. URL https://openreview.net/forum?id=1gp130uxfw

  26. [26]

    Airlie J McCoy, Massimo D Sammito, and Randy J Read. Implications of alphafold2 for crystallographic phasing by molecular replacement. Acta Crystallographica Section D: Structural Biology, 78(1):1–13, 2022. doi: 10.1107/S2059798321012122

  27. [27]

    G. N. Murshudov, A. A. Vagin, and E. J. Dodson. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallographica Section D: Biological Crystallography, 53(3):240–255, 1997. doi: 10.1107/S0907444996012255

  28. [28]

    G. N. Murshudov, P. Skubák, A. A. Lebedev, N. S. Pannu, R. A. Steiner, R. A. Nicholls, M. D. Winn, F. Long, and A. A. Vagin. Refmac5 for the refinement of macromolecular crystal structures. Acta Crystallographica Section D: Biological Crystallography, 67(4):355–367,

  29. [29]

    doi: 10.1107/S0907444911001314

  30. [30]

    Robert D Oeffner, Tristan I Croll, Claudia Millán, Billy K Poon, Christopher J Schlicksup, Randy J Read, and Tom C Terwilliger. Putting alphafold models to work with phenix.process_predicted_model and isolde. Biological Crystallography, 78(11):1303–1314, 2022

  31. [31]

    Saro Passaro, Gabriele Corso, Jeremy Wohlwend, Mateo Reveiz, Stephan Thaler, Vignesh Ram Somnath, Noah Getz, Tally Portnoi, Julien Roy, Hannes Stark, David Kwabi-Addo, Dominique Beaini, Tommi Jaakkola, and Regina Barzilay. Boltz-2: Towards accurate and efficient binding affinity prediction. bioRxiv, 2025

  32. [32]

    Rishwanth Raghu, Axel Levy, Gordon Wetzstein, and Ellen D. Zhong. Multiscale guidance of protein structure prediction with heterogeneous cryo-em data. In NeurIPS, 2025

  33. [33]

    Randy J. Read. Structure-factor probabilities for related structures. Acta Crystallographica Section A, 46(11):900–912, 1990. doi: 10.1107/S0108767390005529

  34. [34]

    Randy J Read and Airlie J McCoy. A log-likelihood-gain intensity target for crystallographic phasing that accounts for experimental error. Biological Crystallography, 72(3):375–387, 2016

  35. [35]

    Gale Rhodes. Crystallography Made Crystal Clear: A Guide for Users of Macromolecular Models. Complementary Science. Academic Press, Amsterdam, 3rd edition, 2006. ISBN 978-0-12-587073-3

  36. [36]

    Bernhard Rupp. Biomolecular Crystallography: Principles, Practice, and Application to Structural Biology. Garland Science, New York, 1st edition, 2009. ISBN 9780429258756

  37. [37]

    Ramachandran Srinivasan and Soundarajan Parthasarathy. Some Statistical Applications in X-ray Crystallography. Pergamon Press, Oxford, 1976. ISBN 0080180469

  38. [38]

    Richard A Stein and Hassane S Mchaourab. Speach_af: Sampling protein ensembles and conformational heterogeneity with alphafold2. PLoS Computational Biology, 18(8):e1010483, 2022

  39. [39]

    Garry Taylor. The phase problem. Biological Crystallography, 59(11):1881–1890, 2003

  40. [40]

    ByteDance AML AI4Science Team, Xinshi Chen, Yuxuan Zhang, Chan Lu, Wenzhi Ma, Jiaqi Guan, Chengyue Gong, Jincai Yang, Hanyu Zhang, Ke Zhang, et al. Protenix-advancing structure prediction through a comprehensive alphafold3 reproduction. bioRxiv, pages 2025–01, 2025

  41. [41]

    Thomas C. Terwilliger et al. Alphafold predictions are valuable hypotheses and accelerate but do not replace experimental structure determination. Nature Methods, 2023. doi: 10.1038/s41592-023-02031-9

  42. [42]

    Wei Wang, Zhening Gong, and Wayne A Hendrickson. Alphafold-guided molecular replacement for solving challenging crystal structures. Acta Crystallographica Section D: Structural Biology, 81:4–21, 2025. doi: 10.1107/S2059798324011999

  43. [43]

    Hannah K Wayment-Steele, Adedolapo Ojoawo, Renee Otten, Julia M Apitz, Warintra Pitsawong, Marc Hömberger, Sergey Ovchinnikov, Lucy Colwell, and Dorothee Kern. Predicting multiple conformations via sequence clustering and alphafold2. Nature, 625(7996):832–839, 2024

  44. [44]

    Graeme Winter. xia2: an expert system for macromolecular crystallography data reduction. Journal of Applied Crystallography, 43(1):186–190, 2010. doi: 10.1107/S0021889809045701

  45. [45]

    Graeme Winter, David G Waterman, James M Parkhurst, Aaron S Brewster, Richard J Gildea, Markus Gerstel, Luis Fuentes-Montero, Melanie Vollmar, Tara Michels-Clark, Iris D Young, et al. Dials: implementation and evaluation of a new integration package. Biological Crystallography, 74(2):85–97, 2018

  46. [46]

    Jeremy Wohlwend, Gabriele Corso, Saro Passaro, Mateo Reveiz, Ken Leidal, Wojtek Swiderski, Tally Portnoi, Itamar Chinn, Jacob Silterra, Tommi Jaakkola, and Regina Barzilay. Boltz-1: Democratizing biomolecular interaction modeling. bioRxiv, 2024. doi: 10.1101/2024.11.19.624167. URL https://www.biorxiv.org/content/10.1101/2024.11.19.624167v2

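  47. [47]

    Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, and Yang Song. Improving diffusion inverse problem solving with decoupled noise annealing. In International Conference on Learning Representations, 2025.

    Text captured after the last reference: A Technical Appendices and Supplementary Material. A.1 Rice distribution likelihood. This section provides the derivation and d...

    Equation fragment captured with this entry (the start of Eq. (11) and the right-hand side of the next definition are cut off in the source):
    $\cdots \left[ -\dfrac{|E_o(\vec{h})|^2 + \sigma_A^2 |E_c(\vec{h})|^2}{\Sigma_{\vec{h}}^2} \right] I_0\!\left( \dfrac{2 \sigma_A |E_o(\vec{h})|\,|E_c(\vec{h})|}{\Sigma_{\vec{h}}^2} \right), \quad (11)$
    $p_c\!\left( |E_o(\vec{h})| ; |E_c(\vec{h})| \right) = \cdots$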