
arxiv: 2602.06020 · v3 · pith:NTUPU3ZYnew · submitted 2026-02-05 · 💻 cs.LG · q-bio.BM

Two Stages of Folding: Convergent Mechanisms in AI Protein Folding Trunks

keywords pairwise, folding, features, models, trunks, across, blocks, charge

How do protein structure prediction models fold proteins? We investigate this question through causal interventions on the folding trunks of ESMFold, OpenFold, and Boltz-1. Across all three models, we find a shared two-stage computational structure. In the first stage, early blocks initialize pairwise biochemical signals: features like charge propagate from sequence into pairwise representations through architecture-specific pathways. In the second stage, late blocks develop pairwise spatial features: distance and contact information accumulate in the pairwise representation. We verify these mechanisms causally by showing that steering charge and distance features induces predictable structural changes. Furthermore, these representations are functionally interchangeable: pairwise states can be linearly aligned and substituted across models. Together, these results suggest that folding trunks with different architectures, inputs, and training procedures converge on a shared representational organization for mapping sequence chemistry into spatial geometry.
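The claim that pairwise states can be linearly aligned across models can be illustrated with a minimal sketch. This is not the authors' procedure: it fits an ordinary least-squares affine map between two synthetic pairwise tensors. The arrays `pair_a` and `pair_b`, their dimensions, and the helper names `fit_linear_alignment` and `apply_alignment` are all hypothetical stand-ins, and in practice the tensors would come from the folding trunks of two real models.

```python
import numpy as np

def fit_linear_alignment(src, tgt):
    """Fit an affine least-squares map from src (N, d_src) to tgt (N, d_tgt)."""
    # Append a constant column so the map includes a bias term.
    src_aug = np.hstack([src, np.ones((src.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(src_aug, tgt, rcond=None)
    return weights  # shape (d_src + 1, d_tgt)

def apply_alignment(src, weights):
    """Map src features into the target model's feature space."""
    src_aug = np.hstack([src, np.ones((src.shape[0], 1))])
    return src_aug @ weights

# Synthetic stand-ins for pairwise representations (assumption: shape L x L x d).
rng = np.random.default_rng(0)
seq_len, d_a, d_b = 16, 8, 6
pair_a = rng.normal(size=(seq_len, seq_len, d_a))
true_map = rng.normal(size=(d_a, d_b))
pair_b = pair_a @ true_map + 0.5  # model B is an exact affine function of model A here

# Treat every residue pair (i, j) as one training example.
weights = fit_linear_alignment(pair_a.reshape(-1, d_a), pair_b.reshape(-1, d_b))
pair_b_hat = apply_alignment(pair_a.reshape(-1, d_a), weights).reshape(seq_len, seq_len, d_b)

# Substitution would pass pair_b_hat to model B's downstream layers in place of pair_b.
max_err = float(np.abs(pair_b_hat - pair_b).max())
```

Because the synthetic target is constructed as an exact affine function of the source, the fit recovers it up to floating-point error. With real trunks the residual would be nonzero, and the meaningful test is whether the substituted states still yield sensible structures.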
