FoldSAE: Learning to Steer Protein Folding Through Sparse Representations

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

RFdiffusion is a popular and well-established model for generating protein structures. However, this generative process offers limited insight into its internal representations and how they contribute to the final protein structure. Concurrently, recent work in mechanistic interpretability has successfully used Sparse Autoencoders (SAEs) to discover interpretable features within neural networks. We combine these concepts by applying SAEs to the internal representations of RFdiffusion to uncover secondary structure-specific features and establish a relationship between them and generated protein structures. Building on these insights, we introduce a novel steering mechanism that enables precise control of secondary structure formation through a tunable hyperparameter, while simultaneously revealing interpretable block and neuron-level representations within RFdiffusion. Our work pioneers a new framework for making RFdiffusion more interpretable, demonstrating how understanding internal features can be directly translated into precise control over the protein design process.
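
The abstract does not spell out how the SAE or the steering mechanism is implemented, so the following is only a minimal sketch of one common way such a setup can look: a ReLU sparse autoencoder over an activation vector, with steering done by adding a scaled decoder direction to the activation. Every name here (`sae_encode`, `sae_decode`, `steer`, `lam`), the toy weights, and the choice of decoder-direction steering are assumptions made for illustration, not details taken from the paper or from RFdiffusion.

```python
# Hypothetical sketch of a sparse autoencoder (SAE) and activation steering.
# All weights, sizes, and names are illustrative assumptions, not the paper's.

def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def sae_encode(x, w_enc, b_enc, b_dec):
    """Feature activations: ReLU(W_enc (x - b_dec) + b_enc)."""
    centered = [a - b for a, b in zip(x, b_dec)]
    pre = [p + b for p, b in zip(matvec(w_enc, centered), b_enc)]
    return relu(pre)

def sae_decode(f, w_dec, b_dec):
    """Reconstruction: b_dec + sum_k f[k] * w_dec[k] (one row per feature)."""
    out = list(b_dec)
    for coeff, direction in zip(f, w_dec):
        for i, d in enumerate(direction):
            out[i] += coeff * d
    return out

def steer(x, w_dec, feature_idx, lam):
    """Shift activation x along one feature's decoder direction.

    lam plays the role of a tunable steering strength (an assumption here).
    """
    direction = w_dec[feature_idx]
    return [a + lam * d for a, d in zip(x, direction)]

# Toy setup: a 2-dimensional activation and 3 SAE features.
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B_ENC = [0.0, 0.0, 0.0]
W_DEC = [[1.0, 0.0], [0.0, 1.0], [-0.5, -0.5]]
B_DEC = [0.0, 0.0]

x = [2.0, -1.0]
features = sae_encode(x, W_ENC, B_ENC, B_DEC)          # only feature 0 is active
reconstruction = sae_decode(features, W_DEC, B_DEC)
steered = steer(x, W_DEC, feature_idx=1, lam=3.0)      # push toward feature 1
steered_features = sae_encode(steered, W_ENC, B_ENC, B_DEC)
```

In this toy case, feature 1 is inactive for `x` and becomes active after steering with `lam=3.0`, which is the kind of effect a steering hyperparameter is meant to control. In a real setting the activation would come from a chosen block of the model and the feature index from an analysis of which features track a given secondary structure.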

citing papers explorer

  • VFUSE: Virulent Feature Understanding with Sparse autoEncoders cs.LG · 2026-06-08 · unverdicted · none · ref 39 · internal anchor

    VFUSE applies sparse autoencoders to diffusion-transformer activations in RoseTTAFold3 and RFDiffusion3 to find monosemantic features that detect hazardous protein designs with AUROC up to 0.84.