pith. machine review for the scientific record.

arxiv: 2602.16548 · v2 · submitted 2026-02-18 · 💻 cs.LG

Recognition: 1 theorem link

· Lean Theorem

RIDER: 3D RNA Inverse Design with Reinforcement Learning-Guided Diffusion

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 21:16 UTC · model grok-4.3

classification 💻 cs.LG
keywords RNA inverse design · 3D structure · diffusion model · reinforcement learning · structural similarity · synthetic biology · sequence design · GNN

The pith

Reinforcement learning fine-tunes a diffusion model to design RNA sequences whose 3D folds match target structures far more closely than sequence-recovery methods allow.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents RIDER as a two-stage approach: first pre-train a graph-neural-network diffusion model, conditioned on a target 3D RNA structure, to generate candidate sequences; then refine it with policy-gradient reinforcement learning whose rewards come from four 3D self-consistency metrics. Earlier methods maximized recovery of the native sequence, but that metric does not guarantee structural fidelity, because many different sequences can fold into the same shape. By shifting the objective to direct structural similarity, RIDER reports more than doubling structural agreement on the chosen metrics while producing sequences that differ from the native one. This matters for synthetic biology because reliable 3D designs are needed to engineer RNAs that perform intended regulatory or catalytic roles in cells.

Core claim

RIDER first pre-trains a GNN-based generative diffusion model on target 3D structures and reaches a 9 percent gain in native sequence recovery over prior methods. It then applies an improved policy-gradient algorithm that uses four task-specific 3D self-consistency metrics as rewards. The resulting model improves structural similarity by more than 100 percent across all four metrics and yields designed sequences that are measurably different from the native sequences yet still fold more consistently with the target.

What carries the argument

Policy-gradient fine-tuning of a pre-trained GNN diffusion model driven by four 3D self-consistency reward functions
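The mechanism can be illustrated with a toy REINFORCE loop: sample sequences from a policy, score each with a structural reward, and push the policy toward high-reward samples. This is a minimal numpy sketch, not the paper's algorithm — the real reward is a 3D self-consistency score from a structure predictor, stood in for here by a simple match fraction, and all names (`reward_fn`, `logits`, `target`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for RL fine-tuning: a categorical "policy" over the four
# nucleotides at each of L positions, updated with REINFORCE.
L, VOCAB = 8, 4
target = rng.integers(0, VOCAB, size=L)  # pretend "structurally ideal" sequence

def reward_fn(seq):
    # Stand-in for a 3D self-consistency reward (e.g. GDT TS of the
    # refolded design); here just the fraction of matching positions.
    return float(np.mean(seq == target))

logits = np.zeros((L, VOCAB))
lr, batch_size = 0.5, 16
for step in range(400):
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    # Sample a batch of sequences from the current policy.
    batch = np.array([[rng.choice(VOCAB, p=probs[i]) for i in range(L)]
                      for _ in range(batch_size)])
    rewards = np.array([reward_fn(s) for s in batch])
    adv = rewards - rewards.mean()          # baseline-subtracted advantage
    for seq, a in zip(batch, adv):          # REINFORCE gradient step
        onehot = np.eye(VOCAB)[seq]
        logits += (lr / batch_size) * a * (onehot - probs)

best = probs.argmax(axis=1)
print("reward of greedy design:", reward_fn(best))
```

The point of the sketch is the shift of objective: nothing above rewards matching a native sequence, only the (surrogate) structural score, which is exactly the substitution RIDER's RL stage makes.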

If this is right

  • Native sequence recovery rises by 9 percent relative to prior state-of-the-art methods.
  • Structural similarity more than doubles on every self-consistency metric tested.
  • The method produces RNA sequences that differ from the native sequence while still improving structural match.
  • Optimization occurs directly on folding consistency rather than on sequence identity alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The designs could shorten the experimental validation loop in RNA therapeutics by supplying sequences already closer to the desired fold.
  • Scaling the same reward-driven fine-tuning to larger or multi-domain RNAs would test whether the reported gains persist outside the current test set.
  • Replacing one or more of the computational rewards with direct experimental measurements could close the remaining gap between predicted and actual folding.

Load-bearing premise

The four chosen 3D self-consistency metrics are reliable stand-ins for whether the designed sequence will fold correctly and function as intended in practice.

What would settle it

An experimental folding assay or functional assay on the output sequences that shows they do not achieve the reported structural similarity or biological activity.

Figures

Figures reproduced from arXiv: 2602.16548 by Biao Luo, Ke Li, Tianmeng Hu, Yongzheng Cui.

Figure 1. Visualization of sequences (a), (b), and (c), and their corresponding 3D structures (α), (β), and (γ) predicted by RhoFold [42]. Although sequences (a) and (b) differ by only 3 nucleotides, and (b) and (c) by 5 nucleotides, their folded structures exhibit clear differences.

Figure 2. Overview of the RIDER framework. RNA tertiary structures are processed by a GVP-GNN encoder to produce structural embeddings. These embeddings condition the diffusion model for sequence generation, which is further optimized by RL to maximize structural similarity.

Figure 3. Results of supervised learning pre-training. A. Comparison of native sequence recovery on the test set. RIDE is compared against RiboDiffusion and gRNAde. The best NSR among 16 designs per target (sampled at temperature 0.1) is reported. B. Relationship between NSR and structural similarity (GDT TS and RMSD) for RIDE designs. Color denotes RMSD. C. Correlation among GDT TS, TM-score, and RMSD for the designs.

Figure 4. Results of reinforcement learning fine-tuning. A. GDT TS comparison on 14 RNA structures of interest [9] for gRNAde, RIDE (pre-trained), and RIDER (fine-tuned with Rgdt rmsd). B. Comparison of native sequence recovery before (RIDE) and after (RIDER) RL fine-tuning. Color indicates GDT TS after RL fine-tuning. The results for the other two metrics are provided in Appendix D.4.

Figure 5. Visualization of designed examples. Structures folded from sequences generated by RIDER are shown in color, while the target structures are shown in semi-transparent yellow.

Figure 6. The proposed framework consists of two stages: RIDE (pre-training via supervised learning) and RIDER (fine-tuning via reinforcement learning). A multi-layer GVP-GNN encoder-decoder backbone is used to iteratively denoise RNA sequences conditioned on the target 3D structure. Structural fidelity is evaluated using a separate RNA structure predictor and fed back through a reward function.

Figure 7. Comparison of RIDER, RIDER-RWD, and RIDER-ADV on the GDT TS metric across the test set. Bar heights represent the mean, and red error bars indicate the Standard Error of the Mean (SEM).

Figure 8. Comparison of RIDER, RIDER-RWD, and RIDER-ADV on the RMSD metric across the test set. Bar heights represent the mean, and red error bars indicate the SEM.

Figure 9. Comparison of RIDER, RIDER-RWD, and RIDER-ADV on the TM-score metric across the test set. Bar heights represent the mean, and red error bars indicate the SEM.

Figure 10. GDT TS scores of RIDER, RIDER-RWD, and RIDER-ADV on 14 target RNA structures.

Figure 11. RMSD scores of RIDER, RIDER-RWD, and RIDER-ADV on 14 target RNA structures. Lower values indicate better structural alignment.

Figure 12. TM-score values of RIDER, RIDER-RWD, and RIDER-ADV on 14 target RNA structures. Higher values indicate better structural similarity.

Figure 13. Comparison of four reward functions (Rtm, Rrmsd, Rgdt, Rgdt rmsd) on the GDT TS metric across the test set. Bar heights represent the mean, and red error bars indicate the SEM.

Figure 14. Comparison of four reward functions on the RMSD metric across the test set. Bar heights represent the mean, and red error bars indicate the SEM.

Figure 15. Comparison of four reward functions on the TM-score metric across the test set. Bar heights represent the mean, and red error bars indicate the SEM.

Figure 16. RMSD comparison of gRNAde, RIDE, and RIDER on 14 RNA structures of interest. Lower RMSD indicates better structural alignment.

Figure 17. TM-score comparison of gRNAde, RIDE, and RIDER on 14 RNA structures of interest. Higher TM-score reflects better global fold similarity.
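The three global metrics the captions compare (RMSD, GDT TS, TM-score) can be sketched in a few lines of numpy. This is a simplified reading, not the paper's implementation: GDT TS here uses a single Kabsch superposition (the official GDT searches many), and d0 in the TM-score follows the protein convention as an assumption (RNA-specific variants use a different d0 formula).

```python
import numpy as np

def superpose(P, Q):
    # Kabsch algorithm [26]: optimal rotation of P onto Q after centering.
    Pc, Qc = P - P.mean(axis=0), Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = np.sign(np.linalg.det(U @ Vt))     # avoid improper rotations
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    return Pc @ R, Qc

def rmsd(P, Q):
    # Average atomic displacement after optimal superposition.
    A, B = superpose(P, Q)
    return np.sqrt(((A - B) ** 2).sum(axis=1).mean())

def gdt_ts(P, Q, cuts=(1.0, 2.0, 4.0, 8.0)):
    # Mean fraction of residues within each distance cutoff.
    A, B = superpose(P, Q)
    d = np.sqrt(((A - B) ** 2).sum(axis=1))
    return float(np.mean([(d <= c).mean() for c in cuts]))

def tm_score(P, Q):
    # Length-normalized similarity; d0 grows with length so the score
    # is comparable across structures of different sizes.
    L = len(Q)
    d0 = max(1.24 * (L - 15) ** (1 / 3) - 1.8, 0.5)
    A, B = superpose(P, Q)
    d = np.sqrt(((A - B) ** 2).sum(axis=1))
    return float(np.mean(1.0 / (1.0 + (d / d0) ** 2)))

# Sanity check: a rigid-body copy (rotation + translation) of the same
# backbone scores perfectly on all three metrics.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
Y = X @ Rz + 5.0
print(rmsd(X, Y), gdt_ts(X, Y), tm_score(X, Y))
```

This also makes the figures' reading order concrete: lower is better only for RMSD, while GDT TS and TM-score are bounded in [0, 1] with higher meaning a closer fold.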
read the original abstract

The inverse design of RNA three-dimensional (3D) structures is crucial for engineering functional RNAs in synthetic biology and therapeutics. While recent deep learning approaches have advanced this field, they are typically optimized and evaluated using native sequence recovery, which is a limited surrogate for structural fidelity, since different sequences can fold into similar 3D structures and high recovery does not necessarily indicate correct folding. To address this limitation, we propose RIDER, an RNA Inverse DEsign framework with Reinforcement learning that directly optimizes for 3D structural similarity. First, we develop and pre-train a GNN-based generative diffusion model conditioned on the target 3D structure, achieving a 9% improvement in native sequence recovery over state-of-the-art methods. Then, we fine-tune the model with an improved policy gradient algorithm using four task-specific reward functions based on 3D self-consistency metrics. Experimental results show that RIDER improves structural similarity by over 100% across all metrics and discovers designs that are distinct from native sequences.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces RIDER, a two-stage framework for 3D RNA inverse design. A GNN-based diffusion model is first pre-trained to generate sequences conditioned on target 3D structures, yielding a 9% gain in native sequence recovery over prior methods. The model is then fine-tuned via an improved policy-gradient RL algorithm whose rewards are four 3D self-consistency metrics; the resulting designs are reported to improve structural similarity by >100% on the same metrics while producing sequences distinct from the native ones.

Significance. If the self-consistency metrics prove to be faithful proxies for functional folding, RIDER would represent a meaningful advance by shifting optimization from sequence recovery to direct structural fidelity. The pre-training improvement and the explicit use of RL to escape native-sequence bias are positive elements. The significance is currently conditional on independent validation that higher metric scores translate to improved biological function.

major comments (2)
  1. [Abstract] The central claim of >100% improvement in structural similarity is evaluated on the identical four 3D self-consistency metrics that serve as RL rewards. Because the evaluation is performed on the reward functions themselves, the reported gains are expected by construction; the manuscript must demonstrate that these metrics correlate with functional properties (binding, catalysis, or independent folding free-energy calculations) outside the RL loop.
  2. [Experimental section, assumed §4] No ablation isolating the contribution of the RL fine-tuning stage versus the pre-trained diffusion model alone is described, nor are statistical tests or confidence intervals provided for the >100% structural gains. Without these controls it is impossible to attribute the improvement specifically to the RL component or to rule out overfitting to the reward metrics.
minor comments (1)
  1. [Abstract] The phrase "improved policy gradient algorithm" is used without reference to the specific variant or modification; a brief parenthetical description or citation would improve clarity.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the detailed and constructive comments. We agree that additional controls are needed to strengthen attribution of improvements to the RL stage and to better contextualize the self-consistency metrics. We will revise the manuscript to incorporate the requested ablation study, statistical analyses, and expanded discussion while clarifying the scope of the current computational work.

read point-by-point responses
  1. Referee: [Abstract] The central claim of >100% improvement in structural similarity is evaluated on the identical four 3D self-consistency metrics that serve as RL rewards. Because the evaluation is performed on the reward functions themselves, the reported gains are expected by construction; the manuscript must demonstrate that these metrics correlate with functional properties (binding, catalysis, or independent folding free-energy calculations) outside the RL loop.

    Authors: We acknowledge that the >100% structural gains are measured on the same four self-consistency metrics used as RL rewards; this is by design because the framework deliberately shifts optimization from native sequence recovery (addressed in pre-training) to direct structural fidelity. The RL stage enables discovery of non-native sequences that achieve higher structural scores, which would not be possible under a pure recovery objective. In the revision we will add a discussion section citing prior literature on the correlation of these metrics with folding accuracy in RNA structure prediction benchmarks. However, new independent functional validation (e.g., binding assays or free-energy calculations outside the training loop) lies beyond the scope of this computational study. revision: partial

  2. Referee: [Experimental section, assumed §4] No ablation isolating the contribution of the RL fine-tuning stage versus the pre-trained diffusion model alone is described, nor are statistical tests or confidence intervals provided for the >100% structural gains. Without these controls it is impossible to attribute the improvement specifically to the RL component or to rule out overfitting to the reward metrics.

    Authors: We agree that an explicit ablation and statistical reporting are required. The revised manuscript will include a new ablation table comparing the pre-trained diffusion model alone against the RL-fine-tuned model on all four structural metrics. We will also add paired statistical tests (e.g., Wilcoxon signed-rank) with p-values and 95% confidence intervals for the reported gains to quantify the RL contribution and address overfitting concerns. revision: yes
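The kind of paired analysis promised above can be sketched with numpy alone (scipy's `wilcoxon` would supply the signed-rank test itself). The per-target scores below are synthetic placeholders, not the paper's results; only the shape of the analysis — paired differences on the same targets, with a bootstrap 95% confidence interval — is the point.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic per-target GDT TS scores (placeholders, NOT the paper's numbers):
# the pre-trained model (RIDE) and the fine-tuned model (RIDER) evaluated on
# the same 14 targets, so the comparison is paired.
ride = rng.uniform(0.2, 0.5, size=14)
rider = ride + rng.uniform(0.1, 0.4, size=14)  # simulate an improvement

diff = rider - ride
# Nonparametric bootstrap of the mean paired gain.
boot = np.array([rng.choice(diff, size=diff.size, replace=True).mean()
                 for _ in range(10_000)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"mean gain {diff.mean():.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

A CI that excludes zero (as it does here by construction) is what would let the revised manuscript attribute the gain to the RL stage rather than to noise across the 14 targets.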

standing simulated objections not resolved
  • Demonstrating that the self-consistency metrics correlate with functional properties (binding, catalysis, or independent folding free-energy calculations) outside the RL loop

Circularity Check

0 steps flagged

No significant circularity in claimed results or derivation chain

full rationale

The paper describes an empirical pipeline: pre-train a GNN diffusion model for native sequence recovery, then apply RL fine-tuning with four 3D self-consistency metrics as rewards, and report improvements on those same metrics versus baselines. No equations, derivations, or first-principles steps are shown that reduce the reported gains to self-referential definitions or fitted inputs by construction. The >100% structural-similarity claim is framed as an outcome of applying the RL procedure to held-out test cases, which is a standard non-circular empirical result rather than a logical reduction to the inputs. The method is evaluated against external benchmarks and does not rely on load-bearing self-citations or ansatzes that collapse into the target claim.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the method description implies standard diffusion and RL components without additional postulates.

pith-pipeline@v0.9.0 · 5480 in / 970 out tokens · 17500 ms · 2026-05-15T21:16:44.334102+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · 3 internal anchors

  1. [1]

    Accurate structure prediction of biomolecular interactions with alphafold 3.Nature, 630(8016):493–500, 2024

    Josh Abramson, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, Lindsay Willmore, Andrew J Ballard, Joshua Bambrick, et al. Accurate structure prediction of biomolecular interactions with alphafold 3.Nature, 630(8016):493–500, 2024

  2. [2]

    Rnasolo: a repository of cleaned pdb-derived rna 3d structures.Bioinformatics, 38(14):3668–3670, 2022

    Bartosz Adamczyk, Maciej Antczak, and Marta Szachniuk. Rnasolo: a repository of cleaned pdb-derived rna 3d structures.Bioinformatics, 38(14):3668–3670, 2022

  3. [3]

    Ali, Abhinav Mittal, and David H

    Sara E. Ali, Abhinav Mittal, and David H. Mathews. RNA secondary structure analysis using RNAstructure.Curr. Protoc., 3(7):e846, 2023

  4. [4]

    Fejes, Frank Hutter, Holger H

    Mirela Andronescu, Anthony P. Fejes, Frank Hutter, Holger H. Hoos, and Anne Condon. A new algorithm for RNA secondary structure design.J. Mol. Biol., 336(3):607–24, 2004

  5. [5]

    Rock, scissors, paper: How rna structure informs function.The Plant Cell, 35(6):1671–1707, 2023

    Sarah M Assmann, Hong-Li Chou, and Philip C Bevilacqua. Rock, scissors, paper: How rna structure informs function.The Plant Cell, 35(6):1671–1707, 2023

  6. [6]

    Constitutional AI: Harmlessness from AI Feedback

    Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, et al. Constitutional ai: Harmlessness from ai feedback.arXiv preprint arXiv:2212.08073, 2022. /book-open10/27

  7. [7]

    Training diffusion models with reinforcement learning

    Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, and Sergey Levine. Training diffusion models with reinforcement learning. InThe Twelfth International Conference on Learning Rep- resentations, 2024

  8. [8]

    INFO-RNA–a fast approach to inverse RNA folding.Bioinfor- matics, 22(15):1823–31, 2006

    Anke Busch and Rolf Backofen. INFO-RNA–a fast approach to inverse RNA folding.Bioinfor- matics, 22(15):1823–31, 2006

  9. [9]

    Atomic accuracy in predicting and designing noncanonical rna structure.Nature methods, 7(4):291–294, 2010

    Rhiju Das, John Karanicolas, and David Baker. Atomic accuracy in predicting and designing noncanonical rna structure.Nature methods, 7(4):291–294, 2010

  10. [10]

    Robust deep learning–based protein sequence design using proteinmpnn.Science, 378(6615):49–56, 2022

    Justas Dauparas, Ivan Anishchenko, Nathaniel Bennett, Hua Bai, Robert J Ragotte, Lukas F Milles, Basile IM Wicky, Alexis Courbet, Rob J de Haas, Neville Bethel, et al. Robust deep learning–based protein sequence design using proteinmpnn.Science, 378(6615):49–56, 2022

  11. [11]

    Dykstra, Matias Kaplan, and Christina D

    Peter B. Dykstra, Matias Kaplan, and Christina D. Smolke. Engineering synthetic RNA devices for cell control.Nat. Rev. Genet, 23:215–228, 2022

  12. [12]

    Solving the RNA design problem with reinforcement learning.PLoS computational biology, 14(6):e1006176, 2018

    Peter Eastman, Jade Shi, Bharath Ramsundar, and Vijay S Pande. Solving the RNA design problem with reinforcement learning.PLoS computational biology, 14(6):e1006176, 2018

  13. [13]

    ERD: a fast and reliable tool for RNA design including constraints.BMC Bioinformatics, 16(20), 2015

    Ali Esmaili-Taheri and Mohammad Ganjtabesh. ERD: a fast and reliable tool for RNA design including constraints.BMC Bioinformatics, 16(20), 2015

  14. [14]

    Multi-objective genetic algorithm for pseudoknotted RNA sequence design.Bioinformatics, 30(9):1250–1258, 2014

    Ali Esmaili-Taheri, Mohammad Ganjtabesh, and Morteza Mohammad-Noori. Multi-objective genetic algorithm for pseudoknotted RNA sequence design.Bioinformatics, 30(9):1250–1258, 2014

  15. [15]

    Dpok: Reinforcement learning for fine-tuning text-to-image diffusion models.Advances in Neural Information Processing Systems, 36:79858–79885, 2023

    Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, and Kimin Lee. Dpok: Reinforcement learning for fine-tuning text-to-image diffusion models.Advances in Neural Information Processing Systems, 36:79858–79885, 2023

  16. [16]

    RNAiFOLD: a constraint programming algorithm for RNA inverse folding and molecular design.J

    Juan Antonio Garcia-Martin, Peter Clote, and Ivan Dotu. RNAiFOLD: a constraint programming algorithm for RNA inverse folding and molecular design.J. Bioinform. Comput. Biol., 11(2), 2013

  17. [17]

    RNAiFold 2.0: a web server and software to design custom and Rfam-based RNA molecules.Nucleic Acids Res., 43(W1):W513– 21, 2015

    Juan Antonio Garcia-Martin, Peter Clote, and Ivan Dotu. RNAiFold 2.0: a web server and software to design custom and Rfam-based RNA molecules.Nucleic Acids Res., 43(W1):W513– 21, 2015

  18. [18]

    DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

    Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning.arXiv preprint arXiv:2501.12948, 2025

  19. [19]

    The emerging field of RNA nanotechnology.Nat

    Peixuan Guo. The emerging field of RNA nanotechnology.Nat. Nanotechnol, 5:833–842, 2010

  20. [20]

    Denoising diffusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020

    Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020

  21. [21]

    I. L. Hofacker, W. Fontana, P. F. Stadler, L. S. Bonhoeffer, M. Tacker, and P. Schuster. Fast folding and comparison of RNA secondary structures.Monatsh. Chem., 125:167–188, 1994

  22. [22]

    Fast folding and comparison of RNA secondary structures.Monatshefte fur chemie, 125:167–167, 1994

    Ivo L Hofacker, Walter Fontana, Peter F Stadler, L Sebastian Bonhoeffer, Manfred Tacker, Peter Schuster, et al. Fast folding and comparison of RNA secondary structures.Monatshefte fur chemie, 125:167–167, 1994

  23. [23]

    Ribodiffusion: tertiary structure- based rna inverse folding with generative diffusion models.Bioinformatics, 40(Supplement 1): i347–i356, 2024

    Han Huang, Ziqian Lin, Dongchen He, Liang Hong, and Yu Li. Ribodiffusion: tertiary structure- based rna inverse folding with generative diffusion models.Bioinformatics, 40(Supplement 1): i347–i356, 2024. /book-open11/27

  24. [24]

    Learning from protein structure with geometric vector perceptrons

    Bowen Jing, Stephan Eismann, Patricia Suriana, Raphael John Lamarre Townshend, and Ron Dror. Learning from protein structure with geometric vector perceptrons. InInternational Con- ference on Learning Representations, 2021

  25. [25]

    grnade: Geometric deep learning for 3d rna inverse design

    Chaitanya K Joshi, Arian R Jamasb, Ramon Vi˜ nas, Charles Harris, Simon V Mathis, Alex Morehead, Rishabh Anand, and Pietro Li` o. grnade: Geometric deep learning for 3d rna inverse design. InProc. International Conference on Learning Representations, 2025

  26. [26]

    A solution for the best rotation to relate two sets of vectors.Foundations of Crystallography, 32(5):922–923, 1976

    Wolfgang Kabsch. A solution for the best rotation to relate two sets of vectors.Foundations of Crystallography, 32(5):922–923, 1976

  27. [27]

    Champion-level drone racing using deep reinforcement learning.Nature, 620 (7976):982–987, 2023

    Elia Kaufmann, Leonard Bauersfeld, Antonio Loquercio, Matthias M¨ uller, Vladlen Koltun, and Davide Scaramuzza. Champion-level drone racing using deep reinforcement learning.Nature, 620 (7976):982–987, 2023

  28. [28]

    antaRNA: ant colony-based RNA sequence design.Bioinformatics, 31(19):3114–3121, 2015

    Robert Kleinkauf, Martin Mann, and Rolf Backofen. antaRNA: ant colony-based RNA sequence design.Bioinformatics, 31(19):3114–3121, 2015

  29. [29]

    Macromolecular modeling and design in rosetta: recent methods and frameworks.Nature methods, 17(7):665–680, 2020

    Julia Koehler Leman, Brian D Weitzner, Steven M Lewis, Jared Adolf-Bryfogle, Nawsad Alam, Rebecca F Alford, Melanie Aprahamian, David Baker, Kyle A Barlow, Patrick Barth, et al. Macromolecular modeling and design in rosetta: recent methods and frameworks.Nature methods, 17(7):665–680, 2020

  30. [30]

    Integrating end-to-end learning with deep geometrical potentials for ab initio rna structure pre- diction.Nature Communications, 14(1):5745, 2023

    Yang Li, Chengxin Zhang, Chenjie Feng, Robin Pearce, P Lydia Freddolino, and Yang Zhang. Integrating end-to-end learning with deep geometrical potentials for ab initio rna structure pre- diction.Nature Communications, 14(1):5745, 2023

  31. [31]

    From sentences to sequences: Rethinking languages in biological system.arXiv preprint arXiv:2507.00953, 2025

    Ke Liu, Shuaike Shen, and Hao Chen. From sentences to sequences: Rethinking languages in biological system.arXiv preprint arXiv:2507.00953, 2025

  32. [32]

    ViennaRNA package 2.0.Algorithms for molecular biology, 6:1–14, 2011

    Ronny Lorenz, Stephan H Bernhart, Christian H¨ oner zu Siederdissen, Hakim Tafer, Christoph Flamm, Peter F Stadler, and Ivo L Hofacker. ViennaRNA package 2.0.Algorithms for molecular biology, 6:1–14, 2011

  33. [33]

    Markham and Michael Zuker

    Nicholas R. Markham and Michael Zuker. UNAFold: software for nucleic acid folding and hy- bridization.Methods Mol Biol., 453:3–31, 2008

  34. [34]

    Shape-guided rna structure homology search and motif discovery.Nature Communications, 13(1):1722, 2022

    Edoardo Morandi, Martijn J van Hemert, and Danny Incarnato. Shape-guided rna structure homology search and motif discovery.Nature Communications, 13(1):1722, 2022

  35. [35]

    Training language models to follow instructions with human feedback.Advances in neural information processing systems, 35:27730– 27744, 2022

    Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback.Advances in neural information processing systems, 35:27730– 27744, 2022

  36. [36]

    Rna regulations and functions decoded by transcriptome-wide rna structure probing.Genomics, proteomics & bioinformatics, 15(5):267– 278, 2017

    Meiling Piao, Lei Sun, and Qiangfeng Cliff Zhang. Rna regulations and functions decoded by transcriptome-wide rna structure probing.Genomics, proteomics & bioinformatics, 15(5):267– 278, 2017

  37. [37]

    Automated 3d structure composition for large rnas.Nucleic acids research, 40(14):e112–e112, 2012

    Mariusz Popenda, Marta Szachniuk, Maciej Antczak, Katarzyna J Purzycka, Piotr Lukasiak, Natalia Bartol, Jacek Blazewicz, and Ryszard W Adamiak. Automated 3d structure composition for large rnas.Nucleic acids research, 40(14):e112–e112, 2012

  38. [38]

    Learning to design RNA

    Frederic Runge, Danny Stoll, Stefan Falkner, and Frank Hutter. Learning to design RNA. In Proceedings of the International Conference on Learning Representations, 2019

  39. [39]

    Proximal Policy Optimization Algorithms

    John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms.arXiv preprint arXiv:1707.06347, 2017. /book-open12/27

  40. [40]

    A decade of riboswitches.Cell, 152:17–24, 2013

    Alexander Serganov and Evgeny Nudler. A decade of riboswitches.Cell, 152:17–24, 2013

  41. [41]

    Ribozymes, riboswitches and beyond: regulation of gene expression without proteins.Nature Reviews Genetics, 8(10):776–790, 2007

    Alexander Serganov and Dinshaw J Patel. Ribozymes, riboswitches and beyond: regulation of gene expression without proteins.Nature Reviews Genetics, 8(10):776–790, 2007

  42. [42]

    Accurate rna 3d structure prediction using a language model-based deep learning approach.Nature Methods, pages 1–12, 2024

    Tao Shen, Zhihang Hu, Siqi Sun, Di Liu, Felix Wong, Jiuming Wang, Jiayang Chen, Yixuan Wang, Liang Hong, Jin Xiao, et al. Accurate rna 3d structure prediction using a language model-based deep learning approach.Nature Methods, pages 1–12, 2024

  [43] David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of Go without human knowledge. Nature, 550(7676):354–359, 2017.

  [44] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations, 2021.

  [45] Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2021.

  [46] Ryan J. Taft, Ken C. Pang, Timothy R. Mercer, Marcel Dinger, and John S. Mattick. Non-coding RNAs: regulators of disease. J. Pathol., 220(2):126–39, 2010.

  [47] Cheng Tan, Yijie Zhang, Zhangyang Gao, Bozhen Hu, Siyuan Li, Zicheng Liu, and Stan Z Li. RDesign: Hierarchical data-efficient representation learning for tertiary structure-based RNA design. In Proceedings of the International Conference on Learning Representations, 2024.

  [48] Cheng Tan, Yijie Zhang, Zhangyang Gao, Hanqun Cao, Siyuan Li, Siqi Ma, Mathieu Blanchette, and Stan Z Li. R3Design: deep tertiary structure-based RNA sequence design and beyond. Briefings in Bioinformatics, 26(1):bbae682, 2025.

  [49] Akito Taneda. Multi-objective genetic algorithm for pseudoknotted RNA sequence design. Front. Genet., 26:3–36, 2012.

  [50] Oriol Vinyals, Igor Babuschkin, Wojciech M Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H Choi, Richard Powell, Timo Ewalds, Petko Georgiev, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782):350–354, 2019.

  [51] Junqiao Wang, Zeng Zhang, Yangfan He, Yuyang Song, Tianyu Shi, Yuchen Li, Hengyuan Xu, Kunyu Wu, Guangwu Qian, Qiuwu Chen, et al. Enhancing code LLMs with reinforcement learning in code generation. arXiv preprint arXiv:2412.20367, 2024.

  [52] Wenkai Wang, Chenjie Feng, Renmin Han, Ziyi Wang, Lisha Ye, Zongyang Du, Hong Wei, Fa Zhang, Zhenling Peng, and Jianyi Yang. trRosettaRNA: automated prediction of RNA 3D structure with transformer network. Nature Communications, 14(1):7266, 2023.

  [53] Felix Wong, Dongchen He, Aarti Krishnan, Liang Hong, Alexander Z Wang, Jiuming Wang, Zhihang Hu, Satotaka Omori, Alicia Li, Jiahua Rao, et al. Deep generative design of RNA aptamers using structural predictions. Nature Computational Science, pages 1–11, 2024.

  [54] Adam Zemla, Česlovas Venclovas, John Moult, and Krzysztof Fidelis. Processing and analysis of CASP3 protein structure predictions. Proteins: Structure, Function, and Bioinformatics, 37(S3):22–29, 1999.

  [55] Jinsong Zhang, Yuhan Fei, Lei Sun, and Qiangfeng Cliff Zhang. Advances and opportunities in RNA structure experimental determination and computational modeling. Nature Methods, 19(10):1193–1207, 2022.

  [56] Yang Zhang and Jeffrey Skolnick. Scoring function for automated assessment of protein structure template quality. Proteins: Structure, Function, and Bioinformatics, 57(4):702–710, 2004.

  [57] Yinan Zhang, Eric Tzeng, Yilun Du, and Dmitry Kislyuk. Large-scale reinforcement learning for diffusion models. In European Conference on Computer Vision, pages 1–17. Springer, 2024.

  [58] Tianshuo Zhou, Ning Dai, Sizhen Li, Max Ward, David H Mathews, and Liang Huang. RNA design via structure-aware multifrontier ensemble optimization. Bioinformatics, 39(Supplement 1):i563–i571, 2023.

A Algorithm Pseudocode

This section provides detailed pseudocode for the key algorithmic components of our proposed method. Specifically, we ...