
arxiv: 2511.02622 · v2 · pith:ASXCIV4Gnew · submitted 2025-11-04 · 🧬 q-bio.BM · physics.bio-ph · physics.comp-ph

Machine Learning for RNA Secondary Structure Prediction: a review of current methods and challenges

Pith reviewed 2026-05-21 19:23 UTC · model grok-4.3

classification 🧬 q-bio.BM · physics.bio-ph · physics.comp-ph
keywords RNA secondary structure prediction · machine learning · deep learning · generalization crisis · RNA foundation models · homology-aware benchmarking · pseudoknots · dynamic ensembles

The pith

Machine learning models for RNA secondary structure prediction fail to generalize to new families, prompting stricter homology-aware benchmarking and the rise of foundation models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This review charts the move from thermodynamic calculations to data-driven machine learning and deep learning methods that learn RNA folding patterns directly from examples. These approaches have delivered clear accuracy gains on structures similar to those in their training sets. The central observation is that even the strongest models break down when presented with RNA families absent from the training data. This generalization failure has driven the adoption of evaluation protocols that prevent training and test sequences from sharing an evolutionary family. To tackle the root problem of scarce labeled structures, the field has begun training foundation models on large unlabeled RNA sequence collections. The review also flags open problems such as pseudoknot prediction and the modeling of structural dynamics.

Core claim

The authors establish that RNA secondary structure prediction has entered a data-driven era dominated by machine learning models, yet these models exhibit a generalization crisis when applied to RNA families not represented in their training data. This crisis stems from overfitting to limited and homologous examples. The response has been a shift to homology-aware benchmarking that prevents leakage between training and test sets. To overcome data scarcity, RNA foundation models are emerging that learn from massive unlabeled sequence corpora. The review further identifies persistent hurdles, including accurate prediction of pseudoknots, scaling to long transcripts, incorporation of modified nucleotides, and prediction of dynamic structural ensembles rather than static structures.
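The idea of homology-aware benchmarking can be illustrated with a minimal sketch. It assumes every sequence already carries a family label (for example an Rfam-style family name); the function name `family_split` and the toy records are hypothetical and not taken from the paper. Real protocols may additionally cluster by sequence or structure similarity, which this sketch does not attempt.

```python
import random
from collections import defaultdict


def family_split(records, test_fraction=0.2, seed=0):
    """Split (sequence_id, family_id) records so that no family spans both sets."""
    by_family = defaultdict(list)
    for seq_id, family in records:
        by_family[family].append(seq_id)

    # Shuffle whole families, not individual sequences, to avoid leakage.
    families = sorted(by_family)
    random.Random(seed).shuffle(families)
    n_test = max(1, round(len(families) * test_fraction))
    test_families = set(families[:n_test])

    train = [s for f in families if f not in test_families for s in by_family[f]]
    test = [s for f in families if f in test_families for s in by_family[f]]
    return train, test, test_families


# Toy data: identifiers and family names are illustrative assumptions.
records = [
    ("seq1", "tRNA"), ("seq2", "tRNA"), ("seq3", "5S_rRNA"),
    ("seq4", "riboswitch"), ("seq5", "riboswitch"), ("seq6", "SRP"),
]
train, test, held_out = family_split(records, test_fraction=0.25)
print("held-out families:", held_out)
print("train:", train, "test:", test)
```

Splitting at the family level means a model is always scored on families it has never seen, which is the condition under which the reported generalization failures appear.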

What carries the argument

The generalization crisis, that is, the observed failure of high-accuracy models on new RNA families, drives the call for homology-aware evaluation and for foundation models trained on unlabeled sequences.
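One common way to learn from unlabeled sequences is masked-token prediction in the style of BERT. The summary above does not state which pretraining objective the surveyed foundation models use, so the following is only an assumed illustration: the mask symbol `_`, the 15% mask rate, and the function name `mask_sequence` are all hypothetical choices.

```python
import random

MASK = "_"  # hypothetical mask symbol; real models use a dedicated mask token


def mask_sequence(seq, mask_rate=0.15, seed=0):
    """Hide a random subset of nucleotides and return (corrupted, targets).

    targets maps each masked position to the original nucleotide, which is
    what a model would be trained to recover from the surrounding context.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(seq) * mask_rate))
    positions = sorted(rng.sample(range(len(seq)), n_mask))

    corrupted = list(seq)
    targets = {}
    for pos in positions:
        targets[pos] = seq[pos]
        corrupted[pos] = MASK
    return "".join(corrupted), targets


corrupted, targets = mask_sequence("GGGAAACUUCGGUUUCCC")
print(corrupted)
print(targets)
```

No structure labels are needed to build such training pairs, which is why this kind of objective suits the data-scarce setting described here.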

If this is right

  • Homology-aware benchmarking will give more trustworthy estimates of how well models will perform on novel sequences.
  • RNA foundation models trained on unlabeled data will reduce reliance on scarce labeled structures and improve performance across families.
  • Future methods must incorporate handling of pseudoknots and kilobase-scale transcripts to become broadly useful.
  • Shifting the prediction target to dynamic structural ensembles will align computational outputs more closely with biological function.
  • A standardized prospective benchmarking system will reduce biased validation and speed reliable progress.
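To make the pseudoknot point above concrete: a structure contains a pseudoknot when two base pairs (i, j) and (k, l) cross, i.e. i < k < j < l. The sketch below detects such crossings in an extended dot-bracket string. The use of `[]` and `{}` for crossing pairs is a widespread notational convention assumed here, and the function names are illustrative.

```python
def parse_pairs(dot_bracket):
    """Return sorted base pairs (i, j) from dot-bracket using (), [] and {}."""
    openers = {"(": ")", "[": "]", "{": "}"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks = {open_: [] for open_ in openers}
    pairs = []
    for j, ch in enumerate(dot_bracket):
        if ch in openers:
            stacks[ch].append(j)
        elif ch in closers:
            # Raises IndexError on unbalanced input, acceptable for a sketch.
            i = stacks[closers[ch]].pop()
            pairs.append((i, j))
    return sorted(pairs)


def crossing_pairs(pairs):
    """Return every pair of base pairs (i, j), (k, l) with i < k < j < l."""
    crossings = []
    for a, (i, j) in enumerate(pairs):
        for k, l in pairs[a + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


nested = parse_pairs("((..))")
knotted = parse_pairs("((..[[..))..]]")
print(crossing_pairs(nested))        # no crossings in a nested hairpin
print(len(crossing_pairs(knotted)))  # each () pair crosses each [] pair
```

Classical dynamic-programming folders assume nested pairs only, so any structure for which `crossing_pairs` is non-empty lies outside what they can represent.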

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • More robust predictions for unseen sequences could speed the design of RNA therapeutics that target previously uncharacterized transcripts.
  • If data limitations prove central, the foundation-model strategy may transfer to other biomolecular structure problems that also lack abundant labeled examples.
  • Testing hybrid approaches that embed biophysical constraints inside foundation models could distinguish whether current failures arise mainly from data volume or from missing physical principles.

Load-bearing premise

The documented failures of existing models on new RNA families primarily reflect overfitting or data scarcity rather than limitations in the underlying biophysical principles or experimental structure data quality used for training.

What would settle it

An experiment that augments training data with structures from many previously unseen RNA families, applies homology-aware splits, and still finds low accuracy on a fresh held-out family would indicate that data scarcity is not the dominant cause of poor generalization.

Figures

Figures reproduced from arXiv: 2511.02622 by Giovanni Bussi, Giuseppe Sacco, Guido Sanguinetti.

Figure 1: Schematic representation of thermodynamics-based RNA sec [...]
Figure 2: Schematic representation of deep learning methods for RNA [...]
Figure 3: Schematic representation of backbone training (above) and task [...]
Original abstract

Predicting the secondary structure of RNA is a core challenge in computational biology, essential for understanding molecular function and designing novel therapeutics. The field has evolved from foundational but accuracy-limited thermodynamic approaches to a new data-driven paradigm dominated by machine learning and deep learning. These models learn folding patterns directly from data, leading to significant performance gains. This review surveys the modern landscape of these methods, covering single-sequence, evolutionary-based, and hybrid models that blend machine learning with biophysics. A central theme is the field's "generalization crisis," where powerful models were found to fail on new RNA families, prompting a community-wide shift to stricter, homology-aware benchmarking. In response to the underlying challenge of data scarcity, RNA foundation models have emerged, learning from massive, unlabeled sequence corpora to improve generalization. Finally, we look ahead to the next set of major hurdles, including the accurate prediction of complex motifs like pseudoknots, scaling to kilobase-length transcripts, incorporating the chemical diversity of modified nucleotides, and shifting the prediction target from static structures to the dynamic ensembles that better capture biological function. We also highlight the need for a standardized, prospective benchmarking system to ensure unbiased validation and accelerate progress.
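For readers unfamiliar with the classical baseline, the dynamic-programming structure behind pre-ML folding methods can be sketched with a Nussinov-style base-pair maximization. This is a deliberate simplification: it counts pairs instead of summing nearest-neighbor free energies as thermodynamic tools do, and the `min_loop=3` hairpin constraint and the set of allowed pairs are assumptions of this sketch, not details taken from the review.

```python
def nussinov(seq, min_loop=3):
    """Maximum number of nested canonical or wobble pairs (Nussinov-style DP)."""
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # best[i][j] = max number of pairs within seq[i..j]; unused cells stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            # case 2: j pairs with some k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1] if n else 0


print(nussinov("GGGAAAUCC"))  # 3: a three-pair stem closing a three-base loop
```

The recursion only ever combines non-crossing sub-intervals, which is why methods of this family cannot produce pseudoknots without further extensions.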

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. This manuscript is a literature review surveying the evolution of machine learning methods for RNA secondary structure prediction. It describes the shift from thermodynamic approaches to data-driven single-sequence, evolutionary-based, and hybrid ML models that achieve performance gains by learning directly from data. The central theme is the field's generalization crisis, in which powerful models fail on new RNA families, motivating stricter homology-aware benchmarking and the emergence of RNA foundation models trained on large unlabeled sequence corpora. The review concludes by outlining open challenges including accurate prediction of pseudoknots, scaling to kilobase-length transcripts, incorporation of modified nucleotides, and prediction of dynamic structural ensembles rather than static structures, while calling for standardized prospective benchmarking.

Significance. If the synthesis of published results is representative and accurate, the review is significant for consolidating community trends around generalization failures and the pivot to foundation models and rigorous benchmarking. This provides a useful roadmap for the field by framing data scarcity and validation practices as key bottlenecks, without introducing new empirical claims or derivations. The descriptive nature of the work makes it a potential reference point for researchers entering the area or designing future experiments.

major comments (2)
  1. [Generalization crisis discussion] The central narrative on the generalization crisis (abstract and main discussion sections) asserts that models fail on new RNA families primarily due to overfitting and data scarcity. This framing would be strengthened by citing specific quantitative evidence, such as reported accuracy drops (e.g., F1 or MCC values) on held-out families from the key studies referenced, to distinguish this from alternative explanations like experimental data quality or biophysical limitations not captured in current training sets.
  2. [Future challenges] In the section outlining future challenges, the call for shifting from static structures to dynamic ensembles is presented as a major hurdle. However, the review does not address how existing ML architectures would need to be adapted for ensemble prediction (e.g., via probabilistic outputs or sampling methods), leaving the feasibility of this transition underexplored relative to its stated importance.
minor comments (2)
  1. [Abstract / Introduction] The abstract introduces 'RNA foundation models' without a concise definition or distinction from standard supervised models; this should be clarified early in the introduction for readers unfamiliar with the term.
  2. [Methods survey sections] Ensure that all cited works in the survey of single-sequence and hybrid models include complete references (e.g., DOIs or arXiv identifiers) to facilitate verification and reproducibility of the summarized results.
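The F1 and MCC values requested in major comment 1 are typically computed over predicted versus reference base pairs. The sketch below shows one way to do so for pseudoknot-free dot-bracket strings. It assumes exact pair matching and counts every one of the n(n-1)/2 position pairs as a candidate when deriving true negatives; individual studies may use different conventions, so the numbers are not directly comparable to any cited result.

```python
import math


def pairs_from_dot_bracket(structure):
    """Return the set of base pairs (i, j) in a nested dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def pair_metrics(true_structure, predicted_structure):
    """Base-pair precision, recall, F1 and MCC for two equal-length structures."""
    n = len(true_structure)
    truth = pairs_from_dot_bracket(true_structure)
    pred = pairs_from_dot_bracket(predicted_structure)
    tp = len(truth & pred)
    fp = len(pred - truth)
    fn = len(truth - pred)
    tn = n * (n - 1) // 2 - tp - fp - fn  # assumption: all i<j pairs are candidates

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


metrics = pair_metrics("((((...))))", "(((.....)))")
print(metrics)
```

Reporting these metrics separately for within-family and held-out-family test sets is what would expose the accuracy drop the referee asks the authors to quantify.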

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive review and recommendation for minor revision. The comments help clarify key aspects of our discussion on the generalization crisis and future challenges. We address each major comment below and have prepared revisions accordingly.

Point-by-point responses
  1. Referee: [Generalization crisis discussion] The central narrative on the generalization crisis (abstract and main discussion sections) asserts that models fail on new RNA families primarily due to overfitting and data scarcity. This framing would be strengthened by citing specific quantitative evidence, such as reported accuracy drops (e.g., F1 or MCC values) on held-out families from the key studies referenced, to distinguish this from alternative explanations like experimental data quality or biophysical limitations not captured in current training sets.

    Authors: We agree that explicit quantitative examples would strengthen the narrative. In the revised version, we will add specific reported performance drops (F1 and MCC decreases on held-out families) drawn directly from the key studies already cited in our review, such as those on single-sequence and evolutionary-based models. This addition will help differentiate overfitting and data scarcity from other factors like experimental noise or unmodeled biophysics, while remaining faithful to the published literature.
    revision: yes

  2. Referee: [Future challenges] In the section outlining future challenges, the call for shifting from static structures to dynamic ensembles is presented as a major hurdle. However, the review does not address how existing ML architectures would need to be adapted for ensemble prediction (e.g., via probabilistic outputs or sampling methods), leaving the feasibility of this transition underexplored relative to its stated importance.

    Authors: We acknowledge that a short discussion of architectural adaptations would improve balance. In revision, we will add a concise paragraph noting feasible directions such as probabilistic output layers, variational autoencoders for sampling, or ensemble averaging techniques, drawing on emerging work in related structural biology ML. We will emphasize that these remain exploratory and that the primary goal of the section is to identify the challenge rather than prescribe solutions.
    revision: yes
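As a rough illustration of the "sampling methods" direction raised in the second exchange, an ensemble can be summarized by estimating base-pair probabilities from a set of sampled structures. The sketch assumes some model (hypothetical here) already emits sampled dot-bracket structures; only the aggregation step is shown, and the toy samples are invented for the example.

```python
from collections import Counter


def pairs_from_dot_bracket(structure):
    """Return the set of base pairs (i, j) in a nested dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def pair_probabilities(sampled_structures):
    """Estimate P(i pairs with j) as the fraction of samples containing (i, j)."""
    if not sampled_structures:
        raise ValueError("need at least one sampled structure")
    counts = Counter()
    for structure in sampled_structures:
        counts.update(pairs_from_dot_bracket(structure))
    total = len(sampled_structures)
    return {pair: count / total for pair, count in counts.items()}


# Toy ensemble: two copies of a full hairpin, one partial hairpin, one unfolded.
samples = ["((...))", "((...))", ".(...).", "......."]
probs = pair_probabilities(samples)
print(probs[(1, 5)], probs[(0, 6)])  # 0.75 0.5
```

A probability map of this kind keeps information about alternative conformations that a single predicted structure discards.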

Circularity Check

0 steps flagged

No significant circularity in literature review

Full rationale

This paper is a literature review that surveys existing machine learning methods for RNA secondary structure prediction, summarizes published results on generalization failures, and discusses community trends toward stricter benchmarking and foundation models. It introduces no new quantitative models, equations, fitted parameters, or derivations that could reduce to self-referential inputs. All central claims are presented as descriptive syntheses of external work rather than original empirical assertions requiring internal consistency checks. The discussion of challenges such as pseudoknots and dynamic ensembles is framed as open questions drawn from the broader field.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The review rests on standard domain assumptions in computational biology and machine learning for biomolecules rather than introducing new free parameters or invented entities.

axioms (1)
  • Domain assumption: Machine learning models can learn RNA folding patterns directly from sequence and structure data.
    This underpins the entire data-driven paradigm described in the abstract.



