Multimodal Alignment and Preference Optimization for Zero-Shot Conditional RNA Generation
Pith reviewed 2026-06-30 22:15 UTC · model grok-4.3
The pith
Treating conditional RNA generation as a multi-stage alignment problem and applying multimodal supervised fine-tuning followed by direct preference optimization produces RNA sequences with superior protein binding affinities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the Moirain series of models, optimized via multimodal SFT and DPO, consistently produces novel, diverse, and biologically plausible RNA sequences with superior binding affinities compared to existing baselines in zero-shot conditional settings.
What carries the argument
Moirain models that employ a multimodal SFT architecture conditioning RNA synthesis on protein features, followed by DPO refinement using synthetic interaction data to improve functional fitness without collapsing the learned natural distribution.
If this is right
- The generated sequences maintain natural distributions while gaining improved functional fitness for protein interactions.
- Target-specific RNA synthesis is enabled through conditioning on protein structural and sequential features.
- The frequency of successful interactions increases in the generated sequences for functional applications.
- Metrics show gains in novelty, diversity, and biological plausibility over prior baselines.
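The novelty and diversity metrics named above are not defined anywhere in this review, so the following is only a minimal sketch of one common way to compute them. The function names, the exact-match definition of novelty, and the length-normalized mismatch distance are all assumptions made for illustration, not the paper's metrics.

```python
from itertools import combinations


def novelty(generated, reference):
    """Fraction of generated sequences absent (as exact strings) from the reference set."""
    ref = set(reference)
    return sum(seq not in ref for seq in generated) / len(generated)


def mismatch_fraction(a, b):
    """Position-wise mismatches plus the length difference, divided by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    mismatches = sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
    return mismatches / longest


def diversity(generated):
    """Mean pairwise mismatch fraction among the generated sequences."""
    pairs = list(combinations(generated, 2))
    if not pairs:
        return 0.0
    return sum(mismatch_fraction(a, b) for a, b in pairs) / len(pairs)
```

An alignment-based distance would be more faithful for sequences of different lengths; the unaligned comparison is used here only to keep the sketch short.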
Where Pith is reading between the lines
- The same multi-stage alignment approach could be tested on design tasks for other molecules such as DNA or peptides.
- If synthetic preference data proves reliable, it may lessen dependence on scarce experimental binding measurements across biomolecular tasks.
- Coupling the conditioning step with improved protein structure predictors might yield even more accurate target-specific outputs.
Load-bearing premise
The synthetic interaction data used to train with DPO accurately reflects real-world protein-RNA binding preferences.
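To make the premise concrete, here is a hedged sketch of how synthetic preference pairs could be derived from an in silico scorer. The paper's actual scorer, sampling procedure, and filtering criteria are not described in this review; `toy_scorer`, `build_preference_pairs`, and the `margin` threshold are hypothetical names introduced only for illustration.

```python
def toy_scorer(protein, rna):
    """Hypothetical stand-in for a binding predictor: GC fraction of the RNA.

    A real pipeline would call a trained protein-RNA interaction model here.
    """
    return sum(base in "GC" for base in rna) / len(rna)


def build_preference_pairs(protein, candidates, scorer, margin=0.1):
    """Pair each candidate with every lower-scoring one whose gap is at least `margin`."""
    scored = sorted(((scorer(protein, rna), rna) for rna in candidates), reverse=True)
    pairs = []
    for i, (high_score, chosen) in enumerate(scored):
        for low_score, rejected in scored[i + 1:]:
            if high_score - low_score >= margin:
                pairs.append({"prompt": protein, "chosen": chosen, "rejected": rejected})
    return pairs
```

Every pair produced this way inherits the scorer's errors, which is why the premise is load-bearing: if the scorer ranks two sequences wrongly, DPO is trained to prefer the worse binder.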
What would settle it
Laboratory experiments that measure binding affinities of the generated RNA sequences to their target proteins and compare results against sequences from baseline methods.
Original abstract
The design of RNA molecules that interact with specific proteins is a critical challenge in experimental and computational biology. Despite recent progress in natural language modeling and deep learning-based protein design, there remains significant room to improve the frequency of successful interactions and the authenticity of generated sequences for functional applications. In this work, we frame conditional RNA sequence generation as a multi-stage alignment problem, introducing Moirain: a suite of models optimized via multimodal supervised fine-tuning (SFT) and Direct Preference Optimization (DPO). Our approach begins with large-scale pretraining on diverse RNA corpora to capture the fundamental grammars of sequence plausibility. To achieve target-specific generation, we employ a multimodal SFT architecture that conditions RNA synthesis on protein structural and sequential features. Finally, we leverage DPO to refine the model using synthetic interaction data: taking advantage of DPO's unique ability to navigate non-aligned preference spaces, we improve functional fitness without collapsing the learned natural distribution. Extensive evaluation of the Moirain series (Moirain-Base, -Multi, and -DPO) demonstrates that our framework consistently produces novel, diverse, and biologically plausible RNA sequences with superior binding affinities compared to existing baselines.
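For readers unfamiliar with the DPO stage mentioned in the abstract, the sketch below computes the standard per-pair DPO loss from Rafailov et al. (2023) given summed sequence log-probabilities. It assumes the paper uses the unmodified objective, which this review does not confirm; the function name, argument names, and the default `beta` are illustrative choices.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    The margin is how much more the policy favors the chosen sequence over the
    rejected one, relative to the frozen reference model.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    z = beta * margin
    # -log(sigmoid(z)) equals log(1 + exp(-z)); branch to avoid overflow for large |z|.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy equals the reference model the margin is zero and the loss is log 2. The log-ratio against the reference model penalizes drift away from it, which is the mechanism behind the abstract's claim of not collapsing the learned distribution.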
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Moirain framework (Moirain-Base, -Multi, -DPO) for zero-shot conditional RNA generation. It pretrains on large RNA corpora, applies multimodal supervised fine-tuning (SFT) to condition generation on protein structural and sequential features, and uses Direct Preference Optimization (DPO) on synthetic interaction data to refine functional fitness without collapsing the natural sequence distribution. The central claim is that the resulting models produce novel, diverse, biologically plausible RNA sequences with superior binding affinities relative to existing baselines.
Significance. If the central claims hold after validation, the work would advance computational RNA design by demonstrating a practical route to target-conditioned generation that leverages preference optimization rather than requiring extensive paired experimental data. The separation of pretraining, multimodal SFT, and DPO stages is a coherent technical choice. Credit is given for attempting to preserve distributional properties while optimizing for an interaction objective. However, the significance is currently limited by the absence of grounding for the synthetic preference signal.
Major comments (2)
- [DPO subsection of Methods] DPO training description (Methods): No details are provided on how the synthetic interaction data and preference pairs are generated, nor is any correlation reported between the in silico scorer and experimental binding measurements (SPR, ITC, or pull-down assays). Because the performance lift is attributed to the DPO stage and the multimodal SFT only supplies conditioning, this missing validation directly undermines the claim of improved true functional fitness in zero-shot settings.
- [Results / Evaluation] Evaluation section: The claim of 'superior binding affinities' and 'extensive evaluation' is stated without reporting the specific affinity metric or predictor used, the identity and number of baselines, the test protein-RNA pairs, or any statistical significance tests. This absence makes it impossible to determine whether the reported gains are robust or merely artifacts of the unvalidated proxy.
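As an illustration of the kind of significance reporting the second comment asks for, the sketch below runs a paired sign-flip permutation test on per-target scores and applies a Holm correction across several baselines. This is not the authors' analysis; the function names, the choice of test, and the resample count are assumptions.

```python
import random


def paired_permutation_pvalue(model_scores, baseline_scores, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on the mean paired difference."""
    diffs = [m - b for m, b in zip(model_scores, baseline_scores)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis each paired difference is equally likely to have either sign.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if abs(flipped) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted
```

Pairing by target protein matters here: each model should be scored on the same test proteins so that differences between targets do not mask or inflate the difference between methods.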
Minor comments (1)
- [Abstract] Abstract: Lacks concrete information on evaluation metrics, dataset sizes, or baseline methods, reducing its utility as a standalone summary.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. Below we respond point-by-point to the major comments. Where details were omitted from the initial submission we will incorporate them; where experimental validation is absent we state this limitation directly.
- Referee: [DPO subsection of Methods] DPO training description (Methods): No details are provided on how the synthetic interaction data and preference pairs are generated, nor is any correlation reported between the in silico scorer and experimental binding measurements (SPR, ITC, or pull-down assays). Because the performance lift is attributed to the DPO stage and the multimodal SFT only supplies conditioning, this missing validation directly undermines the claim of improved true functional fitness in zero-shot settings.
  Authors: We agree that the DPO subsection lacks sufficient methodological detail. In the revised manuscript we will add a full description of how the synthetic interaction data and preference pairs were constructed, including the precise in silico scorer, sampling procedure, and filtering criteria. Because the study is framed around synthetic proxies to enable zero-shot generation without paired experimental data, we do not possess direct correlations with SPR, ITC, or pull-down assays; we will add an explicit limitations paragraph discussing the proxy's grounding in existing literature on in silico RNA-protein predictors.
  Revision: partial
- Referee: [Results / Evaluation] Evaluation section: The claim of 'superior binding affinities' and 'extensive evaluation' is stated without reporting the specific affinity metric or predictor used, the identity and number of baselines, the test protein-RNA pairs, or any statistical significance tests. This absence makes it impossible to determine whether the reported gains are robust or merely artifacts of the unvalidated proxy.
  Authors: We acknowledge that the Evaluation section omitted key reporting elements. The revised manuscript will specify the exact affinity metric and underlying predictor, enumerate all baselines with their identities and counts, list the test protein-RNA pairs, and report the statistical significance tests (including p-values and correction method). These quantities were computed during our experiments and will be added for full reproducibility.
  Revision: yes
- Unresolved after the rebuttal: direct experimental correlation between the in silico preference signal and wet-lab binding measurements (SPR, ITC, or pull-down), as no such assays were performed.
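The missing validation could, in principle, be reported as a rank correlation between predicted scores and measured affinities. The sketch below computes a Spearman coefficient in plain Python. No such measurements exist for this paper, so any numbers passed to it would be placeholders, and the function names are illustrative.

```python
def ranks(values):
    """1-based ranks in input order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted, measured):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(predicted), ranks(measured))
```

If the measured quantity is a dissociation constant, where lower means tighter binding, a trustworthy scorer whose higher values mean stronger binding should give a strongly negative coefficient.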
Circularity Check
No circularity; empirical framework with no derivations
The paper describes an empirical ML pipeline (pretraining on RNA corpora, multimodal SFT conditioning on protein features, then DPO on synthetic interaction data) and reports evaluation results on generated sequences. No equations, first-principles derivations, fitted parameters renamed as predictions, or self-citation chains appear in the abstract or described text. Central claims rest on external benchmark comparisons rather than self-referential reductions. Per rules, this is scored 0 as a self-contained empirical study without the enumerated circular patterns.