
arxiv: 2403.03234 · v2 · pith:TLEI3YIZnew · submitted 2024-03-05 · 🧬 q-bio.GN · cs.LG

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

classification 🧬 q-bio.GN cs.LG
keywords long-range, caduceus, models, modeling, bi-directional, bi-directionality, block, challenges

Large-scale sequence modeling has sparked rapid advances that now extend into biology and genomics. However, modeling genomic sequences introduces challenges such as the need to model long-range token interactions, the effects of upstream and downstream regions of the genome, and the reverse complementarity (RC) of DNA. Here, we propose an architecture motivated by these challenges that builds off the long-range Mamba block, and extends it to a BiMamba component that supports bi-directionality, and to a MambaDNA block that additionally supports RC equivariance. We use MambaDNA as the basis of Caduceus, the first family of RC equivariant bi-directional long-range DNA language models, and we introduce pre-training and fine-tuning strategies that yield Caduceus DNA foundation models. Caduceus outperforms previous long-range models on downstream benchmarks; on a challenging long-range variant effect prediction task, Caduceus exceeds the performance of 10x larger models that do not leverage bi-directionality or equivariance.
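To make the reverse-complement (RC) equivariance property mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: `causal_scan` is a hypothetical toy stand-in for a left-to-right sequence operator such as a Mamba block, and `rc_equivariant` assumes one common way to obtain equivariance, namely splitting the channels in half and applying the same operator to one half and to the RC of the other half. All function names are made up for this example.

```python
# Toy illustration of reverse-complement (RC) equivariance.
# All names here are hypothetical; this is not the Caduceus code.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse the sequence and swap each base with its complement."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def one_hot(seq):
    """Encode as a (length, 4) list of lists with channel order A, C, G, T.

    With this order, reversing the channel axis swaps A<->T and C<->G.
    """
    return [[1.0 if base == c else 0.0 for c in "ACGT"] for base in seq]


def rc(x):
    """RC on a (length, channels) array: flip positions and flip channels."""
    return [row[::-1] for row in x[::-1]]


def causal_scan(x, decay=0.5):
    """Toy left-to-right operator (assumption: stands in for a Mamba block).

    Each output row is an exponentially decayed running sum of past rows.
    """
    state = [0.0] * len(x[0])
    out = []
    for row in x:
        state = [decay * s + v for s, v in zip(state, row)]
        out.append(list(state))
    return out


def rc_equivariant(x):
    """Wrap causal_scan so that the result commutes with rc().

    Assumed scheme: split channels in half, apply the shared operator to the
    first half directly and to the second half in RC orientation, then
    concatenate the two halves again.
    """
    half = len(x[0]) // 2
    first = [row[:half] for row in x]
    second = [row[half:] for row in x]
    out_first = causal_scan(first)
    out_second = rc(causal_scan(rc(second)))
    return [a + b for a, b in zip(out_first, out_second)]


x = one_hot("AACG")
plain_is_equivariant = causal_scan(rc(x)) == rc(causal_scan(x))
wrapped_is_equivariant = rc_equivariant(rc(x)) == rc(rc_equivariant(x))
```

In this sketch the plain scan gives different results depending on which strand it reads, while the wrapped version produces outputs that are exact reverse complements of each other for the two strands. That is the sense of "equivariant" used here: applying RC before the layer is the same as applying RC after it.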



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Set Diffusion: Interpolating Token Orderings Between Autoregression and Diffusion for Fast and Flexible Decoding

    cs.LG 2026-07 unverdicted novelty 7.0

    Set diffusion factorizes likelihood over arbitrary token sets and uses a set-causal diffusion architecture to support KV caching and any-order decoding, yielding improved speed-quality tradeoffs versus prior diffusion LMs.

  2. A Survey of Mamba

    cs.LG 2024-08 unverdicted novelty 2.0

    The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.