A Survey of Mamba

Haohao Qu; Hui Liu; Liangbo Ning; Qing Li; Rui An; Tyler Derr; Wenqi Fan; Xin Xu

arXiv: 2408.01129 · v8 · submitted 2024-08-02 · cs.LG · cs.AI

Haohao Qu, Liangbo Ning, Rui An, Wenqi Fan, Tyler Derr, Hui Liu, Xin Xu, Qing Li

Pith reviewed 2026-05-23 22:06 UTC · model grok-4.3

classification cs.LG · cs.AI

keywords Mamba · State Space Models · Transformers · Foundation Models · Sequence Modeling · Deep Learning · Scalability · Attention Mechanisms

The pith

Mamba delivers modeling ability comparable to Transformers with near-linear scaling in sequence length

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper surveys the rapid development of Mamba-based models as an alternative to Transformers in deep learning. It reviews the basics of Mamba-1 and Mamba-2, then examines architecture designs, methods for adapting Mamba to different data types, and applications in various domains. The survey also identifies limitations and outlines future research directions. A reader would care because the survey organizes the growing literature on an architecture that promises to avoid the quadratic complexity of attention. This consolidation helps readers understand Mamba's potential for building more efficient foundation models.

Core claim

Mamba, drawing inspiration from classical state space models, has emerged as a promising alternative for building foundation models, delivering modeling abilities comparable to Transformers while preserving near-linear scalability with respect to sequence length. As evidence, the authors point to the increasing number of studies reporting impressive performance across diverse domains.

What carries the argument

Mamba's selective state space model (SSM) mechanism, which enables efficient sequence processing with complexity linear in sequence length.
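To make the mechanism concrete, here is a minimal sketch of a selective SSM recurrence in plain Python. It is an illustration under stated assumptions, not the survey's or the Mamba authors' implementation: the function name `selective_scan` and the weights `w_delta`, `w_b`, and `w_c` are hypothetical, the model is reduced to a single input channel with a diagonal state matrix, and the loop is sequential rather than the hardware-aware parallel scan used in practice. What it does show is the core idea: the step size and the input/output maps depend on the current input, the hidden state has a fixed size, and the whole sequence is processed in one pass.

```python
import math

def selective_scan(x, a, w_delta, w_b, w_c):
    """Run a single-channel selective SSM over the sequence x.

    a        : list of N negative floats, the diagonal of the state matrix A
    w_delta  : scalar weight mapping the input to a step size (hypothetical)
    w_b, w_c : lists of N weights mapping the input to B_t and C_t (hypothetical)
    """
    n = len(a)
    h = [0.0] * n  # hidden state: fixed size, independent of sequence length
    ys = []
    for x_t in x:  # one pass over the sequence, O(L * N) work in total
        # Input-dependent ("selective") parameters.
        delta = math.log1p(math.exp(w_delta * x_t))  # softplus keeps the step positive
        b_t = [w * x_t for w in w_b]
        c_t = [w * x_t for w in w_c]
        # Discretize and update: A_bar = exp(delta * A), B_bar = delta * B_t
        # (the simplified form of the discretized input matrix).
        for i in range(n):
            a_bar = math.exp(delta * a[i])
            h[i] = a_bar * h[i] + delta * b_t[i] * x_t
        # Read out through the input-dependent output map C_t.
        ys.append(sum(c_t[i] * h[i] for i in range(n)))
    return ys

# Example call with small, made-up weights.
outputs = selective_scan([1.0, 0.5, -0.25], a=[-1.0, -0.5],
                         w_delta=0.3, w_b=[1.0, 0.5], w_c=[0.2, -0.1])
print(len(outputs))
```

Because the state `h` is overwritten at every step, memory use during this loop does not grow with the sequence length, which is the property the efficiency claims rest on.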

If this is right

Mamba models can run inference on long sequences more efficiently than Transformers.
Adaptation techniques allow Mamba to perform well on non-text data such as images and audio.
Applications in multiple domains demonstrate Mamba's versatility beyond language modeling.
The identified limitations suggest specific areas for architectural improvements in future Mamba variants.
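The efficiency point in the list above comes down to how the amount of work grows with sequence length. The sketch below compares two simple operation counts: the pairwise scores of full self-attention, which grow as L × L, and the state updates of a recurrent scan, which grow as L × N for a fixed state size N. The function names and the state size of 16 are illustrative assumptions, and the counts ignore constant factors, feature dimensions, and hardware effects, so this is an order-of-growth illustration rather than a benchmark.

```python
def attention_score_count(seq_len):
    """Pairwise query-key scores computed by full self-attention: L * L."""
    return seq_len * seq_len

def scan_update_count(seq_len, state_dim):
    """State-element updates performed by a recurrent scan: L * N, with N fixed."""
    return seq_len * state_dim

# state_dim=16 is an illustrative choice, not a value taken from the survey.
for seq_len in (1_000, 10_000, 100_000):
    ratio = attention_score_count(seq_len) / scan_update_count(seq_len, state_dim=16)
    print(f"L={seq_len:>7}: attention/scan count ratio = {ratio:,.1f}")
```

Doubling the sequence length quadruples the first count and only doubles the second, so the gap widens as contexts get longer.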

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Exploring combinations of Mamba with other architectures could yield hybrid models with enhanced capabilities.
The survey's overview may inspire theoretical analyses of why state space models perform well in practice.
Future surveys could track Mamba's progress beyond August 2024 to update the understanding of its potential.
Developers might prioritize Mamba for resource-constrained environments handling long contexts.

Load-bearing premise

The body of Mamba-related papers published by August 2024 is already large and representative enough for a systematic consolidation to provide a comprehensive understanding of the architecture's potential.

What would settle it

Large-scale experiments showing that Mamba fails to match Transformer performance on standard benchmarks, or scales worse with sequence length, would undermine the survey's central narrative.

Figures

Figures reproduced from arXiv: 2408.01129 by Haohao Qu, Hui Liu, Liangbo Ning, Qing Li, Rui An, Tyler Derr, Wenqi Fan, Xin Xu.

**Figure 2.** An illustration of representative model architectures, namely Recurrent Neural Network (RNN), Transformer, and State Space ...

**Figure 3.** Overview of the Selective State Space Model with hardware-aware state expansions. The selective mechanism introduces ...

**Figure 4.** The block architectures of Mamba-1 and Mamba-2.

**Figure 5.** Representative examples of improved Mamba models based on the perspective of block design: (a) Integration methods ...

**Figure 6.** Recently developed scanning methods in Mamba-based models: Flatten Scans (a-c) involve flattening the model input into ...

**Figure 7.** Representative strategies exist for adapting Mamba to diverse types of data. (a-e) The Mamba architecture, imbued with ...

Original abstract

As one of the most representative DL techniques, Transformer architecture has empowered numerous advanced models, especially the large language models (LLMs) that comprise billions of parameters, becoming a cornerstone in deep learning. Despite the impressive achievements, Transformers still face inherent limitations, particularly the time-consuming inference resulting from the quadratic computation complexity of attention calculation. Recently, a novel architecture named Mamba, drawing inspiration from classical state space models (SSMs), has emerged as a promising alternative for building foundation models, delivering comparable modeling abilities to Transformers while preserving near-linear scalability concerning sequence length. This has sparked an increasing number of studies actively exploring Mamba's potential to achieve impressive performance across diverse domains. Given such rapid evolution, there is a critical need for a systematic review that consolidates existing Mamba-empowered models, offering a comprehensive understanding of this emerging model architecture. In this survey, we therefore conduct an in-depth investigation of recent Mamba-associated studies, covering three main aspects: the advancements of Mamba-based models, the techniques of adapting Mamba to diverse data, and the applications where Mamba can excel. Specifically, we first review the foundational knowledge of various representative deep learning models and the details of Mamba-1&2 as preliminaries. Then, to showcase the significance of Mamba for AI, we comprehensively review the related studies focusing on Mamba models' architecture design, data adaptability, and applications. Finally, we present a discussion of current limitations and explore various promising research directions to provide deeper insights for future investigations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

A competent but ordinary survey that maps Mamba work up to mid-2024 without adding analysis or new synthesis.

This survey pulls together existing papers on Mamba but introduces no fresh technical result or deeper insight of its own. The structure follows the abstract: first the background on SSMs and Mamba-1/2, then sections on architecture variants, data adaptations, and applications, ending with limitations and open directions. That layout is straightforward and matches what a reader would expect from a consolidation piece. If the citation list is reasonably broad, it can function as a time-saver for someone who wants an overview rather than hunting through arXiv individually. The authors correctly note that Mamba's linear scaling comes from the cited prior work, not from new measurements here. The main soft spot is the usual one for surveys: completeness and selection criteria are not spelled out in the abstract, so it is hard to know whether key papers in vision, audio, or long-context settings were omitted or treated too lightly. Depth on any single application also looks limited by design. The paper is aimed at newcomers or researchers who need a quick map of where Mamba has been tried so far. It will not shift anyone's core research questions. A serious editor should still send it to referees because the topic is active and a usable reference document has practical value even when the novelty bar is low.

Referee Report

2 major / 2 minor

Summary. The manuscript is a literature survey on the Mamba architecture (inspired by state-space models) as an alternative to Transformers. It first reviews preliminaries on representative deep learning models and the details of Mamba-1 and Mamba-2, then surveys Mamba-based model architectures, techniques for adapting Mamba to diverse data modalities, and applications across domains, before discussing current limitations and promising research directions.

Significance. If the coverage is representative, the survey provides a timely consolidation of the rapidly growing Mamba literature (post-2023), which could help researchers identify patterns in architecture variants, data adaptations, and application successes. The explicit three-part structure (preliminaries, models/data/applications, limitations) and grounding in prior empirical claims about near-linear scaling are strengths for a survey in this fast-moving area.

major comments (2)

[Abstract, §1] Abstract and §1: the claim that the survey conducts a 'systematic review' and 'in-depth investigation' is not supported by any description of search strategy, inclusion/exclusion criteria, or database sources; without these the representativeness of the consolidated studies cannot be assessed.
[§3 (architecture/data/applications review)] The weakest assumption noted in the reader report (that the August 2024 corpus is already large and representative) is not addressed; the survey should include a quantitative summary (e.g., number of papers per category, publication timeline) to substantiate that the body of work merits consolidation.

minor comments (2)

[Preliminaries section] Notation for Mamba-1 vs. Mamba-2 parameters and selective SSM equations should be introduced once in the preliminaries and used consistently thereafter to avoid reader confusion when comparing variants.
[Tables/figures in §3] Figure captions and table headers listing surveyed models should include publication year and venue for quick reference; several entries currently omit these.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive recommendation for minor revision. We address the two major comments below and will update the manuscript accordingly to improve transparency and substantiation of the survey's scope.

Referee: [Abstract, §1] Abstract and §1: the claim that the survey conducts a 'systematic review' and 'in-depth investigation' is not supported by any description of search strategy, inclusion/exclusion criteria, or database sources; without these the representativeness of the consolidated studies cannot be assessed.

Authors: We agree that the abstract and §1 would be strengthened by greater methodological transparency. The survey was compiled via ongoing literature tracking on arXiv and related venues up to the August 2024 cutoff, but no formal search protocol was described. In revision we will add a short 'Literature Search Methodology' paragraph (or subsection) in §1 that states the primary sources (arXiv, Google Scholar), core keywords (Mamba, state-space model, selective SSM, etc.), and high-level inclusion criteria (peer-reviewed or preprint works proposing Mamba variants or applications). If space constraints arise we will also soften the phrasing from 'systematic review' to 'comprehensive survey' while retaining the claim of in-depth coverage.

revision: yes

Referee: [§3 (architecture/data/applications review)] The weakest assumption noted in the reader report (that the August 2024 corpus is already large and representative) is not addressed; the survey should include a quantitative summary (e.g., number of papers per category, publication timeline) to substantiate that the body of work merits consolidation.

Authors: We concur that a quantitative overview would better justify the decision to consolidate the literature. The current text notes rapid growth qualitatively but provides no counts or timeline. In the revised manuscript we will insert a new table (or figure) early in §3 that reports: (i) total papers reviewed, (ii) breakdown by the three main categories (architecture variants, modality adaptations, domain applications), and (iii) a simple publication-year histogram or cumulative count showing the post-2023 surge. This addition will directly address the representativeness concern while remaining concise.

revision: yes
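
A quantitative summary of the kind described in this response could be produced with a few lines of code. The sketch below is only an illustration: the records are hypothetical placeholders and are not entries from the survey's bibliography, and the field names and the `summarize` helper are assumptions made for the example. It reports the three quantities listed above: the total count, the per-category breakdown, and the per-year counts.

```python
from collections import Counter

# Hypothetical placeholder records, NOT drawn from the survey's reference list.
papers = [
    {"id": "placeholder-1", "category": "architecture", "year": 2023},
    {"id": "placeholder-2", "category": "modality adaptation", "year": 2024},
    {"id": "placeholder-3", "category": "application", "year": 2024},
    {"id": "placeholder-4", "category": "application", "year": 2024},
]

def summarize(records):
    """Return total count, per-category counts, and per-year counts (years sorted)."""
    by_category = Counter(r["category"] for r in records)
    by_year = Counter(r["year"] for r in records)
    return {
        "total": len(records),
        "by_category": dict(by_category),
        "by_year": dict(sorted(by_year.items())),
    }

summary = summarize(papers)
print(summary["total"])
print(summary["by_category"])
print(summary["by_year"])
```

With real records, the per-year counts could feed directly into the histogram or cumulative plot the response mentions.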

Circularity Check

0 steps flagged

No significant circularity: literature survey with no derivations

This manuscript is a survey paper that consolidates existing literature on Mamba models without presenting any original derivations, predictions, fitted parameters, or modeling inferences. Its claims about Mamba's capabilities are explicitly attributed to prior publications as background rather than derived internally. No equations, self-citations, or ansatzes function as load-bearing steps that reduce to the paper's own inputs by construction. The structure is self-contained as a review, with no circularity patterns applicable.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper. It contains no original mathematical derivations, fitted parameters, or postulated entities; the content is a synthesis of prior published work on Mamba.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

mKG-RAG: Leveraging Multimodal Knowledge Graphs in Retrieval-Augmented Generation for Knowledge-intensive VQA
cs.CV · 2025-08 · unverdicted · novelty 7.0

mKG-RAG constructs multimodal KGs via MLLM-driven extraction and vision-text matching then applies dual-stage query-aware retrieval to achieve new state-of-the-art results on knowledge-based VQA.

DeMa: Dual-Path Delay-Aware Mamba for Efficient Multivariate Time Series Analysis
cs.LG · 2026-01 · unverdicted · novelty 6.0

DeMa is a dual-path delay-aware Mamba architecture that decomposes MTS into intra-series temporal and inter-series variate paths to achieve SOTA performance with linear complexity on forecasting, imputation, anomaly d...

Predicting one-year clinical instability and mortality in heart failure patients using sequence modeling
cs.LG · 2025-11 · unverdicted · novelty 4.0

Sequence models on EHR data from a Swedish heart failure cohort achieve AUPRCs of 0.555 to 0.854 for one-year instability and mortality predictions and support four care pathways.

When control meets large language models: From words to dynamics
eess.SY · 2026-02 · unverdicted · novelty 3.0

The paper proposes a bidirectional continuum between LLMs and control systems, covering LLM-assisted controller design, control-based LLM steering, and state-space modeling of LLMs.

Advancing Intelligent Sequence Modeling: Evolution, Trade-offs, and Applications of State-Space Architectures from S4 to Mamba
cs.LG · 2025-03 · unverdicted

A survey tracing the evolution of state-space models like S4 and Mamba, their efficiency trade-offs, and applications in NLP, vision, and other domains.

Reference graph

Works this paper leans on

247 extracted references · 247 canonical work pages · cited by 5 Pith papers · 10 internal anchors

[1]

Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn, and Dong Yu. 2014. Convolutional neural networks for speech recognition. IEEE/ACM Transactions on audio, speech, and language processing 22, 10 (2014), 1533–1545

work page 2014
[2]

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. 2023. Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[3]

Md Atik Ahamed and Qiang Cheng. 2024. Timemachine: A time series is worth 4 mambas for long-term forecasting.arXiv preprint arXiv:2403.09898 (2024)

work page arXiv 2024
[4]

Md Atik Ahamed and Qiang Cheng. 2024. TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification. arXiv preprint arXiv:2406.04419 (2024)

work page arXiv 2024
[5]

Quentin Anthony, Yury Tokpanov, Paolo Glorioso, and Beren Millidge. 2024. BlackMamba: Mixture of Experts for State-Space Models. arXiv preprint arXiv:2402.01771 (2024)

work page arXiv 2024
[6]

Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lučić, and Cordelia Schmid. 2021. Vivit: A video vision transformer. In Proceedings of the IEEE/CVF international conference on computer vision . 6836–6846

work page 2021
[7]

Zhongxin Bai and Xiao-Lei Zhang. 2021. Speaker recognition based on deep learning: An overview. Neural Networks 140 (2021), 65–99

work page 2021
[8]

Malyaban Bal and Abhronil Sengupta. 2024. Rethinking Spiking Neural Networks as State Space Models. arXiv preprint arXiv:2406.02923 (2024)

work page arXiv 2024
[9]

Ali Behrouz and Farnoosh Hashemi. 2024. Graph Mamba: Towards Learning on Graphs with State Space Models. arXiv preprint arXiv:2402.08678 (2024)

work page arXiv 2024
[10]

Ali Behrouz, Michele Santacatterina, and Ramin Zabih. 2024. Mambamixer: Efficient selective state space models with dual token and channel selection. arXiv preprint arXiv:2403.19888 (2024)

work page arXiv 2024
[11]

Saurabhchand Bhati, Yuan Gong, Leonid Karlinsky, Hilde Kuehne, Rogerio Feris, and James Glass. 2024. DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners. arXiv preprint arXiv:2407.04082 (2024)

work page arXiv 2024
[12]

Raunaq Bhirangi, Chenyu Wang, Venkatesh Pattabiraman, Carmel Majidi, Abhinav Gupta, Tess Hellebrekers, and Lerrel Pinto. 2024. Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling. arXiv preprint arXiv:2402.10211 (2024)

work page arXiv 2024
[13]

Rishi Bommasani, Drew A Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, et al. 2021. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021)

work page internal anchor Pith review Pith/arXiv arXiv 2021
[14]

Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy F Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, and Patrick Gallinari. 2024. LOCOST: State-Space Models for Long Document Abstractive Summarization. arXiv preprint arXiv:2401.17919 (2024)

work page arXiv 2024
[15]

Jiahang Cao, Qiang Zhang, Ziqing Wang, Jiaxu Wang, Hao Cheng, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, and Renjing Xu. 2024. Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning. arXiv preprint arXiv:2406.02013 (2024)

work page arXiv 2024
[16]

Yang Cao and Wei Zhang. 2024. Mamba4KT: An Efficient and Effective Mamba-based Knowledge Tracing Model. arXiv preprint arXiv:2405.16542 (2024)

work page arXiv 2024
[17]

Rong Chao, Wen-Huang Cheng, Moreno La Quatra, Sabato Marco Siniscalchi, Chao-Han Huck Yang, Szu-Wei Fu, and Yu Tsao. 2024. An Investigation of Incorporating Mamba for Speech Enhancement. arXiv preprint arXiv:2405.06573 (2024)

work page arXiv 2024
[18]

Soumyabrata Chaudhuri and Saumik Bhattacharya. 2024. Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos. arXiv preprint arXiv:2404.07645 (2024)

work page arXiv 2024
[19]

Chi-Sheng Chen, Guan-Ying Chen, Dong Zhou, Di Jiang, and Dai-Shi Chen. 2024. Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning. arXiv preprint arXiv:2402.15761 (2024)

work page arXiv 2024
[20]

Deli Chen, Yankai Lin, Wei Li, Peng Li, Jie Zhou, and Xu Sun. 2020. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. In Proceedings of the AAAI conference on artificial intelligence , Vol. 34. 3438–3445. Manuscript submitted to ACM 32 Qu et al

work page 2020
[21]

Hongruixuan Chen, Jian Song, Chengxi Han, Junshi Xia, and Naoto Yokoya. 2024. Changemamba: Remote sensing change detection with spatio-temporal state space model. arXiv preprint arXiv:2404.03425 (2024)

work page arXiv 2024
[22]

Jiawei Chen, Hongyu Lin, Xianpei Han, and Le Sun. 2024. Benchmarking large language models in retrieval-augmented generation. In Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 38. 17754–17762

work page 2024
[23]

Keyan Chen, Bowen Chen, Chenyang Liu, Wenyuan Li, Zhengxia Zou, and Zhenwei Shi. 2024. Rsmamba: Remote sensing image classification with state space model. arXiv preprint arXiv:2403.19654 (2024)

work page arXiv 2024
[24]

Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Jieping Ye, and Nenghai Yu. 2024. Mim-istd: Mamba-in-mamba for efficient infrared small target detection. arXiv preprint arXiv:2403.02148 (2024)

work page arXiv 2024
[25]

Xiao Chen, Wenqi Fan, Jingfan Chen, Haochen Liu, Zitao Liu, Zhaoxiang Zhang, and Qing Li. 2023. Fairly adaptive negative sampling for recommendations. In Proceedings of the ACM Web Conference 2023 . 3723–3733

work page 2023
[26]

Ying Chen, Jiajing Xie, Yuxiang Lin, Yuhang Song, Wenxian Yang, and Rongshan Yu. 2024. Survmamba: State space model with multi-grained multi-modal interaction for survival prediction. arXiv preprint arXiv:2404.08027 (2024)

work page arXiv 2024
[27]

Yujie Chen, Jiangyan Yi, Jun Xue, Chenglong Wang, Xiaohui Zhang, Shunbo Dong, Siding Zeng, Jianhua Tao, Lv Zhao, and Cunhang Fan. 2024. RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection. arXiv preprint arXiv:2406.06086 (2024)

work page arXiv 2024
[28]

Tri Dao and Albert Gu. 2024. Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality. In International Conference on Machine Learning (ICML)

work page 2024
[29]

Rui Deng and Tianpei Gu. 2024. CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration. arXiv preprint arXiv:2404.11778 (2024)

work page arXiv 2024
[30]

Yujuan Ding, Wenqi Fan, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, and Qing Li. 2024. A survey on rag meets llms: Towards retrieval-augmented large language models. arXiv preprint arXiv:2405.06211 (2024)

work page arXiv 2024
[31]

Rares Dolga, Kai Biegun, Jake Cunningham, and David Barber. 2024. RotRNN: Modelling Long Sequences with Rotations. arXiv preprint arXiv:2407.07239 (2024)

work page arXiv 2024
[32]

Wenhao Dong, Haodong Zhu, Shaohui Lin, Xiaoyan Luo, Yunhang Shen, Xuhui Liu, Juan Zhang, Guodong Guo, and Baochang Zhang. 2024. Fusion-mamba for cross-modality object detection. arXiv preprint arXiv:2404.09146 (2024)

work page arXiv 2024
[33]

Xin Luna Dong, Seungwhan Moon, Yifan Ethan Xu, Kshitiz Malik, and Zhou Yu. 2023. Towards next-generation intelligent assistants leveraging llm techniques. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining . 5792–5793

work page 2023
[34]

Filip Karlo Došilović, Mario Brčić, and Nikica Hlupić. 2018. Explainable artificial intelligence: A survey. In2018 41st International convention on information and communication technology, electronics and microelectronics (MIPRO) . IEEE, 0210–0215

work page 2018
[35]

Haruka Ezoe and Kazuhiro Sato. 2024. Learning method for S4 with Diagonal State Space Layers using Balanced Truncation. arXiv preprint arXiv:2402.15993 (2024)

work page arXiv 2024
[36]

Lili Fan, Junhao Wang, Yuanmeng Chang, Yuke Li, Yutong Wang, and Dongpu Cao. 2024. 4D mmWave radar for autonomous driving perception: a comprehensive survey. IEEE Transactions on Intelligent Vehicles (2024)

work page 2024
[37]

Wenqi Fan, Tyler Derr, Yao Ma, Jianping Wang, Jiliang Tang, and Qing Li. 2019. Deep Adversarial Social Recommendation. In28th International Joint Conference on Artificial Intelligence (IJCAI-19) . International Joint Conferences on Artificial Intelligence, 1351–1357

work page 2019
[38]

Wenqi Fan, Qing Li, and Min Cheng. 2018. Deep modeling of social relations for recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32

work page 2018
[39]

Wenqi Fan, Xiaorui Liu, Wei Jin, Xiangyu Zhao, Jiliang Tang, and Qing Li. 2022. Graph Trend Filtering Networks for Recommendation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval . 112–121

work page 2022
[40]

Wenqi Fan, Yao Ma, Qing Li, Yuan He, Eric Zhao, Jiliang Tang, and Dawei Yin. 2019. Graph neural networks for social recommendation. InThe world wide web conference . 417–426

work page 2019
[41]

Wenqi Fan, Yao Ma, Qing Li, Jianping Wang, Guoyong Cai, Jiliang Tang, and Dawei Yin. 2020. A graph neural network framework for social recommendations. IEEE Transactions on Knowledge and Data Engineering 34, 5 (2020), 2033–2047

work page 2020
[42]

Wenqi Fan, Yao Ma, Dawei Yin, Jianping Wang, Jiliang Tang, and Qing Li. 2019. Deep social collaborative filtering. In Proceedings of the 13th ACM Conference on Recommender Systems . 305–313

work page 2019
[43]

Wenqi Fan, Shijie Wang, Jiani Huang, Zhikai Chen, Yu Song, Wenzhuo Tang, Haitao Mao, Hui Liu, Xiaorui Liu, Dawei Yin, et al. 2024. Graph machine learning in the era of large language models (llms). arXiv preprint arXiv:2404.14928 (2024)

work page arXiv 2024
[44]

Wenqi Fan, Xiangyu Zhao, Qing Li, Tyler Derr, Yao Ma, Hui Liu, Jianping Wang, and Jiliang Tang. 2023. Adversarial Attacks for Black-Box Recommender Systems Via Copying Transferable Cross-Domain User Profiles. IEEE Transactions on Knowledge and Data Engineering (2023)

work page 2023
[45]

William Fedus, Barret Zoph, and Noam Shazeer. 2022. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. Journal of Machine Learning Research 23, 120 (2022), 1–39

work page 2022
[46]

Zhengcong Fei, Mingyuan Fan, Changqian Yu, and Junshi Huang. 2024. Scalable Diffusion Models with State Space Backbone. arXiv preprint arXiv:2402.05608 (2024)

work page arXiv 2024
[47]

Daniel Y Fu, Elliot L Epstein, Eric Nguyen, Armin W Thomas, Michael Zhang, Tri Dao, Atri Rudra, and Christopher Ré. 2023. Simple hardware- efficient long convolutions for sequence modeling. In International Conference on Machine Learning . PMLR, 10373–10391

work page 2023
[48]

Guanyiman Fu, Fengchao Xiong, Jianfeng Lu, and Jun Zhou. 2024. Ssumamba: Spatial-spectral selective state space model for hyperspectral image denoising. IEEE Transactions on Geoscience and Remote Sensing (2024). Manuscript submitted to ACM A Survey of Mamba 33

work page 2024
[49]

Linjie Fu, Xia Li, Xiuding Cai, Yingkai Wang, Xueyao Wang, Yali Shen, and Yu Yao. 2024. MD-Dose: A Diffusion Model based on the Mamba for Radiotherapy Dose Prediction. arXiv preprint arXiv:2403.08479 (2024)

work page arXiv 2024
[50]

Peng Gao, Shijie Geng, Renrui Zhang, Teli Ma, Rongyao Fang, Yongfeng Zhang, Hongsheng Li, and Yu Qiao. 2024. Clip-adapter: Better vision- language models with feature adapters. International Journal of Computer Vision 132, 2 (2024), 581–595

work page 2024
[51]

Ruisheng Gao, Zeyu Xiao, and Zhiwei Xiong. 2024. Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning. arXiv preprint arXiv:2406.16083 (2024)

work page arXiv 2024
[52]

Yu Gao, Jiancheng Huang, Xiaopeng Sun, Zequn Jie, Yujie Zhong, and Lin Ma. 2024. Matten: Video Generation with Mamba-Attention. arXiv preprint arXiv:2405.03025 (2024)

work page arXiv 2024
[53]

Negar Golestani and Mahta Moghaddam. 2020. Human activity recognition using magnetic induction-based motion signals and deep recurrent neural networks. Nature communications 11, 1 (2020), 1551

work page 2020
[54]

Haifan Gong, Luoyao Kang, Yitao Wang, Xiang Wan, and Haofeng Li. 2024. nnmamba: 3d biomedical image segmentation, classification and landmark detection with state space model. arXiv preprint arXiv:2402.03526 (2024)

work page arXiv 2024
[55]

Alex Graves and Alex Graves. 2012. Long short-term memory. Supervised sequence labelling with recurrent neural networks (2012), 37–45

work page 2012
[56]

Albert Gu and Tri Dao. 2023. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752 (2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[57]

Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, and Christopher Ré. 2020. Hippo: Recurrent memory with optimal polynomial projections.Advances in neural information processing systems 33 (2020), 1474–1487

work page 2020
[58]

Albert Gu, Karan Goel, Ankit Gupta, and Christopher Ré. 2022. On the parameterization and initialization of diagonal state space models. Advances in Neural Information Processing Systems 35 (2022), 35971–35983

work page 2022
[59]

Albert Gu, Karan Goel, and Christopher Ré. 2021. Efficiently modeling long sequences with structured state spaces. arXiv preprint arXiv:2111.00396 (2021)

work page internal anchor Pith review Pith/arXiv arXiv 2021
[60]

Albert Gu, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, and Christopher Ré. 2021. Combining recurrent, convolutional, and continuous-time models with linear state space layers. Advances in neural information processing systems 34 (2021), 572–585

work page 2021
[61]

Yanchen Guan, Haicheng Liao, Zhenning Li, Jia Hu, Runze Yuan, Yunjian Li, Guohui Zhang, and Chengzhong Xu. 2024. World models for autonomous driving: An initial survey. IEEE Transactions on Intelligent Vehicles (2024)

work page 2024
[62]

Jeff Guo and Philippe Schwaller. 2024. Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation. arXiv preprint arXiv:2405.17066 (2024)

work page arXiv 2024
[63]

Yulan Guo, Hanyun Wang, Qingyong Hu, Hao Liu, Li Liu, and Mohammed Bennamoun. 2020. Deep learning for 3d point clouds: A survey. IEEE transactions on pattern analysis and machine intelligence 43, 12 (2020), 4338–4364

work page 2020
[64]

Xu Han, Yuan Tang, Zhaoxuan Wang, and Xianzhi Li. 2024. Mamba3d: Enhancing local features for 3d point cloud analysis via state space model. arXiv preprint arXiv:2404.14966 (2024)

work page arXiv 2024
[65]

Mark Harris, Shubhabrata Sengupta, and John D Owens. 2007. Parallel prefix sum (scan) with CUDA. GPU gems 3, 39 (2007), 851–876

work page 2007
[66]

Ali Hatamizadeh and Jan Kautz. 2024. MambaVision: A Hybrid Mamba-Transformer Vision Backbone. arXiv preprint arXiv:2407.08083 (2024)

work page arXiv 2024
[67]

Haoyang He, Yuhu Bai, Jiangning Zhang, Qingdong He, Hongxu Chen, Zhenye Gan, Chengjie Wang, Xiangtai Li, Guanzhong Tian, and Lei Xie

work page
[68]

arXiv preprint arXiv:2404.06564 (2024)

Mambaad: Exploring state space models for multi-class unsupervised anomaly detection. arXiv preprint arXiv:2404.06564 (2024)

work page arXiv 2024
[69]

Wei He, Kai Han, Yehui Tang, Chengcheng Wang, Yujie Yang, Tianyu Guo, and Yunhe Wang. 2024. Densemamba: State space models with dense hidden connection for efficient large language models. arXiv preprint arXiv:2403.00818 (2024)

work page arXiv 2024
[70]

Xuanhua He, Ke Cao, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, and Man Zhou. 2024. Pan-Mamba: Effective pan-sharpening with State Space Model. arXiv preprint arXiv:2402.12192 (2024)

work page arXiv 2024
[71]

Michiel Hermans and Benjamin Schrauwen. 2013. Training and analysing deep recurrent neural networks. Advances in neural information processing systems 26 (2013)

work page 2013
[72]

Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in neural information processing systems 33 (2020), 6840–6851

[73]

Alireza Hosseini, Amirhossein Kazerouni, Saeed Akhavan, Michael Brudno, and Babak Taati. 2024. SUM: Saliency Unification through Mamba for Visual Attention Modeling. arXiv preprint arXiv:2406.17815 (2024)

[74]

Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2021. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685 (2021)

[75]

Hao Hu and Guo-Jun Qi. 2017. State-frequency memory recurrent neural networks. In International Conference on Machine Learning. PMLR, 1568–1577

[76]

Lijie Hu, Yixin Liu, Ninghao Liu, Mengdi Huai, Lichao Sun, and Di Wang. 2023. Seat: stable and explainable attention. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 12907–12915

[77]

Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Fischer, and Bjorn Ommer. 2024. Zigma: Zigzag mamba diffusion model. arXiv preprint arXiv:2403.13802 (2024)

[78]

Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, and Jinqiao Wang. 2024. Recurrent Context Compression: Efficiently Expanding the Context Window of LLM. arXiv preprint arXiv:2406.06110 (2024)

[79]

Kexin Huang, Cao Xiao, Lucas M Glass, Marinka Zitnik, and Jimeng Sun. 2020. SkipGNN: predicting molecular interactions with skip-graph networks. Scientific reports 10, 1 (2020), 21092

[80]

Ling Huang, Anthony D Joseph, Blaine Nelson, Benjamin IP Rubinstein, and J Doug Tygar. 2011. Adversarial machine learning. In Proceedings of the 4th ACM workshop on Security and artificial intelligence. 43–58

