Efficiently modeling long sequences with structured state spaces

Albert Gu, Karan Goel, Christopher Re · 2021

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Jamba: A Hybrid Transformer-Mamba Language Model

cs.CL · 2024-03-28 · conditional · novelty 7.0

Jamba presents a hybrid Transformer-Mamba MoE architecture for LLMs that delivers state-of-the-art benchmark performance and strong results up to 256K token contexts while fitting in one 80GB GPU with high throughput.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

cs.CL · 2022-11-09 · unverdicted · novelty 6.0

BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.

citing papers explorer

Showing 2 of 2 citing papers.

Jamba: A Hybrid Transformer-Mamba Language Model cs.CL · 2024-03-28 · conditional · none · ref 18
Jamba presents a hybrid Transformer-Mamba MoE architecture for LLMs that delivers state-of-the-art benchmark performance and strong results up to 256K token contexts while fitting in one 80GB GPU with high throughput.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model cs.CL · 2022-11-09 · unverdicted · none · ref 245
BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.

Efficiently modeling long sequences with structured state spaces

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer