Sequence Parallelism: Long Sequence Training from System Perspective

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2 · baseline 1

fields

cs.CL 4 · cs.CR 1

representative citing papers

Gated Delta Networks: Improving Mamba2 with Delta Rule

cs.CL · 2024-12-09 · unverdicted · novelty 5.0

Gated DeltaNet integrates gating and delta rules into linear transformers, outperforming Mamba2 and DeltaNet on language modeling, reasoning, retrieval, and long-context tasks.
