PyTorch implementation of Qwen3-Next, modular_qwen3_next.py

Qwen Team, Hugging Face · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Provably Shorter Scratchpads in Hybrid DeltaNet-Attention Decoders

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

Hybrid Gated DeltaNet-Attention decoders solve parity-conditioned retrieval with an O(1) scratchpad, while pure Gated DeltaNet cannot and pure Gated Attention needs a polynomial-length scratchpad.

citing papers explorer

Provably Shorter Scratchpads in Hybrid DeltaNet-Attention Decoders

cs.LG · 2026-05-15 · unverdicted · none · ref 12
