FG²-GDN replaces the scalar beta in the delta update with a channel-wise vector and decouples key/value scaling to improve recall over prior GDN and KDA models.
Parallelizing linear transformers with the delta rule over sequence length,
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
AI incorporating active precision from pyramidal neurons may reduce information overload by evaluating evidence coherence before attention rather than maximizing rewards.
citing papers explorer
-
FG$^2$-GDN: Enhancing Long-Context Gated Delta Networks with Doubly Fine-Grained Control
FG²-GDN replaces the scalar beta in the delta update with a channel-wise vector and decouples key/value scaling to improve recall over prior GDN and KDA models.
-
A Cellular Doctrine of Morality: Intrinsic Active Precision and the Mind-Reality Overload Dilemma
AI incorporating active precision from pyramidal neurons may reduce information overload by evaluating evidence coherence before attention rather than maximizing rewards.