Parameter-efficient fine-tuning of state space models.arXiv preprint arXiv:2410.09016

URLhttps://arxiv · 2024 · arXiv 2410.09016

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 1 method 1

citation-polarity summary

background 1 use method 1

representative citing papers

WriteSAE: Sparse Autoencoders for Recurrent State

cs.LG · 2026-05-12 · unverdicted · novelty 8.0 · 4 refs

WriteSAE introduces sparse autoencoders with rank-1 matrix atoms for recurrent state updates, allowing replacement tests that outperform deletion on 92.4% of positions and a formula predicting logit changes with R²=0.98.

S0 Tuning: Zero-Overhead Adaptation of Hybrid Recurrent-Attention Models

cs.CL · 2026-04-01 · conditional · novelty 7.0

S0 tuning optimizes initial recurrent states in hybrid models to outperform LoRA with zero inference cost on HumanEval and partial cross-domain transfer.

Where Should LoRA Go? Component-Type Placement in Hybrid Language Models

cs.CL · 2026-04-24 · unverdicted · novelty 6.0

Adapting only the attention components with LoRA outperforms full-model adaptation in hybrid LLMs, with recurrent adaptation harming sequential hybrids but helping parallel ones.

citing papers explorer

Showing 3 of 3 citing papers.

WriteSAE: Sparse Autoencoders for Recurrent State cs.LG · 2026-05-12 · unverdicted · none · ref 18 · 4 links
WriteSAE introduces sparse autoencoders with rank-1 matrix atoms for recurrent state updates, allowing replacement tests that outperform deletion on 92.4% of positions and a formula predicting logit changes with R²=0.98.
S0 Tuning: Zero-Overhead Adaptation of Hybrid Recurrent-Attention Models cs.CL · 2026-04-01 · conditional · none · ref 5
S0 tuning optimizes initial recurrent states in hybrid models to outperform LoRA with zero inference cost on HumanEval and partial cross-domain transfer.
Where Should LoRA Go? Component-Type Placement in Hybrid Language Models cs.CL · 2026-04-24 · unverdicted · none · ref 2
Adapting only the attention components with LoRA outperforms full-model adaptation in hybrid LLMs, with recurrent adaptation harming sequential hybrids but helping parallel ones.

Parameter-efficient fine-tuning of state space models.arXiv preprint arXiv:2410.09016

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer