SiLIF models apply SSM dynamics and parametrization to spiking neurons for stable training, reaching new SOTA on event-based and raw-audio speech datasets while using half the compute of SSMs via synaptic delays.
Relu strikes back: Exploiting activation sparsity in large language models
3 Pith papers cite this work. Polarity classification is still indexing.
years
2025 3verdicts
UNVERDICTED 3representative citing papers
SPON adds a small set of trainable input-independent activation vectors as representational anchors, trained by distribution matching, to stabilize sparse activation in LLMs and recover performance lost to hidden-state distribution shifts.
Post-training N:M activation pruning preserves generative performance in LLMs better than equivalent weight pruning, with the 8:16 pattern emerging as a practical hardware-friendly choice.
citing papers explorer
-
SiLIF: Structured State Space Model Dynamics and Parametrization for Spiking Neural Networks
SiLIF models apply SSM dynamics and parametrization to spiking neurons for stable training, reaching new SOTA on event-based and raw-audio speech datasets while using half the compute of SSMs via synaptic delays.
-
Resting Neurons, Active Insights: Robustifying Activation Sparsity in LLMs via Spontaneity
SPON adds a small set of trainable input-independent activation vectors as representational anchors, trained by distribution matching, to stabilize sparse activation in LLMs and recover performance lost to hidden-state distribution shifts.
-
Motivating Next-Gen Accelerators with Flexible (N:M) Activation Sparsity via Benchmarking Lightweight Post-Training Sparsification Approaches
Post-training N:M activation pruning preserves generative performance in LLMs better than equivalent weight pruning, with the 8:16 pattern emerging as a practical hardware-friendly choice.