2005.864128
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
citing papers explorer
- SymbolicLight V1: Spike-Gated Dual-Path Language Modeling with High Activation Sparsity and Sub-Billion-Scale Pre-Training Evidence
  A 194M-parameter spiking dual-path model trained on 3B Chinese-English tokens achieves held-out PPL 8.88-8.93 at >89% per-element sparsity, trailing GPT-2 201M by 7.7% while showing that LIF temporal integration outperforms simple top-k masking at matched sparsity.
- CryoZip: An Efficient Cryogenic Compressor for Quantum Error Correction Syndromes
  CryoZip delivers up to 48x compression (1.8x over prior art) and 4-26x energy savings for QEC syndromes in 22 nm FDSOI at 4 K, reaching 14,238x bandwidth reduction and 42x energy savings when paired with a predecoder.
- Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations
  BMRUs enable analog recurrent neural network hardware via discrete outputs that suppress noise 20-fold, with one-to-one parameter-to-circuit mapping and linear power scaling for recurrence.
- ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference
  ShadowNPU presents shadowAttn, a co-designed sparse attention system that uses NPU pilot compute and techniques like graph bucketing and per-head sparsity to minimize CPU/GPU fallback during on-device LLM inference while maintaining accuracy.