SymbolicLight V1: Spike-Gated Dual-Path Language Modeling with High Activation Sparsity and Sub-Billion-Scale Pre-Training Evidence

2026 · cs.CL · arXiv 2605.21333

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Natively trained spiking language models struggle to combine Transformer-like language quality, stable multi-domain pre-training, and high activation sparsity. We present SymbolicLight V1, a spike-gated dual-path language model that combines binary Leaky Integrate-and-Fire spike dynamics with a continuous residual stream. Its Dual-Path SparseTCAM module replaces dense self-attention with an exponential-decay aggregation path for long-range memory and a spike-gated local attention path for short-range precision, complemented by a dynamic context-conditioned decoding head and a bilingual tokenizer. A 194M-parameter SymbolicLight V1 model trained from scratch on a 3B-token Chinese-English corpus reaches held-out validation PPL 8.88-8.93 across four independent runs at >89% per-element activation sparsity. It trails GPT-2 201M by 7.7% in PPL while surpassing GPT-2 124M under the reported comparison. Component ablations at matched 0.5B-token training budgets show that the spike-gated local attention path is the largest contributor, and that replacing LIF dynamics with a deterministic top-k mask at matched sparsity causes a larger degradation, indicating that temporal integration rather than sparsity alone drives performance. We also report a 0.8B-parameter scale-up run trained on 48.8B tokens as evidence of optimization and sparsity preservation, not as a primary quality comparison. Current dense-hardware inference is slower than GPT-2, so neuromorphic deployment is presented as a future sparsity-driven opportunity rather than an achieved hardware speedup.
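The ablation contrast in the abstract (LIF dynamics versus a deterministic top-k mask at matched sparsity) can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the decay and threshold values, the hard reset, and the single-unit input sequence are all assumptions chosen for illustration. It only shows how a leaky integrate-and-fire unit accumulates input over time before emitting a binary spike, while a top-k mask at the same sparsity ignores history.

```python
from typing import List


def lif_spikes(inputs: List[float], decay: float = 0.9,
               threshold: float = 1.0) -> List[int]:
    """Binary spikes from one leaky integrate-and-fire unit (toy version)."""
    membrane = 0.0
    spikes = []
    for x in inputs:
        membrane = decay * membrane + x  # leak the old state, then integrate
        if membrane >= threshold:
            spikes.append(1)
            membrane = 0.0               # hard reset after a spike (assumption)
        else:
            spikes.append(0)
    return spikes


def topk_mask(inputs: List[float], k: int) -> List[int]:
    """Deterministic mask: 1 for the k largest values, with no memory."""
    ranked = sorted(range(len(inputs)), key=lambda i: inputs[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(inputs))]


def sparsity(mask: List[int]) -> float:
    """Fraction of elements that are zero."""
    return 1.0 - sum(mask) / len(mask)


# Hypothetical input currents for a single unit over ten steps.
currents = [0.4, 0.4, 0.4, 0.1, 0.1, 0.9, 0.2, 0.2, 0.2, 0.2]
lif = lif_spikes(currents)
topk = topk_mask(currents, k=sum(lif))  # match the LIF spike count

print(lif, sparsity(lif))
print(topk, sparsity(topk))
```

Both masks have the same sparsity on this input, but the LIF unit fires on the third step, once three moderate inputs have accumulated, whereas the top-k mask selects the first step purely by magnitude. This is the kind of difference that temporal integration introduces; the paper's actual layer-level mechanism may differ.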

representative citing papers

Spike-Aware C++ INT8 Inference for Sparse Spiking Language Models on Commodity CPUs

cs.NE · 2026-06-02 · unverdicted · novelty 5.0

A spike-aware C++ INT8 runtime for sparse spiking LMs delivers 22.63 tokens/s single-thread on Ryzen 7, beating several Q8_0 dense models in llama.cpp while cutting weights from 3.49 GB to 1.06 GB, at the cost of higher perplexity.
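As a rough illustration of why binary spikes can help an INT8 runtime, the sketch below computes a matrix-vector product by touching only the weight columns whose input spike is 1. It is written in Python for readability and is not the citing paper's C++ runtime: the function names, the per-tensor `scale`, and the list-of-lists weight layout are assumptions.

```python
from typing import List


def spike_matvec_int8(weights: List[List[int]], spikes: List[int],
                      scale: float) -> List[float]:
    """Compute scale * (W @ s) for a binary spike vector s.

    Only columns with an active spike are read, so the work per row
    scales with the number of spikes rather than the input width.
    """
    active = [j for j, s in enumerate(spikes) if s]  # indices of firing inputs
    out = []
    for row in weights:
        acc = 0                  # integer accumulator; no multiplies needed
        for j in active:
            acc += row[j]
        out.append(acc * scale)  # dequantize once per output element
    return out


def dense_matvec(weights: List[List[int]], x: List[int],
                 scale: float) -> List[float]:
    """Reference dense product used to check the sparse path."""
    return [sum(w * v for w, v in zip(row, x)) * scale for row in weights]


# Hypothetical INT8 weights and a spike vector with one inactive input.
W = [[1, -2, 3], [4, 5, -6]]
s = [1, 0, 1]
sparse_out = spike_matvec_int8(W, s, scale=0.5)
dense_out = dense_matvec(W, s, scale=0.5)
print(sparse_out, dense_out)
```

With binary inputs the inner loop needs only additions, and inactive columns are never read. Whether this yields a speedup on a real CPU depends on memory layout and vectorization, which this sketch does not model.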
