{"paper":{"title":"BiSpikCLM: A Spiking Language Model integrating Softmax-Free Spiking Attention and Spike-Aware Alignment Distillation","license":"http://creativecommons.org/licenses/by/4.0/","headline":"BiSpikCLM creates the first fully binary spiking causal language model that avoids all floating-point matrix multiplications and softmax.","cross_cats":["cs.AI","cs.LG"],"primary_cat":"cs.NE","authors_text":"Chenlin Zhou, Jiaqi Wang, Kehai Chen, Qingyan Meng, Sihang Guo, Zhengyu Ma","submitted_at":"2026-04-14T09:57:15Z","abstract_excerpt":"Spiking Neural Networks (SNNs) offer promising energy-efficient alternatives to large language models (LLMs) due to their event-driven nature and ultra-low power consumption. However, to preserve capacity, most existing spiking LLMs still incur intensive floating-point matrix multiplication (MatMul) and nonlinearities, or training difficulties arising from the complex spatiotemporal dynamics. To address these challenges, we propose BiSpikCLM, the first fully binary spiking MatMul-free causal language model. BiSpikCLM introduces Softmax-Free Spiking Attention (SFSA), eliminating softmax and flo"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"BiSpikCLM achieves competitive performance at only 4.16% - 5.87% of the computational cost on natural language generation tasks while being the first fully binary spiking MatMul-free causal language model.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the Spike-Aware Alignment Distillation can align the spiking student to the ANN teacher across embeddings, attention maps, features, and logits without introducing unrecoverable capacity loss or requiring hidden floating-point operations during inference.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"BiSpikCLM is the first fully binary spiking MatMul-free causal language model that matches ANN performance on generation tasks using only 4-6 percent of the compute via softmax-free spiking attention and spike-aware distillation.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"BiSpikCLM creates the first fully binary spiking causal language model that avoids all floating-point matrix multiplications and softmax.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"f5a03709bbdeca0b78f55750f43d454fa2ff817c0d6684a885c9ce781707be00"},"source":{"id":"2605.13859","kind":"arxiv","version":1},"verdict":{"id":"013dec39-4e5a-40be-bd85-6a35744319e8","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T07:06:47.550153Z","strongest_claim":"BiSpikCLM achieves competitive performance at only 4.16% - 5.87% of the computational cost on natural language generation tasks while being the first fully binary spiking MatMul-free causal language model.","one_line_summary":"BiSpikCLM is the first fully binary spiking MatMul-free causal language model that matches ANN performance on generation tasks using only 4-6 percent of the compute via softmax-free spiking attention and spike-aware distillation.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the Spike-Aware Alignment Distillation can align the spiking student to the ANN teacher across embeddings, attention maps, features, and logits without introducing unrecoverable capacity loss or requiring hidden floating-point operations during inference.","pith_extraction_headline":"BiSpikCLM creates the first fully binary spiking causal language model that avoids all floating-point matrix multiplications and softmax."},"references":{"count":29,"sample":[{"doi":"","year":null,"title":"GPT-4 Technical Report","work_id":"b928e041-6991-4c08-8c81-0359e4097c7b","ref_index":1,"cited_arxiv_id":"2303.08774","is_internal_anchor":true},{"doi":"","year":null,"title":"D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al","work_id":"9806adeb-7378-4bee-a184-3e98c89988dd","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2019,"title":"BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions","work_id":"511eeb84-4b95-46d5-b14f-50da43f4f19f","ref_index":3,"cited_arxiv_id":"1905.10044","is_internal_anchor":true},{"doi":"","year":null,"title":"Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge","work_id":"28ea1282-d657-4c61-a83c-f1249be6d6b1","ref_index":4,"cited_arxiv_id":"1803.05457","is_internal_anchor":true},{"doi":"","year":null,"title":"Advancing residual learning towards powerful deep spiking neural networks","work_id":"79dd3e99-cfc2-4116-bf90-68cad89fafeb","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":29,"snapshot_sha256":"8d1bce9e97f08df03638aa7a9acc82c006d0b560ed9326c6729c19be064f9d08","internal_anchors":9},"formal_canon":{"evidence_count":2,"snapshot_sha256":"dabb6bdd750e4ecd0259d45b65d02178da79b61e2dc75c86277c69f704d81f13"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}