{"paper":{"title":"Elastic Spiking Transformers for Efficient Gesture Understanding","license":"http://creativecommons.org/licenses/by/4.0/","headline":"A single Elastic Spiking Transformer dynamically resizes at runtime to match hardware budgets while matching baseline accuracy in gesture recognition.","cross_cats":["cs.AI","cs.CV"],"primary_cat":"cs.NE","authors_text":"Alberto Ancilotto, Elisabetta Farella, Gianluca Amprimo, Stefano Di Carlo","submitted_at":"2026-05-04T12:35:52Z","abstract_excerpt":"Spiking Neural Networks (SNNs), particularly Spiking Transformers, offer energy-efficient processing of event-based sensor data for healthcare applications. Yet current architectures are rigid: they are trained and deployed as static networks with fixed parameter counts and computational graphs. This limits deployment on neuromorphic hardware such as Loihi and SpiNNaker, where on-chip constraints often require smaller models that trade accuracy for feasibility. We introduce the Elastic Spiking Transformer, a runtime-adaptive architecture that brings elasticity into the spiking paradigm. 
Inspir"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"one Elastic Spiking Transformer spans a broad range of complexity-accuracy trade-offs, matching or surpassing independently trained baselines while supporting adaptive, real-time gesture recognition on resource-constrained edge devices.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"Granularity-aware weight sharing in the Feature Extractor, Spiking Self-Attention, and Feed-Forward blocks preserves accuracy across all dynamic slices without retraining or performance degradation.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A single Elastic Spiking Transformer model dynamically slices network width and attention heads at runtime via granularity-aware weight sharing, matching or exceeding fixed baselines on CIFAR and gesture datasets while reducing spike operations.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A single Elastic Spiking Transformer dynamically resizes at runtime to match hardware budgets while matching baseline accuracy in gesture recognition.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"1b88a0af23123897f6bdb46406ceca30bad8cff81aedca764d5d7652f18bfe29"},"source":{"id":"2605.13869","kind":"arxiv","version":1},"verdict":{"id":"f2b01194-b4b3-4cd1-bb4c-db9e50add424","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T06:25:28.622311Z","strongest_claim":"one Elastic Spiking Transformer spans a broad range of complexity-accuracy trade-offs, matching or surpassing independently trained baselines while supporting adaptive, real-time gesture recognition on resource-constrained edge 
devices.","one_line_summary":"A single Elastic Spiking Transformer model dynamically slices network width and attention heads at runtime via granularity-aware weight sharing, matching or exceeding fixed baselines on CIFAR and gesture datasets while reducing spike operations.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"Granularity-aware weight sharing in the Feature Extractor, Spiking Self-Attention, and Feed-Forward blocks preserves accuracy across all dynamic slices without retraining or performance degradation.","pith_extraction_headline":"A single Elastic Spiking Transformer dynamically resizes at runtime to match hardware budgets while matching baseline accuracy in gesture recognition."},"references":{"count":35,"sample":[{"doi":"","year":2017,"title":"A. Amir, B. Taba, D. Berg, T. Melano, J. McKinstry, C. Di Nolfo, T. Marelli, A. Hsu, G. Sherbondy, and D. S. Modha. A low power, fully event-based gesture recognition system. In Proceedings of the IEEE","work_id":"0e54cedc-e180-4cf4-84cd-7243bed29288","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"G. Amprimo, A. Ancilotto, A. Savino, F. Quazzolo, C. Ferraris, G. Olmo, E. Farella, and S. Di Carlo. Ehwgesture-a dataset for multimodal understanding of clinical gestures. In Proceedings of the IEEE/C","work_id":"20195559-1d17-4eb2-8998-5b0116618925","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"A. Ancilotto, F. Paissan, and E. Farella. Xinet: Efficient neural networks for tinyml. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pages 16922–16931, 2023","work_id":"6eb56ec4-d621-460d-ad11-d47948725606","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"A. Carpegna, A. Savino, and S. D. Carlo. 
Spiker+: A framework for the generation of efficient spiking neural networks fpga accelerators for inference at the edge. IEEE Transactions on Emerging Topics i","work_id":"2518fca2-f9b6-4855-ade4-ba6428bbfce0","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2018,"title":"M. Davies, N. Srinivasa, T.-H. Lin, G. Chinya, Y. Cao, S. H. Choday, G. Dimou, P. Joshi, N. Imam, S. Jain, et al. Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro, 38(1):82–9","work_id":"77517c55-03f5-442d-a68c-0de2bc7c0bdc","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":35,"snapshot_sha256":"20988ab7fd09158e37a55be452b5d8b2ff34c3c932b19120050763bfc4453a2e","internal_anchors":1},"formal_canon":{"evidence_count":2,"snapshot_sha256":"dc356459c7dc6007c9b44e5bf6c99cc044e692167e51005c4518e45afaf7cda9"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}