CMSIS-NN: Efficient neural network kernels for ARM Cortex-M CPUs
11 Pith papers cite this work.
abstract
Deep Neural Networks are becoming increasingly popular in always-on IoT edge devices performing data analytics right at the source, reducing latency as well as energy consumption for data communication. This paper presents CMSIS-NN, efficient kernels developed to maximize the performance and minimize the memory footprint of neural network (NN) applications on Arm Cortex-M processors targeted for intelligent IoT edge devices. Neural network inference based on CMSIS-NN kernels achieves 4.6X improvement in runtime/throughput and 4.9X improvement in energy efficiency.
citing papers explorer
- SEABAD: A Tropical Bird Activity Detection Dataset for Passive Acoustic Monitoring
  SEABAD is a publicly released, balanced dataset of 50,000 curated 16 kHz audio clips spanning 1,677 tropical bird species, with a dual-branch curation pipeline and MobileNetV3-Small baseline reaching 99.57% accuracy.
- Learned Memory Attenuation in Sage-Husa Kalman Filters for Robust UAV State Estimation
  NDR-SHKF replaces the static forgetting factor in Sage-Husa Kalman Filters with a learned vector-valued memory attenuation policy from a bifurcated recurrent network trained end-to-end on whitened innovations to minimize estimation error.
- Going Beyond the Edge: Distributed Inference of Transformer Models on Ultra-Low-Power Wireless Devices
  CATS enables collaborative transformer inference on up to 16 ultra-low-power wireless devices, supporting models up to 14 times larger than a single device can run via SomeGather pruning and message-dropout robustness.
- Mixed-Precision Information Bottlenecks for On-Device Trait-State Disentanglement in Bipolar Agitation Detection
  MP-IB uses an 8x information asymmetry via FP16 trait heads and INT4 state heads to disentangle speaker identity from agitation in voice biomarkers, outperforming larger models on edge devices with low latency and suppressed identity leakage.
- AEG: A Baremetal Framework for AI Acceleration via Direct Hardware Access in Heterogeneous Accelerators
  AEG baremetal framework achieves 9.2x higher compute efficiency, 3-7x less data movement, and near-zero latency variance for ResNet-18 on 28 AIE tiles versus Linux Vitis AI on 304 tiles while maintaining 68.78% ImageNet accuracy.
- Federated Learning with Non-IID Data
  Non-IID data causes up to 55% accuracy loss in federated learning due to weight divergence measured by earth mover's distance; 5% globally shared data recovers 30% accuracy on CIFAR-10.
- Split CNN Inference on Networked Microcontrollers
  A fine-grained split inference system enables CNN models infeasible on single MCUs to run across networked devices by partitioning at sub-layer granularity, reducing per-device peak RAM while keeping practical latency.
- EdgeSpike: Spiking Neural Networks for Low-Power Autonomous Sensing in Edge IoT Architectures
  EdgeSpike delivers 91.4% mean accuracy on five sensing tasks with 31x lower energy on neuromorphic hardware and 6.3x longer battery life in a seven-month field deployment compared to conventional CNNs.
- Co-Design of CNN Accelerators for TinyML using Approximate Matrix Decomposition
  A co-design framework using approximate matrix decomposition and genetic algorithms delivers 33% average latency reduction in TinyML CNN FPGA accelerators with 1.3% average accuracy loss versus standard systolic arrays.
- Neuromorphic Parameter Estimation for Power Converter Health Monitoring Using Spiking Neural Networks
  A three-layer leaky integrate-and-fire spiking neural network estimates passive component parameters in power converters, cutting resistance error from 25.8% to 10.2% versus feedforward baselines at projected 270x lower energy on neuromorphic chips.
- MicroBi-ConvLSTM: An Ultra-Lightweight Efficient Model for Human Activity Recognition on Resource Constrained Devices
  MicroBi-ConvLSTM is a convolutional-recurrent model with 11.4K parameters that delivers competitive accuracy on eight HAR benchmarks and full INT8 deployment coverage on Raspberry Pi Pico 2 and ESP32.