pith. machine review for the scientific record.

arxiv: 1801.06601 · v1 · submitted 2018-01-19 · cs.NE · cs.LG · cs.MS

CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs

keywords: neural, cmsis-nn, kernels, network, cortex-m, data, devices, edge

Deep Neural Networks are becoming increasingly popular in always-on IoT edge devices performing data analytics right at the source, reducing latency as well as energy consumption for data communication. This paper presents CMSIS-NN, efficient kernels developed to maximize the performance and minimize the memory footprint of neural network (NN) applications on Arm Cortex-M processors targeted for intelligent IoT edge devices. Neural network inference based on CMSIS-NN kernels achieves 4.6X improvement in runtime/throughput and 4.9X improvement in energy efficiency.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Mixed-Precision Information Bottlenecks for On-Device Trait-State Disentanglement in Bipolar Agitation Detection

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    MP-IB uses an 8x information asymmetry via FP16 trait heads and INT4 state heads to disentangle speaker identity from agitation in voice biomarkers, outperforming larger models on edge devices with low latency and sup...

  2. Split CNN Inference on Networked Microcontrollers

    cs.DC · 2026-05 · unverdicted · novelty 6.0

    A fine-grained split inference system enables CNN models infeasible on single MCUs to run across networked devices by partitioning at sub-layer granularity, reducing per-device peak RAM while keeping practical latency.

  3. EdgeSpike: Spiking Neural Networks for Low-Power Autonomous Sensing in Edge IoT Architectures

    cs.NE · 2026-04 · unverdicted · novelty 6.0

    EdgeSpike delivers 91.4% mean accuracy on five sensing tasks with 31x lower energy on neuromorphic hardware and 6.3x longer battery life in a seven-month field deployment compared to conventional CNNs.

  4. Co-Design of CNN Accelerators for TinyML using Approximate Matrix Decomposition

    cs.AR · 2026-04 · unverdicted · novelty 6.0

    A co-design framework using approximate matrix decomposition and genetic algorithms delivers 33% average latency reduction in TinyML CNN FPGA accelerators with 1.3% average accuracy loss versus standard systolic arrays.

  5. Neuromorphic Parameter Estimation for Power Converter Health Monitoring Using Spiking Neural Networks

    cs.NE · 2026-04 · unverdicted · novelty 6.0

    A three-layer leaky integrate-and-fire spiking neural network estimates passive component parameters in power converters, cutting resistance error from 25.8% to 10.2% versus feedforward baselines at projected 270x low...