arxiv: 2106.07597 · v4 · pith:5Z7KUVUQnew · submitted 2021-06-14 · 💻 cs.LG · cs.AR

MLPerf Tiny Benchmark

classification 💻 cs.LG cs.AR
keywords tiny · benchmark · mlperf · systems · learning · machine · suite · reproducible
Advancements in ultra-low-power tiny machine learning (TinyML) systems promise to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted and easily reproducible benchmark for these systems. To meet this need, we present MLPerf Tiny, the first industry-standard benchmark suite for ultra-low-power tiny machine learning systems. The benchmark suite is the collaborative effort of more than 50 organizations from industry and academia and reflects the needs of the community. MLPerf Tiny measures the accuracy, latency, and energy of machine learning inference to properly evaluate the tradeoffs between systems. Additionally, MLPerf Tiny implements a modular design that enables benchmark submitters to show the benefits of their product, regardless of where it falls on the ML deployment stack, in a fair and reproducible manner. The suite features four benchmarks: keyword spotting, visual wake words, image classification, and anomaly detection.

Forward citations

Cited by 12 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. OpenGlass: Ultra-Low-Power On-Device AI Eyewear with Event-based Vision

    cs.CV 2026-06 unverdicted novelty 7.0

    OpenGlass is an open-source smart glasses platform using event-based vision and event-driven power management to achieve 11.5 hours of continuous on-device ML on a 200 mAh battery, demonstrated with 83.94% cross-subje...

  2. A Fully Tunable Ultra-Low Power Current-Mode Memory Cell in Standard CMOS Technology

    eess.SP 2026-05 unverdicted novelty 7.0

    A fully tunable ultra-low-power current-mode bistable memory cell using nine standard CMOS transistors enables spike-based logic gates and noise-immune recurrent neural units.

  3. Design Rules for Extreme-Edge Scientific Computing on AI Engines

    cs.AR 2026-04 unverdicted novelty 7.0

    AI Engines enable larger low-latency neural networks for extreme-edge scientific computing on FPGAs than programmable logic, via a new latency-adjusted resource equivalence metric and tailored optimizations.

  4. Wake Vision: A Tailored Dataset and Benchmark Suite for TinyML Computer Vision Applications

    cs.CV 2024-05 unverdicted novelty 7.0

    Wake Vision pipeline produces a 6M-image person detection dataset for TinyML with 2.2% label error, improving model accuracy up to 6.6% over prior VWW benchmark across architectures and subsets.

  5. AdvScan: Black-Box Adversarial Example Detection at Runtime through Power Analysis

    cs.CR 2026-06 unverdicted novelty 6.0

    AdvScan detects adversarial examples in black-box TinyML on ARM Cortex-M devices via one-sample t-test on runtime power signatures against a benign baseline, reporting 99.984% detection with 40 false negatives and zer...

  6. Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations

    cs.AR 2026-05 unverdicted novelty 6.0

    BMRUs enable a direct one-to-one mapping from learned parameters to current-mode analog circuit elements, with discrete hysteretic outputs suppressing noise by at least 20x and supporting sub-microwatt RNN inference i...

  7. A Fully Tunable Ultra-Low Power Current-Mode Memory Cell in Standard CMOS Technology

    eess.SP 2026-05 unverdicted novelty 6.0

    A nine-transistor current-mode bistable memory cell in 180 nm CMOS is presented with independent tuning of threshold, hysteresis, and gain, shown via schematic simulations for spike-based logic gates and recurrent neu...

  8. QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization

    cs.LG 2026-05 unverdicted novelty 6.0

    QuIDE defines the Intelligence Index I = (C × P) / log₂(T+1) as a unified score for the compression-accuracy-latency trade-off in quantized neural networks, with experiments showing task-dependent optimal bit widths.

  9. Are Large Language Models Economically Viable for Industry Deployment?

    cs.CL 2026-04 unverdicted novelty 6.0

    Small LLMs under 2B parameters achieve better economic break-even, energy efficiency, and hardware density than larger models on legacy GPUs for industrial tasks.

  10. Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations

    cs.AR 2026-05 unverdicted novelty 5.0

    BMRUs enable analog recurrent neural network hardware via discrete outputs that suppress noise 20-fold, with one-to-one parameter-to-circuit mapping and linear power scaling for recurrence.

  11. Perforated Neural Networks for Keyword Spotting

    cs.LG 2026-05 unverdicted novelty 4.0

    Dendritic models using Perforated Backpropagation reach 0.933 test accuracy with 1500 parameters on keyword spotting, beating a baseline of 0.921 accuracy that needs roughly 4000 parameters.

  12. Efficient Network Inference via Hardware-Aware Architecture Search, Model Pruning & Quantization

    cs.LG 2026-06 unverdicted novelty 2.0

    Combines pruning, quantization, and hardware-aware NAS on MCUNet to reduce size and complexity while preserving performance for GNSS interference monitoring on MCUs and Raspberry Pi devices.