Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1

Courbariaux, M · 2016 · cs.LG · arXiv 1602.02830

22 Pith papers cite this work. Polarity classification is still indexing.


abstract

We introduce a method to train Binarized Neural Networks (BNNs), neural networks with binary weights and activations at run-time. At training-time the binary weights and activations are used for computing the parameter gradients. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which is expected to substantially improve power-efficiency. To validate the effectiveness of BNNs we conduct two sets of experiments on the Torch7 and Theano frameworks. On both, BNNs achieved nearly state-of-the-art results over the MNIST, CIFAR-10 and SVHN datasets. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available online.
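
To make the "bit-wise operations" claim in the abstract concrete, here is a minimal sketch of one common way to compute a dot product between two +1/-1 vectors using XNOR and a population count. It is an illustration under stated assumptions, not the authors' GPU kernel: the helper names (`binarize`, `pack_bits`, `xnor_popcount_dot`) are hypothetical, +1 is assumed to be encoded as bit 1 and -1 as bit 0, and values are binarized by sign with zero mapped to +1.

```python
import numpy as np


def binarize(x):
    """Sign binarization (assumed convention): values >= 0 map to +1, others to -1."""
    return np.where(x >= 0, 1, -1).astype(np.int8)


def pack_bits(v):
    """Pack a +1/-1 vector into a Python int, encoding +1 as bit 1 and -1 as bit 0."""
    word = 0
    for i, s in enumerate(v):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the signs agree. Each agreement
    contributes +1 and each disagreement -1, so dot = 2 * agreements - n.
    """
    mask = (1 << n) - 1                 # keep only the n valid bit positions
    agree = ~(a_word ^ b_word) & mask   # XNOR restricted to those positions
    return 2 * bin(agree).count("1") - n


if __name__ == "__main__":
    a = binarize(np.array([0.3, -1.2, 0.0, 2.0, -0.1]))
    b = binarize(np.array([-0.5, -0.7, 1.0, -3.0, 0.2]))
    # Both lines print the same value: the bitwise result matches the arithmetic one.
    print(int(np.dot(a.astype(np.int64), b.astype(np.int64))))
    print(xnor_popcount_dot(pack_bits(a), pack_bits(b), len(a)))
```

The multiply-accumulate loop collapses into one XOR, one NOT, and one bit count per machine word, which is the kind of saving the abstract attributes to binary weights and activations. A production kernel would pack bits into fixed-width words and use a hardware popcount instruction rather than Python integers.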

citation-role summary

method: 2 · background: 1

citation-polarity summary

use method: 2 · background: 1

representative citing papers

Layerwise Progressive Freezing: A Training Scaffold for Depth-Scalable Binary Networks

cs.LG · 2026-06-26 · unverdicted · novelty 7.0

StoMPP progressively binarizes BNN layers layerwise from input to output via stochastic masks, delivering depth-scalable accuracy gains in a fully STE-free regime by controlling activation-induced gradient blockades.

FTerViT: Fully Ternary Vision Transformer

cs.CV · 2026-05-20 · conditional · novelty 7.0

FTerViT introduces fully ternary Vision Transformers with TernaryBitConv2d and TernaryLayerNorm operators, achieving 82.43% ImageNet top-1 at 6.09 MB with 15x compression.

A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology

cs.NE · 2019-07-22 · unverdicted · novelty 7.0

Proposes first stochastic-computing DNN acceleration framework tailored to AQFP superconducting technology.

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

cs.LG · 2022-08-15 · conditional · novelty 7.0

LLM.int8() performs 8-bit inference for transformers up to 175B parameters with no accuracy loss by combining vector-wise quantization for most features with 16-bit mixed-precision handling of systematic outlier dimensions.

Spatial Partial Functionalization of Neural Networks based on Noise Fields

cs.NE · 2026-06-23 · unverdicted · novelty 6.0

A crossing activation function combined with virtual noise fields allows one neural network to learn multiple functions assigned to different noise locations, with capacity rising when noise arrangement matches function proximity.

Signs Beat Floats: Low-Rank Double-Binary Adaptation for On-Device Fine-Tuning

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

LoRDBA replaces LoRA low-rank factors with binary sign carriers and channel-wise scales, matching fp16 LoRA quality in some regimes with over 10x smaller adapter size and at most 8% prefill latency overhead.

DAP: Doppler-aware Point Network for Heterogeneous mmWave Action Recognition

cs.CV · 2026-05-10 · unverdicted · novelty 6.0 · 2 refs

Introduces the first heterogeneous multi-source mmWave point cloud HAR dataset and DAP-Net, which uses Doppler patterns for source-invariant action recognition and outperforms prior methods.

SURGE: Surrogate Gradient Adaptation in Binary Neural Networks

cs.LG · 2026-05-09 · unverdicted · novelty 6.0 · 2 refs

SURGE proposes a dual-path gradient compensator and adaptive gradient scaler to mitigate gradient mismatch in binary neural network training via auxiliary backpropagation.

Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models

cs.CL · 2025-08-09 · conditional · novelty 6.0

A progressive training scheme with binary-aware initialization and dual-scaling allows pre-trained LLMs to be converted to high-performance 1-bit models without training from scratch.

A Lower Bound for the Number of Linear Regions of Ternary ReLU Regression Neural Networks

cs.LG · 2025-07-21 · unverdicted · novelty 6.0

Proves polynomial-in-width and exponential-in-depth lower bounds on linear regions for ternary ReLU regression networks, with width-doubling constructions achieving bounds comparable to unrestricted ReLU networks.

Co-Evolutionary Compression for Unpaired Image Translation

cs.CV · 2019-07-25 · unverdicted · novelty 6.0

A co-evolutionary compression technique reduces parameters and FLOPs in unpaired image-to-image translation GAN generators while maintaining translation quality on benchmarks.

Efficient Detection and Quantification of Timing Leaks with Neural Networks

cs.CR · 2019-07-23 · unverdicted · novelty 6.0

Neural networks are trained as timing models of programs and analyzed via MILP to detect and quantify timing side-channel information leaks.

Adaptive Precision CNN Accelerator Using Radix-X Parallel Connected Memristor Crossbars

eess.SP · 2019-06-22 · unverdicted · novelty 6.0

Radix-5 memristor crossbar CNN accelerator reaches 90.5% CIFAR-10 accuracy with 46% area reduction by using variable memristor counts and single-column signed weights.

LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

LBLLM achieves better accuracy than prior binarization methods for LLMs by decoupling weight and activation quantization through initialization, layer-wise distillation, and learnable activation scaling.

Low-Energy Reduced RISC-V Instruction Subset Processor for Tsetlin Machine Inference at the Edge

cs.LG · 2026-06-18 · unverdicted · novelty 5.0

A domain-specific reduced RISC-V core for Tsetlin Machine inference delivers up to 98% faster execution and 29.7x lower energy than baseline RV32IM while matching or exceeding BNN accuracy on tested datasets.

QuoVLA: Quotient Space for Vision-Language-Action Models

cs.CV · 2026-05-24 · unverdicted · novelty 5.0

QuoVLA introduces a quotient-space framework that compresses VLM latents into action-sufficient representations via quantization and dual-branch design for better VLA generalization.

New pointwise convolution in Deep Neural Networks through Extremely Fast and Non Parametric Transforms

cs.CV · 2019-06-25 · unverdicted · novelty 5.0

Replacing pointwise convolutions with DWHT yields a model with 79.1% fewer parameters, 48.4% fewer FLOPs, and 1.49% higher accuracy than MobileNet-V1 on CIFAR-100.

Improving Branch Prediction By Modeling Global History with Convolutional Neural Networks

cs.DC · 2019-06-20 · unverdicted · novelty 5.0

CNNs applied to global history improve prediction accuracy for hard-to-predict branches in SPEC 2017, with hardware-adapted inference and reusability across inputs.

A Composite Activation Function for Learning Stable Binary Representations

cs.LG · 2026-05-12 · unverdicted · novelty 5.0

HTAF is a sigmoid-tanh composite that approximates the Heaviside function to allow stable gradient training of binary activation networks, yielding ICBMs with stable discretization and competitive performance on image tasks.

Design and Implementation of BNN-Based Object Detection on FPGA

cs.AR · 2026-05-05 · unverdicted · novelty 4.0 · 2 refs

A BNN-based YOLOv3-tiny-like object detector with 1-bit weights and 8-bit activations is implemented in Verilog on FPGA, achieving 39.6% mAP50 on VOC and 0.999964 correlation with the ONNX model in RTL simulation.

Hybrid Compression: Integrating Pruning and Quantization for Optimized Neural Networks

cs.CV · 2026-06-22 · unverdicted · novelty 3.0

Hybrid method applies pruning and quantization followed by MoE routing of compressed CNN experts to achieve large reductions in FLOPs and parameters with negligible accuracy loss on benchmarks.

Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics

cs.LG · 2022-12-18 · unverdicted · novelty 2.0

A comprehensive review of deep learning techniques for computational mechanics, including LSTM for constitutive modeling, PINNs for PDE solving, optimizers, and kernel methods.

citing papers explorer

Showing 22 of 22 citing papers.

Layerwise Progressive Freezing: A Training Scaffold for Depth-Scalable Binary Networks cs.LG · 2026-06-26 · unverdicted · none · ref 8 · internal anchor
StoMPP progressively binarizes BNN layers layerwise from input to output via stochastic masks, delivering depth-scalable accuracy gains in a fully STE-free regime by controlling activation-induced gradient blockades.
FTerViT: Fully Ternary Vision Transformer cs.CV · 2026-05-20 · conditional · none · ref 16 · internal anchor
FTerViT introduces fully ternary Vision Transformers with TernaryBitConv2d and TernaryLayerNorm operators, achieving 82.43% ImageNet top-1 at 6.09 MB with 15x compression.
A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology cs.NE · 2019-07-22 · unverdicted · none · ref 12 · internal anchor
Proposes first stochastic-computing DNN acceleration framework tailored to AQFP superconducting technology.
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale cs.LG · 2022-08-15 · conditional · none · ref 125
LLM.int8() performs 8-bit inference for transformers up to 175B parameters with no accuracy loss by combining vector-wise quantization for most features with 16-bit mixed-precision handling of systematic outlier dimensions.
Spatial Partial Functionalization of Neural Networks based on Noise Fields cs.NE · 2026-06-23 · unverdicted · none · ref 44 · internal anchor
A crossing activation function combined with virtual noise fields allows one neural network to learn multiple functions assigned to different noise locations, with capacity rising when noise arrangement matches function proximity.
Signs Beat Floats: Low-Rank Double-Binary Adaptation for On-Device Fine-Tuning cs.LG · 2026-05-22 · unverdicted · none · ref 10 · internal anchor
LoRDBA replaces LoRA low-rank factors with binary sign carriers and channel-wise scales, matching fp16 LoRA quality in some regimes with over 10x smaller adapter size and at most 8% prefill latency overhead.
DAP: Doppler-aware Point Network for Heterogeneous mmWave Action Recognition cs.CV · 2026-05-10 · unverdicted · none · ref 10 · 2 links · internal anchor
Introduces the first heterogeneous multi-source mmWave point cloud HAR dataset and DAP-Net, which uses Doppler patterns for source-invariant action recognition and outperforms prior methods.
SURGE: Surrogate Gradient Adaptation in Binary Neural Networks cs.LG · 2026-05-09 · unverdicted · none · ref 22 · 2 links · internal anchor
SURGE proposes a dual-path gradient compensator and adaptive gradient scaler to mitigate gradient mismatch in binary neural network training via auxiliary backpropagation.
Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models cs.CL · 2025-08-09 · conditional · none · ref 7 · internal anchor
A progressive training scheme with binary-aware initialization and dual-scaling allows pre-trained LLMs to be converted to high-performance 1-bit models without training from scratch.
A Lower Bound for the Number of Linear Regions of Ternary ReLU Regression Neural Networks cs.LG · 2025-07-21 · unverdicted · none · ref 1 · internal anchor
Proves polynomial-in-width and exponential-in-depth lower bounds on linear regions for ternary ReLU regression networks, with width-doubling constructions achieving bounds comparable to unrestricted ReLU networks.
Co-Evolutionary Compression for Unpaired Image Translation cs.CV · 2019-07-25 · unverdicted · none · ref 3 · internal anchor
A co-evolutionary compression technique reduces parameters and FLOPs in unpaired image-to-image translation GAN generators while maintaining translation quality on benchmarks.
Efficient Detection and Quantification of Timing Leaks with Neural Networks cs.CR · 2019-07-23 · unverdicted · none · ref 15 · internal anchor
Neural networks are trained as timing models of programs and analyzed via MILP to detect and quantify timing side-channel information leaks.
Adaptive Precision CNN Accelerator Using Radix-X Parallel Connected Memristor Crossbars eess.SP · 2019-06-22 · unverdicted · none · ref 24 · internal anchor
Radix-5 memristor crossbar CNN accelerator reaches 90.5% CIFAR-10 accuracy with 46% area reduction by using variable memristor counts and single-column signed weights.
LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation cs.LG · 2026-04-21 · unverdicted · none · ref 43
LBLLM achieves better accuracy than prior binarization methods for LLMs by decoupling weight and activation quantization through initialization, layer-wise distillation, and learnable activation scaling.
Low-Energy Reduced RISC-V Instruction Subset Processor for Tsetlin Machine Inference at the Edge cs.LG · 2026-06-18 · unverdicted · none · ref 9 · internal anchor
A domain-specific reduced RISC-V core for Tsetlin Machine inference delivers up to 98% faster execution and 29.7x lower energy than baseline RV32IM while matching or exceeding BNN accuracy on tested datasets.
QuoVLA: Quotient Space for Vision-Language-Action Models cs.CV · 2026-05-24 · unverdicted · none · ref 6 · internal anchor
QuoVLA introduces a quotient-space framework that compresses VLM latents into action-sufficient representations via quantization and dual-branch design for better VLA generalization.
New pointwise convolution in Deep Neural Networks through Extremely Fast and Non Parametric Transforms cs.CV · 2019-06-25 · unverdicted · none · ref 3 · internal anchor
Replacing pointwise convolutions with DWHT yields a model with 79.1% fewer parameters, 48.4% fewer FLOPs, and 1.49% higher accuracy than MobileNet-V1 on CIFAR-100.
Improving Branch Prediction By Modeling Global History with Convolutional Neural Networks cs.DC · 2019-06-20 · unverdicted · none · ref 13 · internal anchor
CNNs applied to global history improve prediction accuracy for hard-to-predict branches in SPEC 2017, with hardware-adapted inference and reusability across inputs.
A Composite Activation Function for Learning Stable Binary Representations cs.LG · 2026-05-12 · unverdicted · none · ref 11
HTAF is a sigmoid-tanh composite that approximates the Heaviside function to allow stable gradient training of binary activation networks, yielding ICBMs with stable discretization and competitive performance on image tasks.
Design and Implementation of BNN-Based Object Detection on FPGA cs.AR · 2026-05-05 · unverdicted · none · ref 7 · 2 links
A BNN-based YOLOv3-tiny-like object detector with 1-bit weights and 8-bit activations is implemented in Verilog on FPGA, achieving 39.6% mAP50 on VOC and 0.999964 correlation with the ONNX model in RTL simulation.
Hybrid Compression: Integrating Pruning and Quantization for Optimized Neural Networks cs.CV · 2026-06-22 · unverdicted · none · ref 4 · internal anchor
Hybrid method applies pruning and quantization followed by MoE routing of compressed CNN experts to achieve large reductions in FLOPs and parameters with negligible accuracy loss on benchmarks.
Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics cs.LG · 2022-12-18 · unverdicted · none · ref 297 · internal anchor
A comprehensive review of deep learning techniques for computational mechanics, including LSTM for constitutive modeling, PINNs for PDE solving, optimizers, and kernel methods.
