Fast Algorithms for Convolutional Neural Networks

2015 · cs.NE · arXiv 1509.09308

1 Pith paper cites this work. Polarity classification is still being indexed.

abstract

Deep convolutional neural networks take GPU-days of compute time to train on large data sets. Pedestrian detection for self-driving cars requires very low latency. Image recognition for mobile phones is constrained by limited processing resources. The success of convolutional neural networks in these situations is limited by how fast we can compute them. Conventional FFT-based convolution is fast for large filters, but state-of-the-art convolutional neural networks use small, 3x3 filters. We introduce a new class of fast algorithms for convolutional neural networks using Winograd's minimal filtering algorithms. The algorithms compute minimal-complexity convolution over small tiles, which makes them fast with small filters and small batch sizes. We benchmark a GPU implementation of our algorithm with the VGG network and show state-of-the-art throughput at batch sizes from 1 to 64.
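To make the idea of minimal filtering concrete, here is a minimal sketch of the one-dimensional case usually written F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of the 6 a direct evaluation needs. It illustrates the general technique only and is not the paper's GPU implementation. The function names `direct_fir` and `winograd_f23` are hypothetical, and exact rational arithmetic is an assumption made so that the two results can be compared for equality.

```python
from fractions import Fraction


def direct_fir(d, g):
    """Reference: 2 outputs of a 3-tap filter, using 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """F(2,3) sketch: the same 2 outputs with 4 elementwise multiplications."""
    # Filter transform: depends only on g, so it can be computed once
    # and reused for every input tile.
    u = [
        g[0],
        (g[0] + g[1] + g[2]) / 2,
        (g[0] - g[1] + g[2]) / 2,
        g[2],
    ]
    # Data transform: additions and subtractions only.
    v = [
        d[0] - d[2],
        d[1] + d[2],
        d[2] - d[1],
        d[1] - d[3],
    ]
    # The 4 multiplications.
    m = [ui * vi for ui, vi in zip(u, v)]
    # Output transform: additions and subtractions only.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


# Hypothetical example values: a 4-element input tile and a 3-tap filter.
d = [Fraction(x) for x in (1, 2, 3, 4)]
g = [Fraction(x) for x in (1, 0, -1)]
assert winograd_f23(d, g) == direct_fir(d, g)
```

The two-dimensional 3x3 case is commonly obtained by nesting this one-dimensional scheme over rows and columns of a small tile. In floating point the two functions would agree only up to rounding, so a tolerance-based comparison would be needed there.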

representative citing papers

Separable Convolutional LSTMs for Faster Video Segmentation

cs.CV · 2019-07-16 · unverdicted · novelty 6.0

Separable convLSTMs cut parameters and FLOPs in video segmentation, delivering up to 15% faster GPU inference with similar or slightly lower accuracy.
