Efficient Memory Management for Deep Neural Net Inference

Juhyun Lee; Yury Pisarchyk

arxiv: 2001.03288 · v3 · pith:WN6X5SGCnew · submitted 2020-01-10 · 💻 cs.LG · cs.CV


classification 💻 cs.LG cs.CV

keywords memory, deep, inference, neural, devices, efficient, only, task


Deep neural net inference was long considered a task for servers only, but recent advances in technology allow inference to be moved to mobile and embedded devices, which is desirable for reasons ranging from latency to privacy. These devices are limited not only by their compute power and battery, but also by their smaller physical memory and cache, so an efficient memory manager becomes a crucial component for deep neural net inference at the edge. We explore various strategies for sharing memory buffers among intermediate tensors in deep neural nets. Employing these strategies can result in a memory footprint up to 11% smaller than the state of the art.
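To make the idea of sharing buffers among intermediate tensors concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not necessarily any of the strategies the paper evaluates: each tensor is described by a hypothetical `TensorUsage` record (the index of the op that produces it, the index of the last op that reads it, and its size in bytes), and tensors are placed largest-first into the first buffer whose current occupants are never live at the same time. The names `TensorUsage`, `assign_shared_buffers`, and `footprint`, and the tensor sizes in the demo, are all invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TensorUsage:
    """Hypothetical record describing one intermediate tensor."""
    name: str
    first_op: int  # index of the op that produces the tensor
    last_op: int   # index of the last op that reads the tensor
    size: int      # size in bytes


def overlaps(a: TensorUsage, b: TensorUsage) -> bool:
    """True if the two tensors are live during at least one common op."""
    return a.first_op <= b.last_op and b.first_op <= a.last_op


def assign_shared_buffers(tensors: List[TensorUsage]) -> List[List[TensorUsage]]:
    """Greedy sketch: visit tensors largest-first and reuse the first buffer
    whose occupants never overlap in lifetime with the new tensor."""
    buffers: List[List[TensorUsage]] = []
    for t in sorted(tensors, key=lambda u: u.size, reverse=True):
        for buf in buffers:
            if not any(overlaps(t, other) for other in buf):
                buf.append(t)
                break
        else:
            # No existing buffer is free for this tensor's whole lifetime.
            buffers.append([t])
    return buffers


def footprint(buffers: List[List[TensorUsage]]) -> int:
    """Each shared buffer must be as large as its largest occupant."""
    return sum(max(t.size for t in buf) for buf in buffers)


# Demo: a simple chain of ops where each tensor is read by the next op only.
demo_tensors = [
    TensorUsage("t0", first_op=0, last_op=1, size=32),
    TensorUsage("t1", first_op=1, last_op=2, size=64),
    TensorUsage("t2", first_op=2, last_op=3, size=16),
    TensorUsage("t3", first_op=3, last_op=4, size=8),
]
demo_buffers = assign_shared_buffers(demo_tensors)
naive_total = sum(t.size for t in demo_tensors)
shared_total = footprint(demo_buffers)
print(f"naive: {naive_total} bytes, shared: {shared_total} bytes")
```

In this toy chain, giving every tensor its own buffer needs 120 bytes, while the sketch packs the four tensors into two shared buffers totalling 96 bytes. The saving on a real network depends on its graph structure and tensor sizes.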
