arxiv: 1806.01105 · v1 · pith:N3ZDBI7Rnew · submitted 2018-05-04 · 💻 cs.DC

Performance tuning for deep learning on a many-core processor (master thesis)

classification 💻 cs.DC
keywords: performance, very, loki, many-core, optimisations, possible, processor, achieving

Convolutional neural networks (CNNs) are becoming very successful and popular for a variety of applications. The Loki many-core processor architecture is very promising for achieving specialised hardware performance and efficiency while remaining a general-purpose solution. Loki combines many simple cores with increased control for the programmer. This freedom can be exploited to produce much more efficient code than on conventional multiprocessors, but it also creates a very large design space for possible optimisations. In this project, I explore possible optimisations for a CNN application and their portability across different Loki-specific configurations, convolution parameters and inputs. Finally, I investigate the potential of adaptive algorithms to increase performance further.
