Modified SIMD architecture suitable for single-chip implementation
classification
🌌 astro-ph
keywords
elementsprocessingsimdallowsarchitecturechangegroupslarge
read the original abstract
We describe a modified SIMD architecture suitable for single-chip integration of a large number of processing elements, such as 1,000 or more. Important differences from traditional SIMD designs are: a) The size of the memory per processing elements is kept small. b) The processors are organized into groups, each with a small buffer memory. Reduction operation over the groups is done in hardware. The first change allows us to integrate a very large number of processing elements into a single chip. The second change allows us to achieve a close-to-peak performance for many scientific applications like particle-based simulations and dense-matrix operations.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.