Performance of MPI sends of non-contiguous data
classification
cs.DC
keywords
derived, messages, performance, buffer, buffering, causes, combination, comparably
abstract
We present an experimental investigation of the performance of MPI derived datatypes. For messages up to the megabyte range most schemes perform comparably to each other and to manual copying into a regular send buffer. However, for large messages the internal buffering of MPI causes differences in efficiency. The optimal scheme is a combination of packing and derived types.
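To make the "manual copying into a regular send buffer" baseline concrete, here is a minimal C sketch of gathering strided data into a contiguous buffer before a send. The helper name `pack_strided` and its parameters are assumptions for illustration and do not come from the paper. The layout it walks is the one that a call such as `MPI_Type_vector(count, blocklen, stride, MPI_DOUBLE, &newtype)` would describe to the library in the derived-datatype scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the paper): gather `count` blocks of
 * `blocklen` doubles, whose starting points lie `stride` elements apart
 * in `src`, into the contiguous buffer `dst`.
 * `dst` must have room for count * blocklen doubles.
 * Returns the number of elements written. */
size_t pack_strided(const double *src, size_t count, size_t blocklen,
                    size_t stride, double *dst)
{
    assert(blocklen <= stride); /* blocks are assumed not to overlap */

    for (size_t i = 0; i < count; ++i) {
        /* Copy block i from its strided position to the next free
         * contiguous slot in the send buffer. */
        memcpy(dst + i * blocklen, src + i * stride,
               blocklen * sizeof(double));
    }
    return count * blocklen;
}
```

For a row-major 3x4 matrix of doubles, `pack_strided(m, 3, 1, 4, col)` would collect the first column into `col`, which could then be passed to an ordinary send as contiguous data. The derived-datatype alternative skips this explicit copy and leaves the gathering to the MPI implementation.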
Forward citations
Cited by 1 Pith paper
- Routing-Based Continual Learning for Multimodal Large Language Models: Routing architecture for MLLMs enables continual learning with constant compute, matching multi-task learning performance and supporting cross-modal transfer.