Parallel Transpose Algorithms: Massively Parallel Computing with GPUs Series

10:00 am - 12:00 pm
Zoom

Several highly accurate computational methods, such as the Fourier spectral method and the compact finite difference method, require complete grid data along a spatial axis when solving partial differential equations on structured grids. However, modern simulations, especially those on high-resolution grids, often involve datasets that exceed the memory capacity of a single processor. This poses a significant challenge: how to process and analyze these large datasets efficiently across multiple processors without compromising accuracy or speed.

We will explore parallel transpose algorithms designed to efficiently distribute and rearrange large datasets across multiple processors. By leveraging the Message Passing Interface (MPI), we will examine techniques to optimize data movement and minimize communication overhead for 2D and 3D transpose operations. These algorithms can significantly reduce execution time and improve scalability, enabling the simulation of larger and more complex physical systems. 
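To make the idea of a distributed transpose concrete, here is a minimal sketch of the block-exchange pattern behind a 2D transpose on a row-wise (slab) decomposition. It is not the workshop's code. It simulates the ranks with plain Python lists instead of calling MPI, and all function names (split_rows, pack_blocks, all_to_all, unpack_transposed, distributed_transpose) are hypothetical. It also assumes both matrix dimensions are divisible by the number of ranks. In an MPI program, the all_to_all step would typically correspond to a collective such as MPI_Alltoall.

```python
def split_rows(matrix, nprocs):
    """Slab decomposition: rank r owns a contiguous block of rows."""
    n = len(matrix) // nprocs
    return [matrix[r * n:(r + 1) * n] for r in range(nprocs)]


def pack_blocks(local_rows, nprocs):
    """Cut one rank's row slab into nprocs column blocks, one per destination rank."""
    m = len(local_rows[0]) // nprocs
    return [[row[d * m:(d + 1) * m] for row in local_rows] for d in range(nprocs)]


def all_to_all(send):
    """Simulated all-to-all exchange: recv[dst][src] = send[src][dst]."""
    nprocs = len(send)
    return [[send[src][dst] for src in range(nprocs)] for dst in range(nprocs)]


def unpack_transposed(recv_blocks):
    """Transpose each received block locally, then join the blocks side by side."""
    transposed = [[list(col) for col in zip(*block)] for block in recv_blocks]
    n_local = len(transposed[0])
    # Row i of the local result is the concatenation of row i of every block,
    # taken in source-rank order.
    return [sum((t[i] for t in transposed), []) for i in range(n_local)]


def distributed_transpose(matrix, nprocs):
    """Return the transposed matrix as a list of per-rank row slabs.

    Assumption (for this sketch only): both dimensions divide evenly by nprocs.
    """
    if len(matrix) % nprocs or len(matrix[0]) % nprocs:
        raise ValueError("matrix dimensions must be divisible by nprocs")
    slabs = split_rows(matrix, nprocs)              # initial data layout
    send = [pack_blocks(s, nprocs) for s in slabs]  # local packing
    recv = all_to_all(send)                         # the only communication step
    return [unpack_transposed(r) for r in recv]     # local transpose and unpack


if __name__ == "__main__":
    a = [[r * 6 + c for c in range(6)] for r in range(4)]  # 4 x 6 test matrix
    for rank, slab in enumerate(distributed_transpose(a, 2)):
        print(rank, slab)
```

In this sketch all data movement between ranks is confined to the single exchange step, while packing and unpacking are purely local. This is why the cost of the all-to-all communication tends to dominate the cost of a parallel transpose.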

This workshop emphasizes the practical aspects of implementing these methods and is intended for two groups of people:

Coders: If you're building your own code, this workshop will provide practical experience in implementing parallel transpose algorithms.

Users: If you're using existing research code, this workshop will help you understand its performance and potential limitations, enabling you to optimize its use.

Instructor: Shao-Ching Huang

OARC Organizing Group