Mechanics & Industry
Volume 21, Number 3, 2020
Number of page(s): 15
Published online: 06 April 2020
Parallel Thomas approach development for solving tridiagonal systems in GPU programming − steady and unsteady flow simulation
1 Department of Mechanical and Mechatronics Engineering, Shahrood University of Technology, Semnan, Iran
2 School of Engineering Science, College of Engineering, University of Tehran, Tehran, Iran
* e-mail: email@example.com
Accepted: 6 February 2020
The solution of tridiagonal systems of equations on graphics processing units (GPUs) is assessed. A parallel Thomas algorithm (PTA) is developed and compared with two established parallel algorithms, cyclic reduction (CR) and parallel cyclic reduction (PCR). The lid-driven cavity problem is used as the test case for these parallel approaches. The same problem is also simulated with the classic Thomas algorithm running on a central processing unit (CPU). Runtimes and physical parameters obtained from the GPU and CPU algorithms are compared. The results show that CR, PCR and PTA achieve speedups of 4.4x, 5.2x and 38.5x, respectively, over the CPU runtime. Furthermore, the effect of coalesced versus uncoalesced access to GPU global memory is examined for PTA, and coalesced access yields a 2x speedup. Additionally, PTA performance is assessed on a time-dependent problem, the unsteady flow over a square, where a 9x speedup over the CPU is obtained.
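To make the terms in the abstract concrete, the following is a minimal CPU-side sketch in Python, not the authors' implementation. `thomas_solve` is the classic serial Thomas algorithm (forward elimination followed by back substitution). `batched_thomas_interleaved` illustrates one plausible reading of a parallel Thomas approach, assuming each independent tridiagonal system is handled by its own GPU thread; the abstract does not spell out the work decomposition, so that assumption, the function names and the storage layout are all hypothetical. The interleaved layout (element `i` of system `s` at flat index `i * n_sys + s`) is the kind of arrangement that lets neighbouring threads touch neighbouring addresses, which is what coalesced global-memory access refers to.

```python
def thomas_solve(a, b, c, d):
    """Solve one tridiagonal system with the classic Thomas algorithm.

    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[-1] should be 0), d: right-hand side.
    Assumes no pivoting is needed, e.g. a diagonally dominant matrix.
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side

    # Forward elimination.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def batched_thomas_interleaved(a, b, c, d, n_sys, n):
    """Solve n_sys independent systems of size n stored in interleaved form.

    Hypothetical layout: element i of system s sits at flat index
    i * n_sys + s. The outer loop over s stands in for GPU threads
    (an assumption of this sketch): at a fixed row i, consecutive
    values of s read consecutive addresses.
    """
    total = n_sys * n
    cp = [0.0] * total
    dp = [0.0] * total
    x = [0.0] * total

    for s in range(n_sys):  # on a GPU, each s would map to one thread
        k = s
        cp[k] = c[k] / b[k]
        dp[k] = d[k] / b[k]
        for i in range(1, n):
            k = i * n_sys + s
            km = k - n_sys  # same system, previous row
            denom = b[k] - a[k] * cp[km]
            cp[k] = c[k] / denom
            dp[k] = (d[k] - a[k] * dp[km]) / denom

        k = (n - 1) * n_sys + s
        x[k] = dp[k]
        for i in range(n - 2, -1, -1):
            k = i * n_sys + s
            x[k] = dp[k] - cp[k] * x[k + n_sys]
    return x
```

In this sketch the per-system recurrence stays strictly sequential; any parallelism would come only from solving many systems at once, which is why the memory layout across systems matters for the access pattern.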
Key words: Tridiagonal system of equations / graphic processing units / parallel Thomas approach / flow over a square / lid-driven cavity
© AFM, EDP Sciences 2020