The Lattice Boltzmann Method (LBM) is a powerful numerical method for simulating fluid flow. Its data-parallel nature makes it a promising candidate for parallel implementation on a GPU. The LBM, however, is heavily data intensive and memory bound. In particular, moving data to adjacent cells in the streaming phase incurs many uncoalesced memory accesses on the GPU, which degrades overall performance. Furthermore, the main computation kernels of the LBM use a large number of registers per thread, which limits the thread parallelism available at run time because the number of registers on the GPU is fixed. In this paper, we develop a high-performance parallelization of the LBM on a GPU that minimizes the overhead of uncoalesced memory accesses while improving cache locality through tiling combined with a data layout change. We also aggressively reduce register usage in the LBM kernels to increase run-time thread parallelism. Experimental results on the Nvidia Tesla K20 GPU show that our approach delivers a throughput of 1210.63 Million Lattice Updates Per Second (MLUPS).
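To make the coalescing issue concrete, here is a minimal sketch of how a data layout change alters the memory stride between neighbouring threads. It is an illustration only, not the authors' implementation: the abstract does not say which layouts or lattice model were used, so the 2D lattice size, the choice of 9 distribution values per cell, the one-thread-per-cell mapping, and every function name below are assumptions. The last helper computes MLUPS under its usual definition (cells updated per second, in millions).

```python
# Hypothetical lattice: NX x NY cells, Q distribution values per cell.
# Q = 9 (a D2Q9-style model) is an assumption made for this sketch.
NX, NY, Q = 8, 4, 9


def cell_id(x, y):
    """Row-major linear id of cell (x, y)."""
    return y * NX + x


def aos_index(x, y, q):
    """Array of structures: the Q values of one cell are contiguous."""
    return cell_id(x, y) * Q + q


def soa_index(x, y, q):
    """Structure of arrays: one direction's values for all cells are contiguous."""
    return q * (NX * NY) + cell_id(x, y)


def neighbour_stride(index_fn, q, y=0):
    """Element distance between the addresses read by the threads for
    cells x=0 and x=1 (assumed to be adjacent threads) in direction q."""
    return index_fn(1, y, q) - index_fn(0, y, q)


def mlups(nx, ny, steps, seconds):
    """Million lattice updates per second: cells * time steps / seconds / 1e6."""
    return nx * ny * steps / seconds / 1e6


if __name__ == "__main__":
    print("AoS stride:", neighbour_stride(aos_index, q=3))
    print("SoA stride:", neighbour_stride(soa_index, q=3))
    print("MLUPS:", mlups(1000, 1000, 100, 1.0))
```

In this sketch, adjacent threads reading the same direction are Q elements apart under the array-of-structures layout and one element apart under the structure-of-arrays layout. A unit stride is the access pattern a GPU can coalesce into few memory transactions, which is why a layout change is a common remedy for the problem the abstract describes.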
from #AlexandrosSfakianakis via Alexandros G.Sfakianakis on Inoreader http://ift.tt/2iZGIcv