Speaker
Dr
Hirokazu Kobayashi
(Intel K. K.)
Description
We implemented lattice QCD on the Xeon Phi coprocessor, using intrinsics for vectorization and OpenMP and MPI for parallelization. Our implementation uses a double-precision conjugate gradient (CG) solver that also supports multi-shift CG.
We present our optimization methodology and the performance of the key steps in the CG algorithms.
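For readers unfamiliar with the solver, the following is a minimal textbook sketch of the CG iteration in double precision. It is not the authors' implementation. The small dense matrix `A` is a hypothetical stand-in for the lattice operator, and the helper names `dot`, `matvec`, and `conjugate_gradient` are illustrative assumptions. It shows the three kinds of kernel every CG iteration contains: an operator application, dot-product reductions, and vector updates. The abstract does not say which of these the authors optimized.

```python
# Textbook conjugate gradient for a symmetric positive-definite system A x = b.
# Python floats are IEEE double precision. All names here are illustrative.

def dot(u, v):
    """Inner product (a global reduction in a parallel code)."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product; stands in for the lattice operator."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-12, max_iter=1000):
    """Return (x, iterations) with ||b - A x|| <= tol * ||b|| on convergence."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)          # first search direction
    rr = dot(r, r)
    bb = dot(b, b)
    if bb == 0.0:
        return x, 0      # b = 0 has the trivial solution x = 0
    for k in range(max_iter):
        if rr <= tol * tol * bb:
            return x, k
        Ap = matvec(A, p)                                  # operator application
        alpha = rr / dot(p, Ap)                            # reduction
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # vector update
        r = [ri - alpha * api for ri, api in zip(r, Ap)]   # vector update
        rr_new = dot(r, r)                                 # reduction
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]       # new search direction
        rr = rr_new
    return x, max_iter


# Hypothetical 2x2 example system.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iterations = conjugate_gradient(A, b)
print(x, iterations)
```

In a vectorized and parallel code, the vector updates and the operator application are the natural targets for SIMD intrinsics and threading, while the dot products require a reduction across threads and processes.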
Primary author
Dr
Hirokazu Kobayashi
(Intel K. K.)
Co-authors
Dr
Shinji Takeda
(Kanazawa University)
Dr
Yoshifumi Nakamura
(RIKEN AICS)
Prof.
Yoshinobu Kuramashi
(University of Tsukuba / RIKEN AICS)