AVX Acceleration of DD Arithmetic Between a Sparse Matrix and Vector
Citation (BibTeX)
@inproceedings{hishinuma2013ppam,
  title        = {AVX acceleration of DD arithmetic between a sparse matrix and vector},
  author       = {Hishinuma, Toshiaki and Fujii, Akihiro and Tanaka, Teruo and Hasegawa, Hidehiko},
  booktitle    = {International Conference on Parallel Processing and Applied Mathematics},
  pages        = {622--631},
  year         = {2013},
  organization = {Springer}
}
Abstract
High precision arithmetic can improve the convergence of Krylov subspace methods; however, it is very costly. One form of high precision arithmetic is double-double (DD) arithmetic, which uses more than 20 double precision operations for a single DD operation. We accelerated DD arithmetic using AVX SIMD instructions. With 4 threads, vector operations reach 51-59% of peak performance when the data fit in cache, and are bounded by memory access speed otherwise. For SpMV, we used a double precision sparse matrix A with a DD vector x to reduce memory access, achieving 17-41% of peak performance with padding applied at execution time. For transposed SpMV, we achieved 9-33% of peak performance. In these cases, performance was not bounded by memory access.
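
To make the per-operation cost concrete, below is a minimal C sketch of the standard DD building blocks (Knuth's TwoSum, the QuickTwoSum variant, and a DD + DD addition composed from them), plus an AVX version of TwoSum written with intrinsics. The function names (two_sum, quick_two_sum, dd_add, two_sum_avx) and the demo are illustrative assumptions, not the authors' actual kernels; the AVX routine simply applies the same data flow to 4 doubles per instruction, which is why SIMD pays off despite the high flop count per DD operation.

/* Sketch of double-double (DD) arithmetic building blocks.
 * Compile with, e.g., gcc -mavx -O2 dd_sketch.c
 * Requires strict IEEE 754 semantics (do NOT use -ffast-math). */
#include <stdio.h>
#include <immintrin.h>

typedef struct { double hi, lo; } dd_t;

/* Knuth's TwoSum: s = fl(a+b), e = exact rounding error (6 flops). */
static void two_sum(double a, double b, double *s, double *e) {
    *s = a + b;
    double v = *s - a;
    *e = (a - (*s - v)) + (b - v);
}

/* QuickTwoSum: same result in 3 flops, valid only when |a| >= |b|. */
static void quick_two_sum(double a, double b, double *s, double *e) {
    *s = a + b;
    *e = b - (*s - a);
}

/* DD = DD + DD: 20 double precision flops in this accurate variant,
 * which matches the cost cited in the abstract. */
static dd_t dd_add(dd_t a, dd_t b) {
    double s, e, t, f;
    two_sum(a.hi, b.hi, &s, &e);
    two_sum(a.lo, b.lo, &t, &f);
    e += t;
    quick_two_sum(s, e, &s, &e);
    e += f;
    quick_two_sum(s, e, &s, &e);
    return (dd_t){ s, e };
}

/* AVX TwoSum: identical data flow applied to 4 doubles per instruction. */
void two_sum_avx(__m256d a, __m256d b, __m256d *s, __m256d *e) {
    *s = _mm256_add_pd(a, b);
    __m256d v = _mm256_sub_pd(*s, a);
    *e = _mm256_add_pd(_mm256_sub_pd(a, _mm256_sub_pd(*s, v)),
                       _mm256_sub_pd(b, v));
}

int main(void) {
    dd_t x = { 1.0, 1e-30 };   /* a value with bits below double precision */
    dd_t y = { 1e-16, 0.0 };
    dd_t z = dd_add(x, y);
    printf("hi = %.17g, lo = %.17g\n", z.hi, z.lo);
    return 0;
}

The same splitting idea explains the paper's SpMV layout: storing the matrix A in double while keeping the vectors in DD roughly halves the matrix traffic, which is what moves SpMV from memory-bound to compute-bound in the performance figures above.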