Adaptive Lossy Data Compression Extended Architecture for Memory Bandwidth Conservation in SpMV
- HU Siyi (The University of Tokyo)
- ITO Makiko (Fujitsu Ltd.)
- YOSHIKAWA Takahide (Fujitsu Ltd.)
- HE Yuan (Keio University)
- NAKAMURA Hiroshi (The University of Tokyo)
- KONDO Masaaki (Keio University; RIKEN)
Bibliographic Information
- Published: 2023-12-01
- DOI: 10.1587/transinf.2023pap0008
- Publisher: The Institute of Electronics, Information and Communication Engineers
Description
<p>Widely adopted by machine learning and graph processing applications, sparse matrix-vector multiplication (SpMV) is a very popular algorithm in linear algebra. This is especially the case for fully-connected MLP layers, which dominate many SpMV computations and play a substantial role in diverse services. As a consequence, a large fraction of data center cycles is spent on SpMV kernels. Meanwhile, despite efficient storage formats for sparse data (such as CSR or CSC), SpMV kernels still suffer from limited memory bandwidth during data transfer because of the memory hierarchy of modern computing systems. In more detail, we find that both integer and floating-point data used in SpMV kernels are handled plainly, without any pre-processing. Therefore, we believe bandwidth conservation techniques, such as data compression, may dramatically help SpMV kernels when data is transferred between the main memory and the Last Level Cache (LLC). Furthermore, we also observe that convergence conditions in some typical scientific computation benchmarks (based on SpMV kernels) are not degraded when lower-precision floating-point data is adopted. Based on these findings, in this work, we propose a simple yet effective data compression scheme that can be extended to general-purpose computing architectures or, preferably, HPC systems. When it is adopted, a best-case speedup of 1.92x is achieved. In addition, evaluations with both the CG kernel and the PageRank algorithm indicate that our proposal introduces negligible overhead in terms of both convergence speed and the accuracy of final results.</p>
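As a rough illustration of the ideas in the abstract, the following minimal sketch runs SpMV over a CSR matrix twice: once with 64-bit values and once with the same values rounded to 32-bit floats. It is not the authors' compression scheme, which this record does not describe. The names `to_float32` and `spmv_csr`, the 3x3 example matrix, and the choice of float32 as the "lower precision" format are all assumptions made for illustration.

```python
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Non-zeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Assumed example matrix:
# [[4.0, 0.0, 0.1],
#  [0.0, 3.0, 0.0],
#  [0.2, 0.0, 5.0]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 0.1, 3.0, 0.2, 5.0]
x = [1.0, 2.0, 3.0]

y_full = spmv_csr(row_ptr, col_idx, values, x)

# Lossy step (assumption): store each non-zero as float32 instead of float64.
values_lossy = [to_float32(v) for v in values]
y_lossy = spmv_csr(row_ptr, col_idx, values_lossy, x)

# Largest relative deviation introduced by the reduced precision.
max_rel_err = max(abs(a - b) / abs(a) for a, b in zip(y_full, y_lossy))

# Bytes needed for the value array alone in each representation.
bytes_full = 8 * len(values)
bytes_lossy = 4 * len(values)
```

In this toy case the value array shrinks to half its size while the result changes only by a small relative error. Whether such a trade-off is acceptable for a given solver depends on its convergence criteria, which is the question the abstract evaluates with CG and PageRank.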
Journal
- IEICE Transactions on Information and Systems, E106.D (12), pp. 2015-2025, 2023-12-01
Details
- CRID: 1390861305860374784
- ISSN: 17451361, 09168532
- Text Lang: en
- Data Source: JaLC, Crossref, OpenAIRE
- Abstract License Flag: Disallowed

