A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU
-
- KAWAKAMI Kentaro
- Fujitsu Limited
-
- KURIHARA Kouji
- Fujitsu Limited
-
- YAMAZAKI Masafumi
- Fujitsu Limited
-
- HONDA Takumi
- Fujitsu Limited
-
- FUKUMOTO Naoto
- Fujitsu Limited
Abstract
<p>To accelerate deep learning (DL) processing on the supercomputer Fugaku, the authors have ported and optimized oneDNN for Fugaku's CPU, the Fujitsu A64FX. oneDNN is an open-source DL processing library developed by Intel for the x86_64 architecture, whereas the A64FX is based on the Armv8-A architecture. oneDNN dynamically generates the executable code for its computation kernels, which are implemented at the granularity of x86_64 instructions using Xbyak, a Just-In-Time (JIT) assembler for the x86_64 architecture. Porting oneDNN to the A64FX would ordinarily require rewriting these kernels in Armv8-A instructions using Xbyak_aarch64, a JIT assembler for the Armv8-A architecture. This is challenging because the code to be rewritten exceeds several tens of thousands of lines. This study presents Xbyak_translator_aarch64, a binary translator that converts dynamically generated x86_64 executable code into Armv8-A executable code at runtime. Xbyak_translator_aarch64 eliminates the need to rewrite the source code and allows oneDNN to be ported to the A64FX quickly.</p>
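The translation idea in the abstract can be illustrated with a toy sketch. It is not the Xbyak_translator_aarch64 implementation, which operates on binary machine code at runtime; this example works on symbolic instruction tuples and emits assembly text. The register map `REG_MAP`, the scratch register `SCRATCH`, and the functions `translate` and `translate_block` are hypothetical names introduced only for illustration. It shows one general point: x86_64 arithmetic instructions may take a memory operand and overwrite their first operand, while AArch64 is a load/store architecture with three-operand arithmetic, so a single x86_64 instruction can expand into several AArch64 instructions.

```python
# Toy illustration only. The register assignment below is an assumption,
# not the mapping used by the real translator.
REG_MAP = {"rax": "x0", "rbx": "x1", "rcx": "x2", "rdx": "x3"}
SCRATCH = "x9"  # assumed to be a free temporary register


def translate(insn):
    """Translate one (mnemonic, dst, src) tuple into AArch64 assembly lines.

    src is either a register name or a (base_register, displacement) tuple
    standing for an x86_64 memory operand such as [rcx+8].
    """
    op, dst, src = insn
    d = REG_MAP[dst]
    out = []
    if isinstance(src, tuple):
        base, disp = src
        if op == "mov":
            # A load maps directly onto ldr.
            return [f"ldr {d}, [{REG_MAP[base]}, #{disp}]"]
        # AArch64 arithmetic cannot read memory: load into a scratch register first.
        out.append(f"ldr {SCRATCH}, [{REG_MAP[base]}, #{disp}]")
        s = SCRATCH
    else:
        s = REG_MAP[src]
    if op == "mov":
        out.append(f"mov {d}, {s}")
    elif op in ("add", "sub"):
        # x86_64 "op dst, src" means dst = dst op src.
        out.append(f"{op} {d}, {d}, {s}")
    else:
        raise NotImplementedError(op)
    return out


def translate_block(insns):
    """Translate a sequence of instruction tuples, preserving order."""
    return [line for insn in insns for line in translate(insn)]


if __name__ == "__main__":
    block = [("mov", "rax", "rbx"), ("add", "rax", ("rcx", 8))]
    for line in translate_block(block):
        print(line)
```

In this sketch, `add rax, [rcx+8]` becomes a load followed by a register-only add, which is the kind of one-to-many expansion a translator between these two architectures has to handle.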
Published in
-
- IEICE Transactions on Electronics
-
IEICE Transactions on Electronics E105.C (6), 222-231, 2022-06-01
The Institute of Electronics, Information and Communication Engineers (IEICE)
Details
-
- CRID
- 1390010776368819968
-
- NII Article ID
- 130008124376
-
- ISSN
- 1745-1353
- 0916-8524
-
- Text language code
- en
-
- Data source type
-
- JaLC
- Crossref
- CiNii Articles
-
- Abstract license flag
- Disallowed