Compiling ONNX Neural Network Model Using MLIR


Abstract

Neural network models are becoming popular and are used in various tasks such as computer vision, speech recognition, and natural language processing. It is often the case that the training phase of a model is done in one environment, while the inference phase is executed in another, because the optimization characteristics of the two phases are largely different. As a result, it is critical to efficiently compile a trained model for inference on different environments. To represent neural network models, users often use ONNX, an open standard format for machine learning interoperability. We are developing a framework for compiling an ONNX model into a standalone binary that is executable on different target hardware such as x86, IBM Power, and IBM Z. The framework is written using MLIR, a modern compiler infrastructure for multi-level intermediate representations. In particular, we introduce two internal representations: ONNX IR for representing ONNX operators, and Kernel IR as an intermediate representation for efficiently lowering ONNX operators into LLVM bitcode. In this presentation, we discuss the overall structure of the framework and show some practical examples of converting ONNX operators and models. We also cover several issues related to endianness.
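The endianness issue mentioned in the abstract can be illustrated with a minimal sketch (this is a generic illustration, not code from the framework): float32 weights serialized on a little-endian x86 host must have the byte order of each element swapped before a big-endian target such as IBM Z can read the same values.

```python
import struct

# A small float32 tensor as serialized on a little-endian x86 host.
le_bytes = struct.pack("<3f", 1.0, 2.0, 3.0)

# The same values as a big-endian target such as IBM Z expects them.
be_bytes = struct.pack(">3f", 1.0, 2.0, 3.0)

def swap_f32(buf: bytes) -> bytes:
    """Reverse the byte order of each 4-byte float32 element."""
    return b"".join(buf[i:i + 4][::-1] for i in range(0, len(buf), 4))

# Swapping the little-endian buffer yields the big-endian encoding,
# so the target reads back the original values.
assert swap_f32(le_bytes) == be_bytes
assert struct.unpack(">3f", swap_f32(le_bytes)) == (1.0, 2.0, 3.0)
```

A compiler embedding model constants into a standalone binary must apply this kind of conversion (or emit target-order constants directly) whenever the build host and the target disagree on byte order.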

Details

  • CRID
    1051131408409418880
  • NII Article ID
    170000184305
  • NII Bibliographic ID
    AA11464814
  • ISSN
    18827802
  • Web Site
    http://id.nii.ac.jp/1001/00209240/
  • Text Language Code
    en
  • Material Type
    article
  • Data Source Type
    • IRDB
    • CiNii Articles
