Real-Time and Energy-Efficient Inference at GPU-Based Network Edge using PON

  • Yukito Onodera
    Tokyo University of Agriculture and Technology, Department of Computer and Information Sciences, Tokyo, Japan
  • Yoshiaki Inoue
    Graduate School of Engineering, Osaka University, Osaka, Japan
  • Daisuke Hisano
    Graduate School of Engineering, Osaka University, Osaka, Japan
  • Yu Nakayama
    Tokyo University of Agriculture and Technology, Department of Computer and Information Sciences, Tokyo, Japan

Description

In recent years, advances in deep learning (DL) have greatly improved research and services related to artificial intelligence (AI). In particular, real-time object recognition has become an important technology for smart cities. Achieving it requires both low-cost network deployment and low-latency data transfer. In this paper, we focus on inference systems based on Time- and Wavelength-Division Multiplexed Passive Optical Networks (TWDM-PON), which allow cost-efficient networks that accommodate many network cameras. A significant issue for a GPU-based inference system over TWDM-PON is allocating upstream wavelengths and bandwidth optimally so that inference can run in real time. However, increasing the batch size of the data arriving at edge servers while still ensuring low-latency transmission has not yet been considered. Therefore, this paper proposes a concept of an inference system in which a large number of cameras periodically upload image data to a GPU-based server via TWDM-PON. We also propose a cooperative wavelength and bandwidth allocation algorithm that ensures low-latency, time-synchronized data arrival at the edge. The performance of the proposed scheme is verified through computer simulation.
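
The abstract does not specify the allocation algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It shows why joint wavelength and time-slot assignment affects when a full batch is available at the GPU server. It uses a generic greedy longest-first heuristic that places each camera's frame on the least-loaded upstream wavelength. The names `Grant`, `allocate`, and `batch_ready_time`, the frame sizes, and the 10 Gbit/s per-wavelength rate are all assumptions made for illustration; propagation delay and guard times are ignored.

```python
from dataclasses import dataclass


@dataclass
class Grant:
    camera: int        # hypothetical camera (ONU) index
    wavelength: int    # upstream wavelength index
    start_s: float     # transmission start within the cycle, in seconds
    end_s: float       # transmission end within the cycle, in seconds


def allocate(frame_bits, num_wavelengths, rate_bps):
    """Greedy longest-first placement onto the least-loaded wavelength.

    This is a generic load-balancing heuristic used for illustration only;
    it is not the algorithm proposed in the paper.
    """
    busy_until = [0.0] * num_wavelengths  # time at which each wavelength is free
    grants = []
    # Place the largest frames first so that small ones can even out the load.
    order = sorted(range(len(frame_bits)), key=lambda i: frame_bits[i], reverse=True)
    for cam in order:
        w = min(range(num_wavelengths), key=lambda k: busy_until[k])
        start = busy_until[w]
        end = start + frame_bits[cam] / rate_bps
        busy_until[w] = end
        grants.append(Grant(cam, w, start, end))
    return grants


def batch_ready_time(grants):
    """The batch can be formed only when the last frame has arrived."""
    return max(g.end_s for g in grants)


# Assumed example: six cameras, two wavelengths, 10 Gbit/s per wavelength.
frame_bits = [8e6, 4e6, 4e6, 2e6, 2e6, 4e6]
grants = allocate(frame_bits, num_wavelengths=2, rate_bps=10e9)
ready = batch_ready_time(grants)
```

In this toy setting, the closer the per-wavelength finishing times are to each other, the sooner the whole batch is ready for inference, which is the intuition behind time-synchronized arrival at the edge.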
