Face Reenactment with Diffusion Model and Its Application to Video Compression
-
- IUCHI Wataru
- The University of Tokyo
-
- UMEDA Yuya
- The University of Tokyo
-
- HARADA Kazuaki
- The University of Tokyo
-
- YUNOKI Hayato
- The University of Tokyo
-
- MUKAI Koki
- The University of Tokyo
-
- YOSHIDA Shun
- The University of Tokyo
-
- YAMASAKI Toshihiko
- The University of Tokyo
Bibliographic Information
- Other Title
-
- 拡散モデルによる顔画像の再構成と動画圧縮への応用 (Face image reconstruction with diffusion models and its application to video compression)
Abstract
With the advancement of information technology, the use of images and videos has become commonplace. However, the capacity of storage devices and communication bandwidth is finite, so the demand for compression has been increasing. In addition to conventional frequency-based compression, deep-learning-based compression methods such as generative adversarial networks (GANs) have emerged in recent years. With the existing FaR-GAN, a face image with a given facial expression can be reconstructed from a reference face image of a person and the coordinates of 68 landmarks representing that expression, which can be used for efficient face image compression. However, such existing methods have problems in terms of reconstruction accuracy and smoothness between frames. In this study, we propose a method that recurrently reconstructs each image from the previous frame using a diffusion model for smooth inter-frame representation, while optimizing the trade-off between person identity preservation and facial expression generation in diffusion-model-based face image reconstruction.
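The abstract's motivation for landmark-based face video compression can be made concrete with some back-of-the-envelope arithmetic: per frame, the sender transmits only the 68 landmark coordinates (plus a one-time reference image), not the pixels. The sketch below is illustrative only, not the paper's actual codec; the frame size and coordinate precision are assumptions.

```python
# Illustrative payload estimate for landmark-based face video coding.
# Assumption: 68-point facial landmark layout (as used by FaR-GAN),
# uint16 pixel coordinates, and a 256x256 RGB face crop as the baseline.

NUM_LANDMARKS = 68      # standard 68-point facial landmark set
BYTES_PER_COORD = 2     # uint16 is enough for pixel coordinates

def landmark_payload_bytes(num_landmarks: int = NUM_LANDMARKS,
                           bytes_per_coord: int = BYTES_PER_COORD) -> int:
    """Bytes needed to transmit one frame's landmarks (x and y per point)."""
    return num_landmarks * 2 * bytes_per_coord

def raw_frame_bytes(width: int = 256, height: int = 256,
                    channels: int = 3) -> int:
    """Bytes for an uncompressed 8-bit RGB frame of the same face crop."""
    return width * height * channels

payload = landmark_payload_bytes()   # 68 * 2 * 2 = 272 bytes per frame
frame = raw_frame_bytes()            # 256 * 256 * 3 = 196608 bytes per frame
print(payload, frame, round(frame / payload))  # 272 196608 723
```

Even against a compressed video stream rather than raw frames, the per-frame landmark payload stays orders of magnitude smaller, which is why reconstruction quality and inter-frame smoothness, the problems this paper targets, become the binding constraints rather than bitrate.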
Journal
-
- Proceedings of the Annual Conference of JSAI
-
JSAI2023 (0), 3D5GS203-3D5GS203, 2023
The Japanese Society for Artificial Intelligence
Details
-
- CRID
- 1390859758174690944
-
- ISSN
- 2758-7347
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
-
- Abstract License Flag
- Disallowed