Natural Text-Driven, Multi-Attribute Editing of Facial Images with Robustness in Sparse Latent Space


Description

With the development of GANs and the introduction of powerful models such as StyleGAN, text-driven image editing and image generation have made great progress in recent years; however, the task of generating diverse images of a specific person under textual guidance remains underexplored. This paper combines two pre-trained models, CLIP and StyleGAN2, to carry out a preliminary exploration of this task. The latent code of the input portrait is edited and manipulated in the StyleGAN latent space by a CLIP-based text-driven module. Good results are achieved even in sparse regions of the generator's latent space and when multiple attributes are edited at the same time.
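As a rough illustration of how a CLIP-based text-driven module can steer a StyleGAN2 latent code, the sketch below optimizes a W+ code against a text prompt using a CLIP similarity loss plus a locality regularizer that keeps the edit close to the original code. This is a generic StyleCLIP-style latent optimization written under stated assumptions, not the authors' implementation: the generator G (mapping a W+ code to an RGB image in [-1, 1]) and the inverted code w_init are placeholders assumed to come from an external inversion pipeline.

import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP package

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()  # avoid fp16/fp32 dtype mismatch on GPU

# CLIP's expected input normalization statistics
CLIP_MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
CLIP_STD = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

def edit_latent(G, w_init, prompt, steps=200, lr=0.05, lambda_id=0.005):
    """Move a W+ latent code toward a text description while staying close
    to the original code (a simple locality/identity regularizer)."""
    text_tokens = clip.tokenize([prompt]).to(device)
    with torch.no_grad():
        text_feat = clip_model.encode_text(text_tokens)
        text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

    w = w_init.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([w], lr=lr)

    for _ in range(steps):
        img = G(w)  # assumed: W+ code -> RGB image in [-1, 1], shape (N, 3, H, W)
        img = F.interpolate(img, size=224, mode="bilinear", align_corners=False)
        img = ((img + 1.0) / 2.0 - CLIP_MEAN) / CLIP_STD  # rescale and normalize for CLIP
        img_feat = clip_model.encode_image(img)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)

        clip_loss = 1.0 - (img_feat * text_feat).sum(dim=-1).mean()  # cosine distance to prompt
        id_loss = ((w - w_init) ** 2).mean()                         # stay near the inverted code
        loss = clip_loss + lambda_id * id_loss

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return w.detach()

# Multi-attribute editing can be expressed as a single combined prompt, e.g.:
# w_edit = edit_latent(G, w_init, "a smiling woman with blonde hair")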


Details

  • CRID
    1390014868195782016
  • NII Bibliographic ID
    AA12746425
  • DOI
    10.15002/00026277
  • HANDLE
    10114/00026277
  • ISSN
    24321192
  • Text Language Code
    en
  • Material Type
    departmental bulletin paper
  • Data Source Type
    • JaLC
    • IRDB
  • Abstract License Flag
    Available

