Natural Text-Driven, Multi-Attribute Editing of Facial Images with Robustness in Sparse Latent Space

Description

With the development of GANs and the introduction of powerful models such as StyleGAN, text-driven image editing and generation have made great progress in recent years, yet the task of generating diverse images of a specific person under the guidance of text remains under-explored. This paper combines two pre-trained models, CLIP and StyleGAN2, to conduct a preliminary exploration of this task. The latent code of an input portrait is edited and manipulated in the StyleGAN latent space by a CLIP-based text-driven module. Good results are obtained even in sparse regions of the generator's latent space and when multiple attributes are edited at the same time.
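The description does not spell out the text-driven module, but the pairing of CLIP and StyleGAN2 suggests optimising the portrait's latent code against a CLIP text prompt, in the spirit of StyleCLIP-style latent optimisation. The sketch below illustrates that idea under stated assumptions: OpenAI's `clip` package, a pre-trained StyleGAN2 generator exposing `synthesis(w)` (as in the official stylegan2-ada-pytorch code), and hypothetical helpers `load_pretrained_stylegan2` and `get_inverted_latent` for loading the generator and inverting the input portrait. The prompt, loss weights, and step count are illustrative, not the paper's exact settings.

```python
# Minimal sketch of CLIP-guided latent optimisation in StyleGAN2's W+ space.
# Not the authors' implementation; generator loading, inversion, and
# hyper-parameters are assumptions for illustration.
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()
clip_model.requires_grad_(False)  # only the latent code is optimised

# Assumed: a pre-trained StyleGAN2 generator with synthesis(w) -> image in [-1, 1],
# and a GAN-inversion step that maps the input portrait to a W+ latent code.
G = load_pretrained_stylegan2()          # hypothetical helper
w = get_inverted_latent(portrait_image)  # hypothetical helper, shape [1, num_ws, 512]
w = w.clone().detach().to(device).requires_grad_(True)

# A single prompt can describe several attributes, enabling multi-attribute edits.
prompts = ["a smiling person with blonde hair wearing glasses"]
with torch.no_grad():
    text_feat = F.normalize(
        clip_model.encode_text(clip.tokenize(prompts).to(device)), dim=-1)

clip_mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
clip_std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

w_init = w.detach().clone()
opt = torch.optim.Adam([w], lr=0.05)

for step in range(200):
    img = G.synthesis(w)                              # generator output in [-1, 1]
    img = (img.clamp(-1, 1) + 1) / 2                  # map to [0, 1]
    img = F.interpolate(img, size=224, mode="bicubic", align_corners=False)
    img = (img - clip_mean) / clip_std                # CLIP preprocessing

    img_feat = F.normalize(clip_model.encode_image(img), dim=-1)
    clip_loss = (1 - (img_feat * text_feat).sum(dim=-1)).mean()  # cosine distance to the prompt
    reg_loss = ((w - w_init) ** 2).mean()             # keep the edit close to the inversion

    loss = clip_loss + 0.01 * reg_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Optimising in W+ (one latent per generator layer) rather than a single w gives the prompt room to change several attributes at once, which matches the multi-attribute setting described above; the L2 regulariser toward the inverted latent is one simple way to preserve the subject's identity during the edit.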

Details

  • CRID
    1390014868195782016
  • NII Book ID
    AA12746425
  • DOI
    10.15002/00026277
  • HANDLE
    10114/00026277
  • ISSN
    24321192
  • Text Lang
    en
  • Article Type
    departmental bulletin paper
  • Data Source
    • JaLC
    • IRDB
  • Abstract License Flag
    Allowed
