Bibliographic information
- Alternative title
- Removing Partial Mismatches in Unsupervised Image Captioning
Description
<p>Unsupervised image captioning is a task to describe images without the supervision of image–sentence pairs. With the support of pre-trained object detectors, previous work assigned pseudo-captions, i.e., sentences that contain the detected object labels, to a given image. They focused on aligning the pseudo-captions with input images at the sentence level. However, pseudo-captions contain many words that are irrelevant to a given image. To shed light on the problem of partial mismatches between images and pseudo-captions, we focus on removing mismatched words from image–sentence alignment. We propose a simple gating mechanism that is trained to align image features with only the most reliable words in pseudo-captions: the detected object labels. The superior performance of our method empirically demonstrates the importance of removing the partial mismatches. Detailed analysis elucidates that our method successfully improves its performance in predicting the words likely to be mismatched during training. Furthermore, we show that using our method as an initialization method significantly boosts the performance of the previous sentence-level alignment method. These results confirm the importance of careful alignment in word-level details.</p>
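Published in
- 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
人工知能学会論文誌 37 (2), H-L82_1-12, 2022-03-01
一般社団法人 人工知能学会 (The Japanese Society for Artificial Intelligence)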
Details
- CRID
- 1390291767464072320
- NII Article ID
- 130008166003
- ISSN
- 13468030
- 13460714
- Text language code
- ja
- Data source type
- JaLC
- Crossref
- CiNii Articles
- Abstract license flag
- Not permitted