Removing Partial Mismatches in Pseudo-Supervised Caption Generation (擬似教師ありキャプション生成における部分的不一致の除去)

本多 右京, 橋本 敦史, 渡辺 太郎, 松本 裕治

doi:10.1527/tjsai.37-2_h-l82

Bibliographic information

Alternative title

Removing Partial Mismatches in Unsupervised Image Captioning

Abstract

Unsupervised image captioning is the task of describing images without the supervision of image–sentence pairs. With the support of pre-trained object detectors, previous work assigned pseudo-captions, i.e., sentences that contain the detected object labels, to a given image, and focused on aligning the pseudo-captions with input images at the sentence level. However, pseudo-captions contain many words that are irrelevant to a given image. To shed light on the problem of partial mismatches between images and pseudo-captions, we focus on removing mismatched words from image–sentence alignment. We propose a simple gating mechanism that is trained to align image features with only the most reliable words in pseudo-captions: the detected object labels. The superior performance of our method empirically demonstrates the importance of removing the partial mismatches. Detailed analysis shows that our method improves the prediction of words that are likely to be mismatched during training. Furthermore, we show that using our method as an initialization significantly boosts the performance of the previous sentence-level alignment method. These results confirm the importance of careful alignment at the word level.
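
The distinction the abstract draws between reliable words (detected object labels) and the remaining words of a pseudo-caption can be illustrated with a minimal sketch. This is not the authors' gating mechanism, whose architecture the abstract does not describe; it only shows how pseudo-captions might be selected and how a word-level mask over object labels could be computed. The function names `tokenize`, `assign_pseudo_captions`, and `reliable_word_mask`, the toy corpus, the whitespace tokenization, and the rule that a sentence must contain all detected labels are assumptions made for illustration.

```python
def tokenize(sentence):
    """Lowercase whitespace tokenization with edge punctuation stripped (an assumption)."""
    tokens = (tok.strip(".,!?").lower() for tok in sentence.split())
    return [tok for tok in tokens if tok]


def assign_pseudo_captions(detected_labels, corpus):
    """Return corpus sentences that contain every detected object label.

    Requiring all labels (rather than any) is an illustrative choice.
    """
    labels = {label.lower() for label in detected_labels}
    return [s for s in corpus if labels <= set(tokenize(s))]


def reliable_word_mask(pseudo_caption, detected_labels):
    """Pair each token with True if it is a detected object label, else False.

    Tokens marked False are the ones that may not match the image.
    """
    labels = {label.lower() for label in detected_labels}
    return [(tok, tok in labels) for tok in tokenize(pseudo_caption)]


# Toy data (hypothetical): a sentence corpus and labels from an object detector.
corpus = [
    "A dog chases a frisbee on the beach.",
    "A cat sleeps on a sofa.",
    "A dog sits next to a frisbee in a park.",
]
detected = ["dog", "frisbee"]

captions = assign_pseudo_captions(detected, corpus)
mask = reliable_word_mask(captions[0], detected)
```

In this toy case both dog sentences qualify as pseudo-captions, yet words such as "beach" or "park" are not backed by any detection. Those unmasked words correspond to the partial mismatches that the abstract proposes to exclude from image–sentence alignment.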

Published in

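Transactions of the Japanese Society for Artificial Intelligence

Transactions of the Japanese Society for Artificial Intelligence 37 (2), H-L82_1-12, 2022-03-01

The Japanese Society for Artificial Intelligence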

Details

CRID: 1390291767464072320

NII Article ID: 130008166003

DOI: 10.1527/tjsai.37-2_h-l82

ISSN: 13468030; 13460714

Web Site: https://www.jstage.jst.go.jp/article/tjsai/37/2/37_37-2_H-L82/_pdf

Text language code: ja

Data source type

JaLC
Crossref
CiNii Articles

Abstract license flag: use not permitted
