Removing Partial Mismatches in Unsupervised Image Captioning

Bibliographic Information

Other Title
  • Removing Partial Mismatches in Pseudo-Supervised Caption Generation (擬似教師ありキャプション生成における部分的不一致の除去)

Abstract

Unsupervised image captioning is the task of describing images without the supervision of image–sentence pairs. With the support of pre-trained object detectors, previous work assigned pseudo-captions, i.e., sentences that contain the detected object labels, to a given image, and focused on aligning the pseudo-captions with input images at the sentence level. However, pseudo-captions contain many words that are irrelevant to the given image. To shed light on this problem of partial mismatches between images and pseudo-captions, we focus on removing mismatched words from image–sentence alignment. We propose a simple gating mechanism that is trained to align image features with only the most reliable words in pseudo-captions: the detected object labels. The superior performance of our method empirically demonstrates the importance of removing partial mismatches. Detailed analysis shows that our method improves performance in predicting the words likely to be mismatched during training. Furthermore, we show that using our method as an initialization method significantly boosts the performance of the previous sentence-level alignment method. These results confirm the importance of careful word-level alignment.
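The abstract describes a word-level gating mechanism that aligns image features only with the most reliable words in a pseudo-caption, the detected object labels. The sketch below is a minimal illustration of that idea under stated assumptions, not the paper's implementation: the class name `WordLevelGate`, the feature dimensions, the dot-product similarity, and the binary-cross-entropy gate supervision are all introduced here for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WordLevelGate(nn.Module):
    """Toy word-level gate for image / pseudo-caption alignment.

    A per-word gate predicts how reliable each word of a pseudo-caption is.
    During training the gate is pushed to open only at detected object
    labels, so words irrelevant to the image contribute little to the
    image-sentence alignment score.
    """

    def __init__(self, img_dim: int, word_dim: int, joint_dim: int):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, joint_dim)
        self.word_proj = nn.Linear(word_dim, joint_dim)
        self.gate = nn.Linear(word_dim, 1)  # per-word reliability gate

    def forward(self, img_feat, word_embs):
        # img_feat: (B, img_dim); word_embs: (B, T, word_dim)
        g = torch.sigmoid(self.gate(word_embs)).squeeze(-1)      # (B, T) gate values in (0, 1)
        img = F.normalize(self.img_proj(img_feat), dim=-1)       # (B, D)
        words = F.normalize(self.word_proj(word_embs), dim=-1)   # (B, T, D)
        sims = torch.einsum("bd,btd->bt", img, words)            # per-word image similarity
        # Gated, length-normalized sentence-level alignment score.
        score = (g * sims).sum(dim=1) / g.sum(dim=1).clamp(min=1e-6)
        return score, g

    @staticmethod
    def gate_loss(g, is_object_label):
        # Supervise the gate to open only at detected-object-label positions.
        return F.binary_cross_entropy(g, is_object_label.float())


if __name__ == "__main__":
    model = WordLevelGate(img_dim=2048, word_dim=300, joint_dim=512)
    img = torch.randn(4, 2048)              # e.g. detector/CNN image features
    caption = torch.randn(4, 12, 300)       # word embeddings of a pseudo-caption
    labels = torch.zeros(4, 12)
    labels[:, :3] = 1.0                     # assume the first 3 words are object labels
    score, gate = model(img, caption)
    # Stand-in objective: maximize gated alignment, supervise the gate.
    loss = -score.mean() + model.gate_loss(gate, labels)
    print(score.shape, gate.shape, float(loss))
```

In this sketch the alignment score averages only over words the gate keeps open, so a mismatched word with a low gate value barely affects the score; the paper's actual architecture and training objectives should be taken from the publication itself.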

