Gaze-Dependent Image Re-Ranking Technique for Enhancing Content-Based Image Retrieval

  • Yuhu Feng
    Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
  • Keisuke Maeda
    Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
  • Takahiro Ogawa
    Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
  • Miki Haseyama
    Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan

Description

Content-based image retrieval (CBIR) aims to find images similar to a query image provided by the user, and it is extensively used in real-world applications. Conventional CBIR methods do not consider user preferences, since they determine retrieval results solely from the visual similarity between the query and candidate images. As a result, a “semantic gap” arises: the model may fail to capture the intent underlying the user’s query image. In this article, we propose a re-ranking method for CBIR that uses the user’s gaze trace as interactive information to help the model infer the user’s inherent attention. The proposed method treats the gaze trace recorded over the images returned by the initial retrieval as the user’s preference information. We introduce image captioning to express the relationship between images and gaze information, generating image captions conditioned on the gaze trace. In this way, we transform the coordinate data into text and explicitly express the semantic information of the images. Finally, image retrieval is performed again using the generated gaze-dependent captions to obtain images that align more closely with the user’s preferences and interests. Experimental results on an open image dataset with corresponding gaze traces and human-generated descriptions demonstrate the effectiveness of the proposed method. Our method uses the user’s gaze as feedback to achieve user-oriented image retrieval.
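
To make the two-stage pipeline described above concrete, the following Python sketch outlines the flow: an initial visual retrieval, a gaze-dependent caption, and a caption-driven re-ranking. Every component here is an illustrative placeholder (a pseudo-random encoder and a label-joining "captioner"), not the authors' implementation, which relies on learned encoders and a gaze-conditioned image-captioning model.

import numpy as np

DIM = 128

def embed(item):
    # Placeholder encoder: a deterministic pseudo-random vector per item.
    # The actual method would use learned visual and text encoders.
    seed = abs(hash(str(item))) % (2**32)
    return np.random.default_rng(seed).standard_normal(DIM)

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def initial_retrieval(query_image, gallery, k=10):
    # Stage 1: conventional CBIR, ranking the gallery by visual similarity
    # to the query image alone (no user preference involved yet).
    q = embed(query_image)
    return sorted(gallery, key=lambda img: cosine(q, embed(img)), reverse=True)[:k]

def gaze_caption(gaze_trace):
    # Hypothetical stand-in for the gaze-conditioned captioning model: the
    # paper generates a caption focused on the regions the user fixated on;
    # here we simply join the labels of the fixated regions.
    return " ".join(label for _x, _y, label in gaze_trace)

def rerank(candidates, caption):
    # Stage 2: re-rank the initial candidates by similarity to the
    # gaze-dependent caption, so results follow the user's attention.
    c = embed(caption)
    return sorted(candidates, key=lambda img: cosine(c, embed(img)), reverse=True)

gallery = [f"img_{i:03d}.jpg" for i in range(100)]
trace = [(0.3, 0.4, "dog"), (0.5, 0.5, "frisbee")]  # (x, y, fixated label)
top = initial_retrieval("query.jpg", gallery)
print(rerank(top, gaze_caption(trace))[:3])

The key design point, as in the abstract, is that the gaze trace is converted into text before the second retrieval pass, so the re-ranking stage can reuse an ordinary caption-based similarity rather than raw coordinate data.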
