What is the Difference between Image Cognition in Humans and Computer Vision?

Bibliographic information

Original title
  • 人間の画像認識とコンピュータビジョンの画像認識はどこが違うのか

Abstract

<p>Computer vision models (CVMs) such as CNNs, Vision Transformer (ViT), and CLIP, pre-trained on very large amounts of training data, have very high image cognition performance. In our environmental cognition research using photos, we manually measured inter-photo visual similarity. Our previous study found that CVM-based photo similarity and this manually measured visual similarity were quite similar when compared using multidimensional scaling (MDS) of the photos. However, it also suggested that the difference in image cognition between humans and CVMs is related to human representation. Here we numerically investigate in detail the difference between CVM-based photo similarity and visual similarity, using six types of photo sets. The influence of representation could be evaluated by cluster size on the MDS map. The results show that representation influences the cognition of shrines and temples, foods, insects, buildings, greenery, garden styles, perspective views, night views, the symbol tree, and so on.</p>
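
The abstract does not give the computational details, so the following is only a minimal sketch of how a comparison of this kind might be set up, not the authors' procedure. It assumes that photo similarity is the cosine similarity between embedding vectors, that the map is obtained by classical (Torgerson) MDS, and that "cluster size" is the mean distance of a photo group from its centroid on the map. The random vectors stand in for CVM embeddings, and all function names are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between row vectors."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return unit @ unit.T

def classical_mds(dissimilarity, n_components=2):
    """Classical (Torgerson) MDS from a symmetric dissimilarity matrix."""
    n = dissimilarity.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to get a Gram-like matrix.
    gram = -0.5 * centering @ (dissimilarity ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues that arise from non-Euclidean input.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

def cluster_size(coords, labels, label):
    """Mean distance of one group's points from their centroid on the map."""
    points = coords[labels == label]
    return float(np.linalg.norm(points - points.mean(axis=0), axis=1).mean())

rng = np.random.default_rng(0)
# Assumption: random vectors stand in for CVM embeddings of two photo groups.
embeddings = np.vstack([
    rng.normal(loc=1.0, scale=0.1, size=(5, 16)),   # tightly grouped photos
    rng.normal(loc=-1.0, scale=0.5, size=(5, 16)),  # more varied photos
])
labels = np.array([0] * 5 + [1] * 5)

similarity = cosine_similarity_matrix(embeddings)
coords = classical_mds(1.0 - similarity)  # dissimilarity = 1 - similarity
sizes = {group: cluster_size(coords, labels, group) for group in (0, 1)}
print(sizes)
```

Running the same steps on a manually measured similarity matrix and comparing the per-group cluster sizes between the two maps would be one way to quantify where the human and CVM-based maps differ.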
