Abstract
<p>Explanation and interpretability for CNNs have received considerable interest in recent years. Because of the high computational cost and complexity of video data, the explanation of 3D video recognition CNNs is less studied. Moreover, existing 3D explanation methods are not able to produce a high-level explanation. In this paper, we provide a comprehensive introduction to a 3D explanation model that is not only capable of producing a human-understandable high-level explanation for 3D CNNs, but is also applicable to real-world applications. The Spatial-Temporal Concept-based Explanation (STCE) framework is composed of two steps: (1) the videos are segmented into multiple supervoxels, and similar supervoxels are clustered into a high-level concept; and (2) the interpreting framework calculates a score for each concept, with a high score indicating that the network gives the concept more attention. STCE's success in video recognition enables its application to real-world tasks, such as social relation atmosphere recognition.</p>
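The two steps named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the supervoxel features are taken as already extracted, clustering is plain k-means, and the concept score is simply the mean of a per-supervoxel importance value. The abstract does not say how STCE computes its scores, and the function names `cluster_supervoxels` and `concept_scores` are hypothetical.

```python
from math import dist
from statistics import mean


def cluster_supervoxels(features, k, iters=10):
    """Plain k-means over supervoxel feature vectors (assumed clustering rule).

    Returns one concept id per supervoxel.
    """
    centroids = [list(f) for f in features[:k]]  # deterministic init: first k vectors
    labels = [0] * len(features)
    for _ in range(iters):
        # Assign each supervoxel to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(f, centroids[c])) for f in features]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [f for f, l in zip(features, labels) if l == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


def concept_scores(labels, importances, k):
    """Mean importance of the supervoxels in each concept (stand-in scoring rule)."""
    scores = []
    for c in range(k):
        vals = [imp for imp, l in zip(importances, labels) if l == c]
        scores.append(mean(vals) if vals else 0.0)
    return scores


# Toy data (made up): four supervoxels with 2-D features and importance values.
features = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2), (5.1, 4.9)]
importances = [0.2, 0.9, 0.4, 0.7]
labels = cluster_supervoxels(features, k=2)
scores = concept_scores(labels, importances, k=2)
```

On this toy data the two supervoxels near (5, 5) form one concept, and that concept receives the higher score, which under the abstract's reading would mean the network attends to it more.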
Published in
- 日本画像学会誌 (Journal of the Imaging Society of Japan) 62 (6), 610-621, 2023-12-10
- 一般社団法人 日本画像学会 (The Imaging Society of Japan)
Details
- CRID: 1390861383235115264
- NII Bibliographic ID: AA1137305X
- ISSN: 18804675, 13444425
- NDL Bibliographic ID: 033225820
- Text language code: en
- Data source type: JaLC, NDL
- Abstract license flag: Not permitted