Video clustering using spatio-temporal image with fixed length

Description

To handle video media such as TV programs efficiently and effectively, a video stream must be segmented into video segments, and those segments must be structured according to their content. We focus on similarity, one of the important relations between video segments, and describe a method for clustering similar segments in a video stream. Conventional clustering methods are based on shots, but no fully reliable method for detecting shot boundaries has yet been established. Our method is instead based on fixed-length segments of the video stream, called video packets. We generate spatio-temporal images from these packets and use co-occurrence matrices to express features along the time dimension explicitly. In clustering experiments on actual TV programs, the method achieved a clustering accuracy of 81%.
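The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a frame is a 2D list of quantized gray levels, that a spatio-temporal image is built by stacking one horizontal scan line per frame, and that the co-occurrence matrix counts gray-level pairs at the same horizontal position in frames `dt` apart. All function names, the choice of scan line, and the parameters `packet_len`, `levels`, and `dt` are hypothetical.

```python
def split_into_packets(frames, packet_len):
    """Cut a frame sequence into consecutive fixed-length packets.

    Trailing frames that do not fill a whole packet are dropped (an assumption).
    """
    n = len(frames) // packet_len
    return [frames[i * packet_len:(i + 1) * packet_len] for i in range(n)]


def spatio_temporal_image(packet, row=None):
    """Stack one scan line per frame: rows are time, columns are x.

    Using the middle row of each frame is an illustrative choice only.
    """
    if row is None:
        row = len(packet[0]) // 2
    return [list(frame[row]) for frame in packet]


def temporal_cooccurrence(st_image, levels, dt=1):
    """Normalized counts of gray-level pairs (value at t, value at t + dt)."""
    counts = [[0.0] * levels for _ in range(levels)]
    for t in range(len(st_image) - dt):
        for a, b in zip(st_image[t], st_image[t + dt]):
            counts[a][b] += 1.0
    total = sum(sum(r) for r in counts)
    if total == 0:
        return counts
    return [[c / total for c in r] for r in counts]


# Toy data: four static 2x2 frames at gray level 1, cut into packets of 2 frames.
static_frames = [[[1, 1], [1, 1]] for _ in range(4)]
packets = split_into_packets(static_frames, packet_len=2)
static_matrix = temporal_cooccurrence(spatio_temporal_image(packets[0]), levels=2)

# Toy data: one 3-frame packet whose single row flickers between levels 0 and 1.
flicker_packet = [[[0, 0]], [[1, 1]], [[0, 0]]]
flicker_matrix = temporal_cooccurrence(spatio_temporal_image(flicker_packet), levels=2)
```

In this sketch a static packet concentrates all mass on the diagonal of the matrix, while a flickering packet puts it off the diagonal, which is one way such a matrix can make temporal change explicit. The resulting matrices could then be flattened into feature vectors and passed to any clustering algorithm; the abstract does not say which one the authors used.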
