-
- 高山 夏樹
- Graduate School of Informatics and Engineering, The University of Electro-Communications
-
- Gibran BENITEZ-GARCIA
- Graduate School of Informatics and Engineering, The University of Electro-Communications
-
- 高橋 裕樹
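- Graduate School of Informatics and Engineering, The University of Electro-Communications; Artificial Intelligence eXploration Research Center, The University of Electro-Communications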
Bibliographic Information
- Alternative title
-
- Sign Language Recognition Based on Spatial-Temporal Graph Convolution-Transformer
Abstract
This paper reports on sign language recognition based on human body part tracking. Tracking-based sign language recognition has practical advantages, such as robustness against variations in clothing and scene backgrounds. However, feature extraction in tracking-based recognition still leaves room for improvement. This paper presents a tracking-based continuous sign language word recognition method called the Spatial-Temporal Graph Convolution-Transformer (STGC-Transformer). Spatial-temporal graph convolution is employed to improve framewise feature extraction from tracking points, while the Transformer enables the model to recognize word sequences of arbitrary length. Besides the model design, the training strategy also affects recognition performance. In this study, the proposed method is trained with multi-task learning that combines connectionist temporal classification (CTC) and cross-entropy losses. This training strategy improved the recognition performance by a significant margin. The proposed method was evaluated statistically on a sign language video dataset consisting of 275 types of isolated words and 120 types of sentences. The results show that the STGC-Transformer with multi-task learning achieved word error rates of 12.14% and 2.07% for isolated words and sentences, respectively.
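The word error rate used as the metric in the abstract is conventionally computed as the word-level edit distance between the recognized sequence and the reference sequence, divided by the reference length. The abstract does not describe the exact evaluation procedure, so the following is only a minimal sketch under that assumption. The function name `word_error_rate` and the sample glosses are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: this is the conventional WER definition; the paper's own
    evaluation script may differ in details.
    """
    if not reference:
        raise ValueError("reference must contain at least one word")

    # prev[j]: edit distance between the reference prefix processed so far
    # and hypothesis[:j]. Initially the reference prefix is empty.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]  # distance from reference[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical gloss sequences, for illustration only.
reference = ["I", "go", "school", "tomorrow"]
hypothesis = ["I", "school", "today"]
print(word_error_rate(reference, hypothesis))  # one deletion + one substitution over 4 words
```

For an isolated word the reference has length one, so under this definition a correct prediction gives an error of 0 and a single wrong word gives 1.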
Published in
-
- Journal of the Japan Society for Precision Engineering (精密工学会誌)
-
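Journal of the Japan Society for Precision Engineering 87 (12), 1028-1035, 2021-12-05
The Japan Society for Precision Engineering (Public Interest Incorporated Association)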
Keywords
Details
-
- CRID
- 1390008832634514048
-
- NII Article ID
- 130008125001
-
- ISSN
- 1882675X
- 09120289
-
- Text language code
- ja
-
- Data source type
-
- JaLC
- Crossref
- CiNii Articles
-
- Abstract license flag
- Use not permitted