Distantly Supervised Learning for Extracting Adversary Behaviors

山嵜, 麿与

Threat reports are shared to support effective incident response, but the vast knowledge they contain is written in natural language and is difficult to exploit manually. Supervised learning methods have therefore been studied for extracting adversary behaviors from natural-language text. However, because manual annotation is costly, the available labeled data is insufficient, and a highly accurate extractor has not yet been realized. This study proposes a distantly supervised learning method that focuses on feature words related to adversary behaviors, such as attack method names and observables, and can create large-scale pseudo-labeled data without manual labeling. In evaluation experiments on multi-label classification of adversary behaviors at the sentence level, we show that using the pseudo-labeled data together with a noise modeling network makes it possible to extract adversary behaviors with an F1 score of 0.82, exceeding the accuracy of existing methods. In addition, by visualizing the co-occurrence relationships among adversary behaviors extracted from a large collection of threat reports, we show that the proposed method yields insights that can support operations such as incident response.
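
The pseudo-labeling idea described in the abstract can be illustrated with a minimal sketch. It assumes a simple lowercase substring match against a small lexicon; the lexicon entries, label names, and function names below are hypothetical and are not taken from the paper, whose actual feature words and matching rules are not given here. The noise modeling network is not shown.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Hypothetical lexicon: behavior label -> feature words (attack method names,
# observables). Illustrative assumptions only, not the paper's lexicon.
FEATURE_WORDS: Dict[str, List[str]] = {
    "credential_dumping": ["mimikatz", "lsass"],
    "spearphishing": ["phishing email", "malicious attachment"],
    "persistence": ["registry run key", "scheduled task"],
}


def pseudo_label(sentence: str) -> Set[str]:
    """Return every behavior label whose feature words occur in the sentence."""
    text = sentence.lower()
    return {
        label
        for label, words in FEATURE_WORDS.items()
        if any(word in text for word in words)
    }


def build_pseudo_dataset(sentences: List[str]) -> List[Tuple[str, List[str]]]:
    """Attach (possibly empty, possibly noisy) multi-label sets to sentences."""
    return [(s, sorted(pseudo_label(s))) for s in sentences]


def cooccurrence(dataset: List[Tuple[str, List[str]]]) -> Counter:
    """Count how often two behavior labels are assigned to the same sentence."""
    counts: Counter = Counter()
    for _, labels in dataset:
        counts.update(combinations(labels, 2))
    return counts


if __name__ == "__main__":
    reports = [
        "The actor ran Mimikatz to read LSASS memory and created a scheduled task.",
        "Initial access was gained through a phishing email.",
    ]
    data = build_pseudo_dataset(reports)
    print(data)
    print(cooccurrence(data))
```

Labels produced this way are noisy, since a feature word can appear in a sentence that does not describe the behavior. This is presumably why the abstract pairs the pseudo-labeled data with a noise modeling network.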
