A Method for Generating Signature Graphs Containing Regular Expressions for Malware Detection

久保田 稜, 碓井 利宣, 川古谷 裕平, 大月 勇人, 岩村 誠, 羽田 大樹

Endpoint Detection and Response (EDR) products use Indicators of Compromise (IOCs), which describe the traces that malware leaves behind, as detection rules. Methods have been proposed that generate IOCs containing regular expressions so that a single rule covers many malware samples, but we believe their accuracy is limited by the constraints of the IOC format. Methods that generate more expressive graph-based rules have also been proposed, but none has yet incorporated regular expressions. In this paper, we propose a new detection rule format that combines the expressiveness of graphs with the robustness of regular expressions, together with a method for generating such rules automatically. By converting similar node labels into the same regular expression before extracting frequent behavior patterns with graph mining, the method generates highly accurate rules that capture the common part as a regular expression even when the traces differ slightly among samples of the same family. In experiments on approximately 4,000 malware samples collected in 2022-2023, the method achieved a family classification accuracy (macro-F1) of 86.6%, exceeding an existing IOC generation method by about 19 percentage points and confirming its effectiveness.
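The central idea of the abstract, replacing similar node labels with one shared regular expression before mining frequent patterns, can be illustrated with a small Python sketch. It is not the authors' implementation. The `generalize` rule (every digit run becomes `\d+`), the encoding of a behavior graph as (source, action, destination) triples, and the sample data are all assumptions made for illustration, and simple frequent-edge counting stands in for real frequent subgraph mining.

```python
import re
from collections import Counter

def generalize(label):
    """Hypothetical rule: keep text literal, turn each digit run into \\d+."""
    parts = re.split(r"(\d+)", label)  # odd indices hold the digit runs
    return "".join(r"\d+" if i % 2 else re.escape(p) for i, p in enumerate(parts))

def frequent_edges(samples, min_support):
    """Return generalized edges that occur in at least min_support samples.

    Stand-in for frequent subgraph mining: only single edges are counted.
    """
    counts = Counter()
    for edges in samples:
        # Use a set so each sample contributes at most once per edge.
        counts.update({(generalize(s), a, generalize(d)) for s, a, d in edges})
    return {edge for edge, n in counts.items() if n >= min_support}

def matches(signature, edges):
    """True if every signature edge is matched by some edge of the sample."""
    return all(
        any(re.fullmatch(s, es) and a == ea and re.fullmatch(d, ed)
            for es, ea, ed in edges)
        for s, a, d in signature
    )

# Made-up behavior graphs of two samples assumed to share a family.
sample_a = [("dropper1.exe", "write", r"C:\Temp\payload_4821.dll"),
            ("dropper1.exe", "connect", "10.0.0.5:443")]
sample_b = [("dropper2.exe", "write", r"C:\Temp\payload_77.dll"),
            ("dropper2.exe", "connect", "10.0.0.9:8080")]

signature = frequent_edges([sample_a, sample_b], min_support=2)

unseen = [("dropper9.exe", "write", r"C:\Temp\payload_1.dll"),
          ("dropper9.exe", "connect", "192.168.1.1:80")]
benign = [("notepad.exe", "write", r"C:\Users\a.txt")]

print(matches(signature, unseen))
print(matches(signature, benign))
```

Without the generalization step, no edge would be shared by both samples because every label differs in its digits. With it, both edges survive the support threshold and the resulting signature also matches the unseen variant while rejecting the benign graph. The sketch also shows the trade-off: a rule as coarse as "every digit run" makes the connect edge match any IPv4 address and port, so a practical method would need a more careful criterion for which labels count as similar.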
