A Dataset Augmentation Method Based on Predictive Model Uncertainty for Improving Robustness

Tomoyuki Myojin (明神 智之), Hideto Ogawa (小川 秀人), Hironobu Kuruma (來間 啓伸), Naoto Sato (佐藤 直人)

doi:10.20729/00224274

Software systems that embed predictive models based on machine learning may behave unexpectedly after deployment when they encounter data whose distribution differs from that of the training data. Including data from diverse distributions in the training set in advance is an effective way to improve the model's robustness and keep its behavior within the expected range as far as possible. However, naively augmenting the training data can degrade prediction performance on the distribution the model was originally intended to handle. In this paper, we propose a dataset augmentation method that uses the predictive model's uncertainty as an indicator and augments the training data only for data on which the model is insufficiently trained. The aim is to improve robustness to diverse distributions while suppressing the performance degradation caused by unnecessary data. We evaluated the feasibility of the proposed method on an image classification problem using the MNIST dataset and confirmed both improved robustness and suppressed performance degradation.
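The selection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how uncertainty is measured or how candidates are chosen, so everything here is an assumption made for illustration: predictive entropy stands in for the uncertainty indicator, the threshold value is arbitrary, and the function names `predictive_entropy` and `select_uncertain` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def predictive_entropy(probs):
    """Shannon entropy (in nats) of each row of class probabilities.

    Assumption: entropy is used here as a stand-in uncertainty measure;
    the paper's actual indicator is not given in the abstract.
    """
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


def select_uncertain(candidates, probs, threshold):
    """Keep only the candidates whose predictive entropy exceeds the threshold."""
    mask = predictive_entropy(probs) > threshold
    return candidates[mask], mask


# Toy example: four candidate augmented samples (represented by their indices)
# and the class probabilities a hypothetical 3-class model assigns to them.
candidates = np.arange(4)
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain prediction
    [0.34, 0.33, 0.33],  # near-uniform, highly uncertain
    [0.90, 0.05, 0.05],  # fairly confident prediction
])

# Only the uncertain candidates would be added to the training set.
selected, mask = select_uncertain(candidates, probs, threshold=0.5)
print(selected)
```

In this sketch, candidates the model already predicts confidently are left out of the augmented set, which mirrors the stated goal of avoiding unnecessary data while covering inputs the model handles poorly.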
