Measures for evaluating risk prediction models: a review

  • Shinozaki Tomohiro
    Department of Information and Computer Technology, Faculty of Engineering, Tokyo University of Science
  • Yokota Isao
    Department of Biostatistics, Graduate School of Medicine, Hokkaido University
  • Oba Koji
    Department of Biostatistics, School of Public Health, the University of Tokyo; Interfaculty Initiative in Information Studies, the University of Tokyo
  • Kozuma Kayoko
    Department of Biostatistics, School of Public Health, the University of Tokyo
  • Sakamaki Kentaro
    Center for Data Science, Yokohama City University

Bibliographic Information

Other Title
  • イベント予測モデルの評価指標 (Evaluation measures for event prediction models)
  • イベント ヨソク モデル ノ ヒョウカ シヒョウ

Description

<p>Prediction models are usually developed through model construction and validation. For binary or time-to-event outcomes in particular, risk prediction models should be evaluated on several aspects of predictive accuracy. Using unified algebraic notation, we present evaluation measures for model validation from five statistical viewpoints frequently reported in the medical literature: 1) the Brier score for prediction error; 2) sensitivity, specificity, and the C-index for discrimination; 3) calibration-in-the-large, calibration slope, and the Hosmer-Lemeshow statistic for calibration; 4) the net reclassification improvement and integrated discrimination improvement indexes for reclassification; and 5) net benefit for clinical usefulness. Graphical representations such as a receiver operating characteristic curve, a calibration plot, or a decision curve help researchers interpret these measures. We discuss the interrelationships among them and extend their definitions and estimators to time-to-event data subject to outcome censoring. We illustrate their calculation using example datasets, with SAS code provided in the web appendix.</p>
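
<p>To make a few of the measures named above concrete, the following is a minimal Python sketch for a binary outcome without censoring. It is not the article's SAS code: the function names and the toy data are hypothetical, and the formulas are the commonly used textbook definitions (mean squared error for the Brier score, pairwise concordance for the C-index, and true positives minus odds-weighted false positives per subject for net benefit), which may differ in detail from the estimators in the article.</p>

```python
# Illustrative sketch only: common definitions of four validation measures
# for a binary outcome y (1 = event, 0 = no event) and predicted risks p.
# Censoring, as needed for time-to-event outcomes, is NOT handled here.

def brier_score(y, p):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def sensitivity_specificity(y, p, threshold):
    """Classify as high risk when p >= threshold, then compare with outcomes."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= threshold)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi < threshold)
    tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < threshold)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def c_index(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


def net_benefit(y, p, threshold):
    """True positives per subject minus false positives per subject,
    the latter weighted by the odds of the risk threshold."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= threshold)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= threshold)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: six subjects, three of whom had the event.
y = [1, 0, 1, 0, 0, 1]
p = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3]

print("Brier score:", brier_score(y, p))
print("Sensitivity, specificity at 0.5:", sensitivity_specificity(y, p, 0.5))
print("C-index:", c_index(y, p))
print("Net benefit at 0.5:", net_benefit(y, p, 0.5))
```

<p>Evaluating the net benefit function over a grid of thresholds gives the points of a decision curve, and evaluating sensitivity and specificity over a grid of thresholds gives the points of a receiver operating characteristic curve.</p>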
