Sex prediction by machine learning methods for small sample sizes: a case study of cranial measurements of killer whales (<i>Orcinus orca</i>)

  • Takahashi Megumi
    Tokyo University of Marine Science and Technology; The Institute of Cetacean Research
  • Kato Hidehiro
    Tokyo University of Marine Science and Technology; The Institute of Cetacean Research
  • Kitakado Toshihide
    Tokyo University of Marine Science and Technology

Bibliographic Information

Other Title
  • Sex discrimination analysis based on killer whale cranial morphometric data using machine learning methods (original Japanese title: 機械学習法を用いたシャチ頭骨形態計測データに基づく性判別分析)

Abstract

<p>Because opportunities to obtain morphological information from whales are generally limited, analyses are sometimes conducted with insufficient sample sizes. This study evaluated, both statistically and from a practical viewpoint, whether the accuracy of sex prediction for adult killer whales (<i>Orcinus orca</i>) could be improved, using sex discrimination rules built with traditional linear discriminant analysis (LDA) and with the Random Forests (RF) machine learning algorithm. Eighteen cranial measurements from six female and six male adult killer whales were collected on the west coast of Japan and analyzed. Simulation results using an independent test dataset and leave-one-out cross-validation suggested that LDA classifiers with variable selection generated overfitted rules. In contrast, RF classifiers suppressed overfitting when classification rules were developed from a small number of samples, and thereby produced highly accurate sex prediction rules. Therefore, RF is more effective than LDA in addressing the small sample size (SSS) problem in cranial measurements, as it reduced the risk of overestimation and improved accuracy.</p>
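
The comparison described in the abstract can be sketched in Python with scikit-learn. This is only an illustration under stated assumptions, not the authors' procedure. The data are synthetic stand-ins with the same shape as the study (12 skulls, 18 variables). The effect size, the choice of three selected variables, and the forest size are all hypothetical. The variable selection step here is a simple univariate filter (`SelectKBest`), which may differ from the selection method used in the study.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data (NOT the study's measurements): 12 skulls x 18 variables.
rng = np.random.default_rng(0)
n_per_sex, n_vars = 6, 18
y = np.array([0] * n_per_sex + [1] * n_per_sex)  # 0 = female, 1 = male (arbitrary coding)
X = rng.normal(size=(2 * n_per_sex, n_vars))
X[y == 1, :3] += 1.5  # assumption: only the first three variables differ by sex

loo = LeaveOneOut()  # each skull is held out once and predicted from the other 11

# LDA with variable selection refitted inside every fold, so the held-out
# skull never influences which variables are chosen.
lda = make_pipeline(SelectKBest(f_classif, k=3), LinearDiscriminantAnalysis())

# Random forest on all 18 variables; n_estimators is an illustrative choice.
rf = RandomForestClassifier(n_estimators=500, random_state=0)

lda_acc = cross_val_score(lda, X, y, cv=loo).mean()
rf_acc = cross_val_score(rf, X, y, cv=loo).mean()
print(f"LOOCV accuracy  LDA: {lda_acc:.2f}  RF: {rf_acc:.2f}")
```

Placing the selection step inside the pipeline matters with so few samples: choosing variables on all 12 skulls before cross-validation would let the held-out skull leak into the rule and inflate the apparent accuracy. The numbers printed for synthetic data say nothing about the study's results.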
