Information Extraction from Japanese Case Report Corpus for Structuring Clinical Texts

Bibliographic Information

Other Title
  • 診療テキストの構造化に向けた症例報告コーパスからの情報抽出

Abstract

<p>[Background] Important information about patients' symptoms and findings is often recorded as free text in clinical documents. To make use of these texts, information extraction using natural language processing is required. [Objective] In this study, we evaluated the performance of machine learning methods on named entity recognition (NER) and relation extraction (RE). We used the Japanese case report corpus, which is manually annotated with 70 types of entities and 35 types of relations. [Method] The corpus contains 183 cases. After preprocessing, we used 182 cases consisting of 2,172 sentences. We used a machine learning model based on Bidirectional Encoder Representations from Transformers (BERT). [Result] The maximum micro-averaged F1 scores for NER and RE were 0.931 and 0.826, respectively. [Discussion] These results are comparable to those of previous studies and could therefore serve as solid baselines.</p>
