Information Extraction from Japanese Case Report Corpus for Structuring Clinical Texts

SHIBATA Daisaku, KAWAZOE Yoshimasa, SHINOHARA Emiko, SHIMAMOTO Kiminori

doi:10.11517/pjsai.jsai2022.0_1j4os13a03

Bibliographic Information

Other Title

診療テキストの構造化に向けた症例報告コーパスからの情報抽出

Description

<p>[Background] Important information about patients' symptoms and findings is often recorded as free text in clinical documents. Making use of this text requires information extraction with natural language processing. [Objective] In this study, we evaluated the performance of machine learning methods on named entity recognition (NER) and relation extraction (RE). We used the Japanese case report corpus, which is manually annotated with 70 types of entities and 35 types of relations. [Method] The corpus contains 183 cases. After pre-processing, we used 182 cases comprising 2,172 sentences. The machine learning model was based on Bidirectional Encoder Representations from Transformers (BERT). [Result] The maximum micro-averaged F1 scores for NER and RE were 0.931 and 0.826, respectively. [Discussion] These results are comparable to those of previous studies and are therefore accurate enough to serve as baselines.</p>
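
The abstract reports micro-averaged F1 scores without defining the metric. As a rough illustration only, the sketch below shows one common way to compute a micro-averaged F1 over labeled entity spans: true positives, false positives, and false negatives are pooled across all entity types before precision and recall are taken. The span representation, the label names, and the exact-match criterion are assumptions made for this example; the record does not state how the authors matched predictions to gold annotations.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span representation: (sentence_id, start, end, label).
Span = Tuple[int, int, int, str]


def micro_f1(gold: Iterable[Span], pred: Iterable[Span]) -> float:
    """Micro-averaged F1 with exact span-and-label matching (an assumption)."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(pred)

    # Counts are pooled over all labels, which is what makes this "micro".
    tp = len(gold_set & pred_set)
    fp = len(pred_set - gold_set)
    fn = len(gold_set - pred_set)

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data with made-up labels, not taken from the corpus.
gold = [(0, 0, 2, "Disease"), (0, 5, 7, "Drug"), (1, 3, 4, "Symptom")]
pred = [(0, 0, 2, "Disease"), (0, 5, 7, "Symptom"), (1, 3, 4, "Symptom")]
score = micro_f1(gold, pred)
```

In this toy case two of the three predicted spans match the gold annotations exactly, so precision and recall are both 2/3. The same pooling applies to relation extraction if each relation is encoded as a hashable tuple.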

Journal

Proceedings of the Annual Conference of JSAI

Proceedings of the Annual Conference of JSAI JSAI2022 (0), 1J4OS13a03-1J4OS13a03, 2022

The Japanese Society for Artificial Intelligence

Details

CRID: 1390292706092133504

DOI: 10.11517/pjsai.jsai2022.0_1j4os13a03

Text Lang: ja

Data Source

JaLC

Abstract License Flag: Disallowed
