A study of language model evaluation metrics in the field of civil engineering

OGATA Riku, OKUBO Junichi, FUJII Junichiro, AMAKATA Masazumi

doi:10.11532/jsceiii.5.1_66

Bibliographic Information

Other Title

土木分野における言語モデル評価指標の検討

Abstract

Natural language processing (NLP) is expected to serve as one of the interfaces that strengthen the tight connection between physical and virtual spaces in a digital twin, and its practical use is being considered in the field of civil engineering. For this technology to be fully functional, language models must understand civil engineering terminology in context, and the models must be evaluated appropriately. However, most previous studies have focused on how to adapt the technology to the civil engineering field rather than on evaluating the capability of language models to generate sentences. This study therefore aims to establish a metric for evaluating the capability of language models in the field of civil engineering. As a first step, we created a new evaluation dataset and compared existing metrics to determine which are appropriate for the civil engineering field. Finally, we discuss the issues involved in designing an automatic evaluation metric.

Journal

Artificial Intelligence and Data Science

Artificial Intelligence and Data Science 5 (1), 66-76, 2024

Japan Society of Civil Engineers

Details

CRID: 1390300080901325696

DOI: 10.11532/jsceiii.5.1_66

ISSN: 2435-9262

Text Lang: ja

Data Source

JaLC

Abstract License Flag: Disallowed
