Design, Implementation, and Operation of Annotation Support System for Morphological Information of BCCWJ
-
- Ogiso Toshinobu
- National Institute for Japanese Language and Linguistics
-
- Nakamura Takenori
- ManpowerGroup Co., Ltd.
Bibliographic Information
- Other Title
-
- 『現代日本語書き言葉均衡コーパス』形態論情報アノテーション支援システムの設計・実装・運用
- 『 ゲンダイ ニホンゴ カキコトバ キンコウ コーパス 』 ケイタイロン ジョウホウ アノテーション シエン システム ノ セッケイ ・ ジッソウ ・ ウンヨウ
Abstract
The “Balanced Corpus of Contemporary Written Japanese” (BCCWJ) is a large-scale Japanese corpus of 100 million words. It contains 170,000 XML files annotated with morphological information at two levels: short-unit words and long-unit words. We constructed an annotation system to compile this corpus. The system allows many users to modify corpus annotations and the dictionary entries linked to them while keeping the two consistent. It consists of a relational database server called the “Morphological Information Database,” a client tool called “Dynagon” that maintains the morphological information of the corpus, and a tool called “UniDic Explorer” that manages dictionary entries for morphological analysis. This paper describes the design, implementation, and operation of the “Morphological Information Database” for BCCWJ.
Journal
-
- Journal of Natural Language Processing
-
Journal of Natural Language Processing 21 (2), 301-332, 2014
The Association for Natural Language Processing
Details
-
- CRID
- 1390001204476479488
-
- NII Article ID
- 130004566474
-
- NII Book ID
- AN10472659
-
- ISSN
- 2185-8314
- 1340-7619
-
- NDL BIB ID
- 025501142
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- NDL
- Crossref
- CiNii Articles
- NINJAL
-
- Abstract License Flag
- Disallowed