On Document Similarity Measures
- Asahara Masayuki (National Institute for Japanese Language and Linguistics)
- Kato Sachi (National Institute for Japanese Language and Linguistics)
Bibliographic Information
- Other Title: 文書間類似度について (ブンショ カン ルイジド ニ ツイテ)
Description
<p>Document similarity measures are used to evaluate both content and writing style. In the fields of text summarization and machine translation, evaluation measures have been proposed for comparing a system-generated summary or translation of a source text with a human-generated one. Distance metrics defined over morphemes or morpheme sequences are used to evaluate or register differences in writing style. In this study, we discuss the relations among the equivalence properties of mathematical metrics, similarities, kernels, ordinal scales, and correlations. In addition, we investigate the behavior of techniques for measuring content and style similarity on several corpora with similar content. The results obtained with different document similarity measures indicate the instability of the evaluation system.</p>
Journal
- Journal of Natural Language Processing
Journal of Natural Language Processing 23 (5), 463-499, 2016
The Association for Natural Language Processing
Details
- CRID: 1390001204474497664
- NII Article ID: 130005439793, 40021040299
- NII Book ID: AN10472659
- ISSN: 21858314, 13407619
- NDL BIB ID: 027807849
- Text Lang: ja
- Article Type: journal article
- Data Source: JaLC, NDL Search, Crossref, CiNii Articles, NINJAL, KAKEN, OpenAIRE
- Abstract License Flag: Disallowed