Balancing up Efficiency and Accuracy in Translation Retrieval.
- Baldwin Timothy
- Tokyo Institute of Technology, Department of Computer Science
- Tanaka Hozumi
- Tokyo Institute of Technology, Department of Computer Science
Abstract
This research looks at the effects of segment order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both bag-of-words and segment order-sensitive string comparison methods, and test each over character-based and word-based indexing. The translation retrieval performance of each system configuration is evaluated empirically through the notion of segment edit distance between the translation output and model translation. Our results indicate that character-based indexing is consistently superior to word-based indexing in terms of raw accuracy, although segmentation does have an accelerating effect on TM search times in combination with a number of retrieval optimisation techniques. Segment order-sensitive approaches are demonstrated to generally outperform bag-of-words methods, with 3-operation edit distance proving the most effective comparison method. We additionally reproduced the same basic results over alphabetised data as for lexically differentiated data containing kanji characters.
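The comparison methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the unit operation costs, the Dice-style overlap used as a stand-in for a bag-of-words method, and all function names (`char_segments`, `word_segments`, `edit_distance`, `bag_overlap`, `retrieve`) are assumptions made for illustration, and word-based indexing is assumed to receive text that is already whitespace-segmented.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def char_segments(text: str) -> List[str]:
    """Character-based indexing: each non-space character is one segment."""
    return [ch for ch in text if not ch.isspace()]


def word_segments(text: str) -> List[str]:
    """Word-based indexing, assuming the text is already whitespace-segmented."""
    return text.split()


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """3-operation edit distance (insert, delete, substitute) over segments.

    Unit costs are an assumption; the abstract does not state the weights.
    """
    prev = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        curr = [i]
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def bag_overlap(a: Sequence[str], b: Sequence[str]) -> float:
    """Order-insensitive (bag-of-words) similarity: Dice coefficient on multisets."""
    if not a and not b:
        return 1.0
    common = sum((Counter(a) & Counter(b)).values())
    return 2 * common / (len(a) + len(b))


def retrieve(query: str,
             memory: List[Tuple[str, str]],
             segment: Callable[[str], List[str]] = char_segments) -> Tuple[str, str]:
    """Return the (source, translation) pair whose source is closest to the query."""
    q = segment(query)
    return min(memory, key=lambda pair: edit_distance(q, segment(pair[0])))
```

Swapping `segment` between `char_segments` and `word_segments` changes the indexing unit while leaving the comparison method fixed. The two measures also differ in their sensitivity to order: `bag_overlap` scores a reordered segment sequence as identical, whereas `edit_distance` penalises the reordering.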
Journal
- Journal of Natural Language Processing
Journal of Natural Language Processing 8 (2), 19-37, 2001
The Association for Natural Language Processing
Details
- CRID: 1390282679450639232
- NII Article ID: 130004292152, 10008830400
- NII Book ID: AN10472659
- ISSN: 21858314, 13407619
- NDL BIB ID: 5759148
- Text Lang: en
- Data Source: JaLC, NDL, Crossref, CiNii Articles
- Abstract License Flag: Disallowed