A practical framework for formalizing and extracting Chinese collocations
Description
In this paper we argue for a word-sense-based formalization of collocation and propose a seed-based approach to collocation extraction for specific purposes. The approach uses the RFR_SUM model to iteratively classify the senses of polysemous words in the corpus. Collocation strength is also obtained by RFR. To capture the syntactic relations inside collocations, we also present a frame-based collocation extraction method, which uses word-related frames to automatically obtain collocations with structural information from a large-scale corpus, with an average accuracy of 89.69%.
Published in
2011 7th International Conference on Natural Language Processing and Knowledge Engineering 390-396, 2011-11-01
IEEE