A Method of Extracting and Evaluating Popularity and Unpopularity for Natural Language Expressions

Description

Although users' opinions, expressed in their own words, are very useful for business text mining, it is difficult to extract users' impressions of popularity and unpopularity from texts written in natural language. The popularity and unpopularity impressions discussed here depend on users' claims, interests, and demands. This paper presents a method of determining these impressions in commodity review sentences. Multi-attribute rules are introduced to extract the impressions from sentences, and four-stage rules are defined to evaluate popularity and unpopularity impressions step by step. A deterministic multi-attribute pattern matching algorithm is used to determine the impressions efficiently. Simulation results on 2,240 review comments verify that the multi-attribute pattern matching algorithm is 44.5 times faster than the Aho-Corasick method. The precision and recall of the extracted impressions for each commodity are 94% and 93%, respectively. Moreover, the precision and recall of the resulting impressions for each rule are both 95%.
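
As a rough illustration of how precision and recall figures like those above are typically computed, the following minimal sketch compares a set of extracted impressions against a hand-labeled reference set. The function name `precision_recall`, the `(sentence_id, label)` representation, and the toy data are all assumptions made for illustration; they are not taken from the paper and do not reproduce its evaluation.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for two collections of (sentence_id, label) pairs.

    Hypothetical helper: the pair representation is an assumption,
    not the data format used in the paper.
    """
    extracted, gold = set(extracted), set(gold)
    # An extracted impression counts as correct only if the same
    # sentence carries the same label in the reference set.
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data for illustration only, unrelated to the 2,240 review comments.
gold = {(1, "popular"), (2, "unpopular"), (3, "popular"), (4, "unpopular")}
extracted = {(1, "popular"), (2, "unpopular"), (3, "unpopular")}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here precision is the share of extracted impressions that are correct, and recall is the share of reference impressions that were found. Computing both per commodity or per rule would amount to calling the helper on the corresponding subset of pairs.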
