Using data-driven feature enrichment of text representation and ensemble technique for sentence-level polarity classification
- Pu Zhang
  College of Computer Science, Chongqing University, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, China
- Zhongshi He
  College of Computer Science, Chongqing University, China
Description
As an important issue in sentiment analysis, sentence-level polarity classification plays a critical role in many opinion-mining applications, such as opinion question answering, opinion retrieval and opinion summarization. Training a supervised classifier on sentences often runs into the data sparseness problem because the texts are so short. To address this problem, we exploit two feature sets learned from external data sets as additional features to enrich the data representation: a latent topic feature set obtained using a topic model, and a related word feature set derived using word embeddings. Furthermore, we propose an ensemble approach in which these additional features guide the design of the different ensemble members. Experimental results on a public movie review dataset demonstrate that the enriched representations improve polarity classification performance, and that the proposed ensemble approach improves overall performance further.
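The two ideas in the abstract, enriching a short sentence with related words and combining several classifiers by voting, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the `RELATED_WORDS` table is a hand-written stand-in for neighbours that would be retrieved from word embeddings trained on external data, the function names are hypothetical, the voting rule is plain majority voting, and the latent topic features are omitted entirely.

```python
from collections import Counter

# Hypothetical related-word table. In the article these words are derived
# from word embeddings; the entries below are made up for illustration.
RELATED_WORDS = {
    "great": ["excellent", "wonderful"],
    "dull": ["boring", "tedious"],
}


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return sentence.lower().split()


def enrich(tokens, related=RELATED_WORDS):
    """Append related words to the token list of a short sentence.

    Words without an entry in the table contribute nothing extra.
    """
    extra = [word for token in tokens for word in related.get(token, [])]
    return tokens + extra


def majority_vote(predictions):
    """Return the label predicted by the most ensemble members."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    print(enrich(tokenize("A great film")))
    # Labels as they might come from three separately trained members,
    # e.g. one per feature view (assumed, not taken from the article).
    print(majority_vote(["pos", "neg", "pos"]))
```

In a setup of this kind, each ensemble member would be trained on a different view of the data (for example the original words alone, or the words plus one of the additional feature sets), and only the final labels are combined.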
Journal
- Journal of Information Science
Journal of Information Science 41 (4), 531-549, 2015-05-19
SAGE Publications
Details
- CRID: 1360855568453269632
- ISSN: 1741-6485, 0165-5515
- Data Source: Crossref