Performance Evaluation on Collaborative Filtering using Latent Topics under Several Sparsity Levels

Bibliographic Information

Other Title
  • 潜在トピックを利用した協調フィルタリングにおけるスパース度合いに応じた性能評価


<p>A recommendation system is one way to support users' information acquisition, and collaborative filtering (CF) is a popular method for building one. In CF, users' preferences (how much users like items) are represented as a rating matrix. However, the percentage of missing ratings (called sparsity) is usually high, which results in poor performance. To address this problem, hybrid methods have been proposed that mitigate the performance degradation even at high sparsity by combining the ratings with other information about items and users. Collaborative topic regression (CTR) is a pioneering model of this kind: it extracts latent topics from documents about items or users and uses them together with the rating matrix. However, because CTR is a more complex model than CF, the expected performance is not achieved unless the words used from the documents are selected appropriately and the hyperparameters are set appropriately. In this study, we experimentally show how the performance of CTR changes under different sparsity levels, both by varying the words selected based on document frequency and by varying the combination of hyperparameters that adjust the influence of the ratings and that of the latent topics. We run the experiments on three datasets from different domains to evaluate the generality of the results.</p>
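
The two quantities varied in the abstract, sparsity and document-frequency-based word selection, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that missing ratings are encoded as NaN, that higher sparsity is simulated by hiding observed ratings uniformly at random, and that document frequency is the share of documents containing a word. The function names and the `min_df`/`max_df` thresholds are hypothetical.

```python
import numpy as np


def sparsity(ratings: np.ndarray) -> float:
    """Fraction of missing entries in the rating matrix (NaN = missing, an assumption)."""
    return float(np.isnan(ratings).mean())


def raise_sparsity(ratings: np.ndarray, target: float, rng: np.random.Generator) -> np.ndarray:
    """Hide observed ratings at random until roughly `target` sparsity is reached.

    Random removal is an illustrative assumption, not a procedure taken from the paper.
    """
    out = ratings.astype(float).copy()
    observed = np.flatnonzero(~np.isnan(out))        # flat indices of known ratings
    n_keep = int(round(out.size * (1.0 - target)))   # ratings that should remain
    n_drop = max(len(observed) - n_keep, 0)
    drop = rng.choice(observed, size=n_drop, replace=False)
    out.flat[drop] = np.nan
    return out


def select_by_document_frequency(docs, min_df: float, max_df: float):
    """Keep words whose document frequency lies in [min_df, max_df].

    `docs` is a list of token lists; document frequency is the share of
    documents that contain the word at least once.
    """
    n_docs = len(docs)
    counts = {}
    for doc in docs:
        for word in set(doc):                        # count each word once per document
            counts[word] = counts.get(word, 0) + 1
    return sorted(w for w, c in counts.items() if min_df <= c / n_docs <= max_df)
```

In an experiment of the kind described, one might call `raise_sparsity` with several target levels and rebuild the vocabulary with different thresholds before fitting the model at each setting.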


