-
- HIGASHI Kazuyuki
- Graduate School of Information Science and Technology, Osaka University
-
- TAKAHASHI Hitoshi
- Graduate School of Information Science and Technology, Osaka University
-
- NAKAGAWA Hiroyuki
- Graduate School of Information Science and Technology, Osaka University
-
- TSUCHIYA Tatsuhiro
- Graduate School of Information Science and Technology, Osaka University
Bibliographic Information
- Other Title
-
- 単語の出現頻度と類似性に基づいたトピックモデル洗練化手法 (English gloss: a topic model refinement method based on word frequency and similarity)
- タンゴ ノ シュツゲン ヒンド ト ルイジセイ ニ モトズイタ トピックモデル センレンカ シュホウ
Description
<p>Software developers increasingly rely on natural language documents. Such documents may contain useful information; however, extracting it is difficult when the number of documents is large. Latent Dirichlet Allocation (LDA) is a promising approach to topic modeling, and LDA-based topic models can help developers comprehend such documents. In LDA, a stop word list is used to filter out general words so that topics are classified accurately. However, an existing stop word list cannot easily filter out words that are not general but appear frequently in the target documents. In this paper, we propose a method that consists of two steps: extracting stop words from the target documents and merging similar topics. We experimentally evaluate the method by applying it to a mailing list. The experimental results demonstrate that our method constructs a topic model more accurately than the existing method.</p>
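The two steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' criteria, so everything here is an assumption: stop words are taken to be words that occur in a large share of the target documents, topic similarity is measured with cosine similarity over word weights, and the function names and thresholds (`extract_stop_words`, `merge_similar_topics`, `min_doc_ratio`, `min_similarity`) are hypothetical. It is not the method evaluated in the paper.

```python
from collections import Counter
from math import sqrt


def extract_stop_words(docs, min_doc_ratio=0.8):
    """Return words occurring in at least min_doc_ratio of the documents.

    Assumption: document frequency stands in for the paper's criterion.
    """
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each word once per document
    threshold = min_doc_ratio * len(docs)
    return {word for word, df in doc_freq.items() if df >= threshold}


def cosine(p, q):
    """Cosine similarity between two sparse word-weight dicts."""
    dot = sum(w * q.get(word, 0.0) for word, w in p.items())
    norm_p = sqrt(sum(w * w for w in p.values()))
    norm_q = sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def merge_similar_topics(topics, min_similarity=0.9):
    """Greedily merge each topic into the first kept topic it resembles.

    Assumption: merged weights are summed and left unnormalized.
    """
    merged = []
    for topic in topics:
        for kept in merged:
            if cosine(kept, topic) >= min_similarity:
                for word, w in topic.items():
                    kept[word] = kept.get(word, 0.0) + w
                break
        else:
            merged.append(dict(topic))  # no similar topic found: keep as new
    return merged


if __name__ == "__main__":
    # Toy tokenized "mailing list" messages (made-up data).
    docs = [
        ["patch", "build", "error"],
        ["patch", "release", "notes"],
        ["patch", "build", "fix"],
    ]
    stop_words = extract_stop_words(docs)
    filtered = [[t for t in doc if t not in stop_words] for doc in docs]
    print(stop_words, filtered)

    # Toy topic-word distributions, as an LDA run might produce.
    topics = [
        {"build": 0.6, "error": 0.4},
        {"build": 0.5, "error": 0.5},
        {"release": 1.0},
    ]
    print(merge_similar_topics(topics))
```

In a real pipeline the filtered documents would be passed to an LDA implementation and the resulting topic-word distributions to the merging step; both thresholds would need tuning on the target corpus.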
Journal
-
- Computer Software
-
Computer Software 36 (4), 4_25-4_31, 2019-10-25
Japan Society for Software Science and Technology
Details
-
- CRID
- 1390283659833300992
-
- NII Article ID
- 130007772583
-
- NII Book ID
- AN10075819
-
- NDL BIB ID
- 030076870
-
- ISSN
- 0289-6540
-
- Text Lang
- ja
-
- Article Type
- journal article
-
- Data Source
-
- JaLC
- NDL Search
- CiNii Articles
- KAKEN
-
- Abstract License Flag
- Disallowed