An improved method using k-means to determine the optimal number of clusters, considering the relations between several variables
- Toyoda Hideki (Waseda University)
- Ikehara Kazuya (Waseda University)
Bibliographic Information
- Other Title
- 変数間の関係性を考慮してクラスター数を決定するk-means法の改良
- ヘンスウカン ノ カンケイセイ オ コウリョ シテ クラスタースウ オ ケッテイ スル k meansホウ ノ カイリョウ
Description
In this article, we propose a non-hierarchical clustering method that takes the relations between several variables into account and determines the optimal number of clusters. Replacing the Euclidean distance used in standard k-means with the Mahalanobis distance lets the method reflect the relations between variables and yields better groupings. Assuming that the data are a sample from a normal mixture distribution, Akaike's information criterion (AIC) and the Bayesian information criterion (BIC) can also be calculated and used to determine the number of clusters. Simulated and real data examples confirmed the usefulness of the proposed method. The method thus allows the optimal number of clusters to be determined while taking the relations between variables into account.
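The idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the algorithm's details, so the deterministic farthest-point seeding, the Euclidean first pass, the small `ridge` term added to each covariance, and all function names below are assumptions made for illustration. Each cluster keeps its own mean and covariance, points are reassigned by squared Mahalanobis distance, and AIC and BIC are computed from a normal mixture likelihood with full covariance matrices.

```python
import numpy as np

def sq_mahalanobis(X, mean, cov):
    """Squared Mahalanobis distance of each row of X to (mean, cov)."""
    diff = X - mean
    return np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)

def farthest_point_centers(X, k):
    """Deterministic seeding (an assumption of this sketch, not from the paper)."""
    centers = [X[np.argmax(((X - X.mean(axis=0)) ** 2).sum(axis=1))]]
    while len(centers) < k:
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    return np.array(centers, dtype=float)

def mahalanobis_kmeans(X, k, n_iter=100, ridge=1e-6):
    """k-means-style loop with a per-cluster Mahalanobis distance."""
    n, d = X.shape
    means = farthest_point_centers(X, k)
    covs = np.stack([np.eye(d)] * k)  # identity: the first pass is Euclidean
    labels = np.full(n, -1)
    for _ in range(n_iter):
        dist = np.stack(
            [sq_mahalanobis(X, means[j], covs[j]) for j in range(k)], axis=1
        )
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > d:  # keep old parameters for very small clusters
                means[j] = members.mean(axis=0)
                covs[j] = np.atleast_2d(np.cov(members, rowvar=False))
                covs[j] += ridge * np.eye(d)
    return labels, means, covs

def mixture_log_likelihood(X, labels, means, covs):
    """Log-likelihood of X under a normal mixture built from the hard clusters."""
    n, d = X.shape
    k = len(means)
    weights = np.bincount(labels, minlength=k) / n
    log_terms = np.full((n, k), -np.inf)
    for j in range(k):
        if weights[j] > 0:
            _, logdet = np.linalg.slogdet(covs[j])
            maha = sq_mahalanobis(X, means[j], covs[j])
            log_pdf = -0.5 * (d * np.log(2 * np.pi) + logdet + maha)
            log_terms[:, j] = np.log(weights[j]) + log_pdf
    top = log_terms.max(axis=1)  # log-sum-exp for numerical stability
    return float(np.sum(top + np.log(np.exp(log_terms - top[:, None]).sum(axis=1))))

def information_criteria(X, labels, means, covs):
    """Return (AIC, BIC) for a k-component, full-covariance normal mixture."""
    n, d = X.shape
    k = len(means)
    # Free parameters: means, covariance matrices, and mixing weights.
    n_params = k * d + k * d * (d + 1) // 2 + (k - 1)
    ll = mixture_log_likelihood(X, labels, means, covs)
    return -2 * ll + 2 * n_params, -2 * ll + n_params * np.log(n)

# Usage on synthetic data: two correlated, well-separated groups.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
X = np.vstack([
    rng.multivariate_normal([0.0, 0.0], cov, size=150),
    rng.multivariate_normal([20.0, 0.0], cov, size=150),
])
for k in range(1, 5):
    aic, bic = information_criteria(X, *mahalanobis_kmeans(X, k))
    print(f"k={k}  AIC={aic:.1f}  BIC={bic:.1f}")
```

The number of clusters would then be taken as the k with the smallest AIC or BIC. Because assignment here uses the Mahalanobis distance alone, small clusters with near-singular covariances can distort the criteria, so results for larger k in this sketch should be read with caution.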
Journal
- The Japanese journal of psychology 82 (1), 32-40, 2011
- The Japanese Psychological Association
Details
- CRID: 1390001205078546048
- NII Article ID: 130000992467
- NII Book ID: AN00123620
- COI: 1:STN:280:DC%2BC3MnjslyltA%3D%3D
- ISSN: 18841082, 00215236
- NDL BIB ID: 11050922
- PubMed: 21706821
- Text Lang: ja
- Article Type: journal article
- Data Source: JaLC, NDL Search, Crossref, PubMed, CiNii Articles, KAKEN, OpenAIRE
- Abstract License Flag: Disallowed