On Spherical k-Means Clustering

Bibliographic Information

Other Title
  • 球面k-mean++法について
  • 球面k-means++法について
  • キュウメン k-means++ホウ ニ ツイテ

Abstract

The k-means clustering (KM) algorithm, also called the hard c-means clustering (HCM) algorithm, is a very powerful clustering algorithm, but it suffers from a strong dependence on initial values. To reduce this dependence, Arthur and Vassilvitskii proposed the k-means++ clustering (KM++) algorithm in 2007. Meanwhile, there are many cases in which every object lies on a unit sphere, e.g., text clustering. Hornik, Kober, and Buchta proposed the spherical k-means clustering (SKM) algorithm in 2012 to classify such objects. However, that algorithm has the same initial value dependence as KM. This report therefore discusses the following points: (1) the dissimilarity of SKM is extended to satisfy the triangle inequality, and (2) a spherical k-means++ clustering (SKM++) algorithm that works well for this problem is proposed. The report shows that the effectiveness of SKM++ is theoretically guaranteed.
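
To make the idea in the abstract concrete, the following is a minimal Python sketch of k-means++-style seeding followed by spherical k-means iterations. It is an illustration under stated assumptions, not the algorithm of the report: the abstract does not specify the extended dissimilarity, so the sketch uses the ordinary cosine dissimilarity 1 - <x, c>, which for unit vectors equals half the squared Euclidean distance, so sampling in proportion to it matches the usual k-means++ weighting. The function names (normalize, cosine_dissimilarity, seed_centers, spherical_kmeans) and all parameters are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def cosine_dissimilarity(x, c):
    """Return 1 - <x, c> for unit vectors, clipped at 0 against rounding."""
    return max(0.0, 1.0 - sum(a * b for a, b in zip(x, c)))


def seed_centers(points, k, rng):
    """k-means++-style seeding on the unit sphere (assumed weighting)."""
    centers = [rng.choice(points)]
    # Dissimilarity from each point to its nearest center chosen so far.
    nearest = [cosine_dissimilarity(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(nearest) == 0.0:
            # Every point coincides with a center: fall back to uniform choice.
            new_center = rng.choice(points)
        else:
            # Far-away points are more likely to become the next center.
            new_center = rng.choices(points, weights=nearest, k=1)[0]
        centers.append(new_center)
        nearest = [min(d, cosine_dissimilarity(p, new_center))
                   for p, d in zip(points, nearest)]
    return centers


def spherical_kmeans(points, k, iterations=20, seed=0):
    """Cluster points by direction; returns (labels, unit-length centers)."""
    rng = random.Random(seed)
    points = [normalize(p) for p in points]
    centers = seed_centers(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its least dissimilar center.
        labels = [
            min(range(k), key=lambda j: cosine_dissimilarity(p, centers[j]))
            for p in points
        ]
        # Update step: each center becomes the normalized sum of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                continue  # keep the old center for an empty cluster
            total = [sum(col) for col in zip(*members)]
            if any(abs(t) > 1e-12 for t in total):
                centers[j] = normalize(total)
    return labels, centers
```

Called as `labels, centers = spherical_kmeans(points, 2)`, the sketch groups vectors by direction rather than by magnitude, which is the setting the abstract describes for data such as normalized text vectors.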
