On Hard Clustering for Data with Tolerance

  • HAMASUNA Yukihiro
    Department of Risk Engineering, Graduate School of Systems and Information Engineering, University of Tsukuba
  • ENDO Yasunori
    Department of Risk Engineering, Faculty of Systems and Information Engineering, University of Tsukuba
  • MIYAMOTO Sadaaki
    Department of Risk Engineering, Faculty of Systems and Information Engineering, University of Tsukuba
  • HASEGAWA Yasushi
    Department of Risk Engineering, Graduate School of Systems and Information Engineering, University of Tsukuba

Bibliographic Information

Other Title
  • 許容範囲付きデータに対するハードクラスタリング
  • キョヨウ ハンイツキ データ ニ タイスル ハード クラスタリング

Abstract

This paper proposes two clustering algorithms that handle data with tolerance. One is based on hard c-means (HCM) and the other on learning vector quantization (LVQC). Tolerance is a new concept for handling data with uncertainty, such as errors, ranges, or missing attributes, within an optimization framework, and it is incorporated into both algorithms. In earlier clustering algorithms for such data, dissimilarity is defined by the nearest-neighbor, furthest-neighbor, or Hausdorff distance. In the proposed algorithms, by contrast, dissimilarity is defined by the squared L2 (Euclidean) norm, so data with uncertainty can be handled within a strict optimization problem. First, the concept of tolerance, which covers errors, ranges, and missing attributes of data, is described. Next, optimization problems that take the tolerance into account are formulated, and a unique, explicit optimal solution is derived from the Karush-Kuhn-Tucker conditions. From these results, an alternating minimization algorithm and a learning algorithm are constructed. Finally, the effectiveness of the proposed algorithms is verified through numerical examples.
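
The abstract does not spell out the formulation, so the following is only a minimal sketch of how an HCM-style alternating minimization with tolerance might look. It assumes (this is not stated in the abstract) that each datum x_k carries a tolerance vector eps_k constrained by ||eps_k|| <= kappa_k, and that the dissimilarity is ||x_k + eps_k - v_i||^2. Under that assumption, the best eps_k for a fixed center is a step from x_k toward the center, clipped to length kappa_k. The names `tolerance_vector`, `hcm_with_tolerance`, `kappas`, and `n_iter` are hypothetical and chosen for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def tolerance_vector(x, v, kappa):
    """Minimize ||x + eps - v||^2 subject to ||eps|| <= kappa (assumed constraint).

    The minimizer moves x toward v, but by no more than kappa.
    """
    diff = [vi - xi for xi, vi in zip(x, v)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return [0.0] * len(x)
    scale = min(1.0, kappa / norm)
    return [scale * d for d in diff]


def hcm_with_tolerance(data, kappas, centers, n_iter=50):
    """Alternate between assignments, tolerance vectors, and cluster centers."""
    centers = [list(c) for c in centers]
    eps = [[0.0] * len(x) for x in data]
    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assignment step: each shifted datum goes to its nearest center.
        shifted = [[xi + ei for xi, ei in zip(x, e)] for x, e in zip(data, eps)]
        labels = [
            min(range(len(centers)), key=lambda i, s=s: sq_dist(s, centers[i]))
            for s in shifted
        ]
        # Tolerance step: clipped move toward the assigned center.
        eps = [
            tolerance_vector(x, centers[lab], kappa)
            for x, lab, kappa in zip(data, labels, kappas)
        ]
        # Center step: mean of the shifted data in each cluster.
        for i in range(len(centers)):
            members = [
                [xi + ei for xi, ei in zip(x, e)]
                for x, e, lab in zip(data, eps, labels)
                if lab == i
            ]
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers, eps
```

A call such as `hcm_with_tolerance(data, kappas, initial_centers)` returns the hard assignments, the cluster centers, and the tolerance vectors. Setting every entry of `kappas` to 0 forces all tolerance vectors to zero, which reduces the sketch to ordinary hard c-means.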
