DIVERSITY-BASED ATTRIBUTE WEIGHTING FOR K-MODES CLUSTERING
Abstract: Categorical data is a kind of data used in computation in computer
science. Extracting information from categorical input data requires a
clustering algorithm, and researchers have proposed many of them. One
clustering algorithm for categorical data is k-modes. K-modes uses a simple
matching approach based on similarity values: two objects that match on an
attribute have similarity value 1 for that attribute, and 0 otherwise. In
practice, however, each attribute takes several distinct values, and each
value occurs a different number of times. A similarity value of 0 or 1 is
therefore not enough to represent the real semantic distance between a data
object and a cluster. In this paper, we generalize the k-modes algorithm for
categorical data by adding a weight and a diversity value for each attribute
value to optimize categorical data clustering.
Author: Muhammad Misbachul Huda, Dian Rahma Hayun, Annisaa Sri Indarwanti
Journal Code: jptkomputergg140011
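
The simple matching approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it does not reproduce their weight or diversity formulas, which the abstract does not give. The function names, the toy records, and the mode are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def simple_matching_similarity(obj, mode):
    """Sum of per-attribute similarities: 1 if the values match, 0 otherwise."""
    if len(obj) != len(mode):
        raise ValueError("obj and mode must have the same number of attributes")
    return sum(1 if a == b else 0 for a, b in zip(obj, mode))


def value_frequencies(records):
    """Count how often each distinct value occurs, one Counter per attribute."""
    return [Counter(column) for column in zip(*records)]


# Hypothetical toy data: (colour, size, shape)
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "square"),
    ("red", "small", "square"),
]
mode = ("red", "small", "round")  # assumed cluster mode

similarities = [simple_matching_similarity(r, mode) for r in records]
frequencies = value_frequencies(records)

print(similarities)    # [3, 2, 1, 2]
print(frequencies[0])  # counts of each colour value
```

In this toy data "red" occurs three times and "blue" once, yet a match on either value contributes exactly 1 to the similarity. This appears to be the kind of information that a 0/1 similarity cannot express and that a per-value weight could take into account.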