DOCUMENT CLUSTERING BY DYNAMIC HIERARCHICAL ALGORITHM BASED ON FUZZY SET TYPE-II FROM FREQUENT ITEMSET
Abstract: One way to
facilitate information retrieval is to cluster the collection of
existing documents. Existing text documents are often
unstructured: their forms vary and their groupings are ambiguous, which
makes information retrieval difficult. Moreover, new documents emerge
continuously and need to be clustered. Static document
clustering methods generally cluster documents only after the whole
collection has been gathered, so re-clustering the whole collection whenever a new
document arrives makes the clustering process inefficient. In this paper, we
propose a new method for document clustering using a dynamic hierarchical algorithm
based on type-II fuzzy sets from frequent itemsets. The method consists of
three main phases: determination of key terms, extraction of
candidate clusters, and construction of the cluster hierarchy. Based on the
experiment, the method achieved F-measure values of 0.40 for Newsgroup, 0.62 for
Classic, and 0.38 for Reuters. Meanwhile, the computation time when adding a
new document is lower than that of the previous static method. The results show that
this method is suitable for producing hierarchical clustering solutions in a
dynamic environment effectively and efficiently. The method also gives
accurate clustering results.
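
To make the candidate-cluster phase more concrete, the following is a minimal Python sketch of how frequent itemsets over key terms can define candidate clusters. It is an illustration under stated assumptions, not the authors' implementation: the function name candidate_clusters, the parameters min_support and max_size, and the toy documents are all hypothetical, and the type-II fuzzy membership and hierarchy construction steps are omitted.

```python
from itertools import combinations

def candidate_clusters(docs, min_support=0.5, max_size=2):
    """Map each frequent term set to the ids of the documents containing it.

    docs: dict of document id -> set of key terms (hypothetical input format).
    min_support: minimum fraction of documents that must contain the term set.
    max_size: largest term-set size to enumerate (brute force, for illustration).
    """
    n = len(docs)
    vocab = sorted(set().union(*docs.values()))
    clusters = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(vocab, size):
            # A document supports the itemset if it contains every term in it.
            members = [d for d, terms in docs.items() if set(itemset) <= terms]
            if len(members) / n >= min_support:
                clusters[itemset] = members
    return clusters

# Toy collection; the key terms are assumed to be already extracted.
docs = {
    "d1": {"fuzzy", "cluster", "document"},
    "d2": {"fuzzy", "cluster"},
    "d3": {"retrieval", "document"},
    "d4": {"fuzzy", "retrieval"},
}
result = candidate_clusters(docs, min_support=0.5)
for itemset, members in result.items():
    print(itemset, members)
```

In this sketch each frequent term set acts as a cluster label, and a document may belong to several candidate clusters at once, which is the kind of overlap a fuzzy membership step would then have to resolve.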
Authors: Saiful Bahri Musa, Andi Baso Kaswar, Supria Supria, Susiana Sari
Journal Code: jptkomputergg160012