Improving Multi-Document Summary Method Based on Sentence Distribution
Abstract: Automatic multi-document summarization has been developed by
researchers. The method used to select sentences from the source documents
determines the quality of the resulting summary. One of the most popular
methods for weighting sentences is to calculate the frequency of occurrence
of the words that form each sentence.
However, choosing sentences with that method can yield a sentence that does
not represent the content of the source documents optimally, because the
sentence weight is measured only by the number of word occurrences. This
study proposes a new strategy of weighting sentences
based on sentence distribution to choose the most important sentences, which
pays close attention to the elements of each sentence, treated as a
distribution of words. Sentence distribution enables the extraction of
important sentences in multi-document summarization and serves as a strategy
to improve the quality of the summary. There are three
concepts used in this study: (1) clustering sentences with similarity-based
histogram clustering, (2) ordering clusters by cluster importance, and (3)
selecting important sentences by sentence distribution. Results of
experiments showed that the proposed method performed better than the
SIDeKiCK and LIGI methods. On ROUGE-1, the proposed method improved by 3%
over SIDeKiCK and by 5.1% over LIGI. On ROUGE-2, it improved by 13.7% over
SIDeKiCK and by 14.4% over LIGI.
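The ROUGE-1 and ROUGE-2 figures reported in the abstract are n-gram overlap
scores between a generated summary and a reference summary. The following is a
minimal sketch, assuming the standard recall-oriented definition of ROUGE-N,
plain whitespace tokenization, and a single reference summary; the study's
actual evaluation setup is not described here. The function names `ngrams` and
`rouge_n_recall` are illustrative and do not come from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Recall-oriented ROUGE-N against one reference (illustrative sketch).

    Assumption: lowercased whitespace tokenization, no stemming and no
    stop-word removal.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clipped overlap: an n-gram is credited at most as many times as it
    # occurs in the candidate summary.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
rouge_1 = rouge_n_recall(candidate, reference, n=1)
rouge_2 = rouge_n_recall(candidate, reference, n=2)
```

In this toy example the candidate recovers 5 of the 6 reference unigrams and 3
of the 5 reference bigrams, so `rouge_1` is 5/6 and `rouge_2` is 3/5. ROUGE-2 is
the stricter of the two because it rewards matching word order as well as
matching words.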
Author: Aminul Wahib
Journal: jptkomputergg160177