Stemming Influence on Similarity Detection of Abstract Written in Indonesia
Abstract: This paper discusses the effect of stemming with the Nazief and
Adriani algorithm on the similarity detection results for abstracts written in
Indonesian. Detecting similarity between the contents of publication abstracts
can serve as an early indication of whether plagiarism has occurred in a piece
of writing. Text processing usually includes a pre-processing stage; one such
step is stemming, which reduces each word to its root form in order to improve
the search process. The stemmed output is converted into a set of word n-grams,
and Fingerprint Matching is then applied to measure the similarity between
texts. Based on the F1-score, which is used to balance precision and recall,
detection that applies both stemming and stopword removal gives the best
result in detecting similarity between texts, with an average of 42%. This is
higher than detection that uses only stemming (31%) and detection performed
without any text pre-processing (34%), in each case when bigrams are applied.
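The pipeline described in the abstract (pre-processing, word n-grams, fingerprint comparison, F1-score) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify how Fingerprint Matching scores two texts, so hashing each word n-gram with MD5 and comparing the resulting sets with the Jaccard coefficient are assumptions made here for illustration. The Nazief and Adriani stemmer is not reproduced; the hypothetical `stem` parameter is a placeholder that defaults to the identity function, and the stopword list is supplied by the caller.

```python
import hashlib


def word_ngrams(tokens, n=2):
    """Return the word n-grams of a token list as tuples (n=2 gives bigrams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def fingerprint(text, n=2, stopwords=frozenset(), stem=lambda w: w):
    """Build a set of hashed word n-grams from a text.

    `stem` is a placeholder for a real stemmer (assumption: identity by default),
    and `stopwords` is any collection of lowercase words to drop.
    """
    # Lowercase, drop stopwords, then stem each remaining word.
    tokens = [stem(w) for w in text.lower().split() if w not in stopwords]
    # Hash each n-gram so texts are compared by compact fingerprints.
    return {
        hashlib.md5(" ".join(gram).encode("utf-8")).hexdigest()
        for gram in word_ngrams(tokens, n)
    }


def similarity(fp_a, fp_b):
    """Jaccard coefficient of two fingerprint sets (assumed matching score)."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    a = fingerprint("deteksi kemiripan teks abstrak")
    b = fingerprint("deteksi kemiripan dokumen abstrak")
    print(similarity(a, b))
    print(f1_score(0.5, 0.5))
```

In this sketch the two example sentences share one bigram out of five distinct bigrams. Swapping in a real Indonesian stemmer and stopword list only requires passing them as `stem` and `stopwords`.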
Author: Tari Mardiana
Journal Code: jptkomputergg160170