A Sentiment Knowledge Discovery Model in Twitter’s TV Content Using Stochastic Gradient Descent Algorithm
Abstract: The explosive growth of social media use makes it a rich source
for data mining. Meanwhile, television programs have become more numerous and
more varied, which motivates people to comment on them via social media.
Social networks contain abundant information that is unstructured,
heterogeneous, high dimensional and incremental in nature. Such abundant data
can be a rich source of information, but it is difficult to analyze manually.
The contributions of this research are to perform preprocessing that addresses
unstructured, noisy and heterogeneous data, and to find patterns of
information and knowledge about social media user activities, in the form of
positive and negative sentiment, in Twitter TV content. Several techniques are
used to perform preprocessing.
They are: removing punctuation and symbols, removing numbers, replacing
numbers with letters, translating Alay words, removing stop words and stemming
with the Porter algorithm. The classification method used in this study is
Stochastic Gradient Descent (SGD). Text that has been through preprocessing is
more structured, less noisy and less diverse. As a result, preprocessing
affects both the number of correctly classified instances and the processing
time.
The experimental results reveal that using SGD to discover positive and
negative sentiment tends to be faster for large data or stream data. The
proportion of correctly classified instances reaches a maximum of 88%.
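
As an illustration of the pipeline summarized above, the following is a minimal
Python sketch, not the authors' implementation. It applies a few of the listed
preprocessing steps and then trains a bag-of-words logistic regression with
per-example SGD updates. The lookup tables (DIGIT_TO_LETTER, SLANG,
STOP_WORDS), the toy tweets, the logistic loss and all hyperparameters are
assumptions made for this example; the reading of "replace numbers into
letters" as mapping digits inside words is also an assumption, and Porter
stemming is omitted for brevity.

```python
import math
import random
import re

# Hypothetical lookup tables for illustration only; the dictionaries used in
# the study are not given in the abstract.
DIGIT_TO_LETTER = {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s"}
SLANG = {"bgs": "bagus", "jlk": "jelek"}        # assumed Alay-style spellings
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}


def preprocess(text):
    """Lowercase, strip punctuation/symbols, drop standalone numbers,
    map digits inside words to letters, normalize slang, drop stop words."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    tokens = []
    for tok in text.split():
        if tok.isdigit():                       # remove pure numbers
            continue
        tok = "".join(DIGIT_TO_LETTER.get(ch, ch) for ch in tok)
        tok = SLANG.get(tok, tok)               # translate slang spellings
        if tok in STOP_WORDS:
            continue
        tokens.append(tok)
    return tokens


def train_sgd(samples, epochs=50, lr=0.5, seed=0):
    """Logistic regression over token counts, updated one example at a time."""
    rng = random.Random(seed)
    data = [(preprocess(text), label) for text, label in samples]
    weights, bias = {}, 0.0
    for _ in range(epochs):
        rng.shuffle(data)                       # stochastic visiting order
        for tokens, label in data:
            z = bias + sum(weights.get(t, 0.0) for t in tokens)
            prob = 1.0 / (1.0 + math.exp(-z))
            err = prob - label                  # gradient of log loss w.r.t. z
            bias -= lr * err
            for t in tokens:
                weights[t] = weights.get(t, 0.0) - lr * err
    return weights, bias


def predict(text, weights, bias):
    """Return 1 for positive sentiment, 0 for negative."""
    z = bias + sum(weights.get(t, 0.0) for t in preprocess(text))
    return 1 if z >= 0 else 0


# Toy labelled tweets (invented for this sketch): 1 = positive, 0 = negative.
TRAIN = [
    ("Acaranya b4gus banget!!!", 1),
    ("sinetron ini jlk sekali", 0),
    ("suka acara ini, bagus", 1),
    ("acara jelek dan membosankan", 0),
]

if __name__ == "__main__":
    w, b = train_sgd(TRAIN)
    print(preprocess("Acaranya b4gus banget!!!"))
    print(predict("b4gus", w, b), predict("jlk", w, b))
```

Because each update touches only the tokens of one tweet, the same loop can
consume examples one at a time as they arrive, which is the property that makes
SGD attractive for large or streaming data.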
Keywords: Stochastic Gradient Descent; opinion mining; sentiment analysis;
Stemming Porter; stream data mining
Author: Lira Ruhwinaningsih,
Taufik Djatna
Journal Code: jptkomputergg160203