Predicting the Level of Emotion by Means of Indonesian Speech Signal
Abstract: Understanding human emotion is important for developing better,
smoother interpersonal relations. It becomes all the more important because the
human thinking process and behavior are strongly influenced by emotion. In line
with these needs, an expert system that is capable of predicting the emotional
state would be useful for many practical applications. Such systems, based on
speech signals, have been widely developed for various languages. This study
intends to evaluate the extent to which Mel-Frequency Cepstral Coefficient
(MFCC) features, together with the Teager energy feature, derived from
Indonesian speech signals relate to four emotional types: happy, sad, angry,
and fear. The study utilizes empirical data of nearly
300 speech signals collected from four amateur actors and actresses speaking 15
prescribed Indonesian sentences. With a support vector machine classifier, the
empirical findings suggest that the Teager energy and the first MFCC
coefficient are crucial features, and that the prediction can achieve an
accuracy of 86%. The accuracy increases quickly with the first few MFCC
features; the fourth and subsequent features have negligible effects on it.
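As an illustration of the Teager energy feature named above, the following is a minimal sketch of the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The abstract does not state how the study computes or aggregates this feature, so the operator form, the mean aggregation, and the function names teager_energy and mean_teager_energy are assumptions made for illustration only. The test signal is synthetic, not study data.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Assumed form of the feature; the study's exact definition is not
    given in the abstract. Returns len(x) - 2 values (interior samples).
    """
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        raise ValueError("need at least three samples")
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def mean_teager_energy(x):
    """Hypothetical per-utterance feature: the average Teager energy."""
    return float(np.mean(teager_energy(x)))


# Synthetic tone (illustrative only): for A*cos(omega*n) the operator
# equals A^2 * sin(omega)^2 at every interior sample.
amplitude, omega = 0.5, 0.2
n = np.arange(1000)
tone = amplitude * np.cos(omega * n)
feature = mean_teager_energy(tone)
expected = (amplitude * np.sin(omega)) ** 2
```

For a pure tone the operator tracks both amplitude and frequency, which is one reason it is often used as an energy measure for speech. A scalar such as this could sit alongside MFCC values in a feature vector passed to a classifier.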
Keywords: Indonesian speech, mel-frequency cepstral coefficient, Teager
energy, support vector machine
Authors: Fergyanto E. Gunawan, Kanyadian Idananta
Journal Code: jptkomputergg170121