An Optimum Database for Isolated Word in Speech Recognition System
Abstract: An automatic speech recognition
(ASR) system is a technology that allows computers to receive input as
spoken words. This technology requires sample words, stored in a database,
for use in the pattern matching process. There is no established reference
or fundamental theory for developing a database for ASR, so research on
database development that optimizes system performance is required. Mel-scale frequency
cepstral coefficients (MFCCs) are used to extract the characteristics of the speech
signal, and a backpropagation neural network operating on quantized vectors is used to
evaluate the maximum log-likelihood values against the nearest pattern in the
database. The results show that the robustness of ASR is optimum with 140 reference
samples for each word, giving an average accuracy of 99.95% and
a processing time of 27.4 msec. The investigation also found that gender
does not significantly influence accuracy. From these results it is
concluded that the performance of ASR can be increased by optimizing the database.
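
As an illustration of the mel scale that underlies MFCC feature extraction, the following is a minimal sketch, not the implementation used in this study. It assumes the commonly used conversion mel = 2595 log10(1 + f/700); the abstract does not state which variant was applied. The function names, the number of filters, and the frequency range are hypothetical choices made for the example.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(num_filters: int, low_hz: float, high_hz: float) -> list[float]:
    """Return filter center frequencies (Hz) spaced evenly on the mel scale.

    The two band edges (low_hz, high_hz) are excluded, so exactly
    num_filters centers are returned.
    """
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * i) for i in range(1, num_filters + 1)]


if __name__ == "__main__":
    # Hypothetical setup: 20 filters covering 0 to 4000 Hz.
    for center in mel_center_frequencies(20, 0.0, 4000.0):
        print(f"{center:.1f} Hz")
```

Because the centers are equally spaced in mels, they lie close together at low frequencies and spread out at high frequencies, which is the property MFCCs rely on to approximate human pitch perception.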
Author: Syifaun Nafisah
Journal Code: jptkomputergg160179