IMPROVED DESIGN OF DTW AND GMM CASCADED ARABIC SPEAKER
Abstract: In this paper, we describe the design, implementation, and
assessment of a two-stage Arabic speaker recognition system that aims to
identify a target Arabic speaker among several candidates. The first stage uses
an improved DTW (Dynamic Time Warping) algorithm, and the second stage uses an
SA-KM-based GMM (Gaussian Mixture Model). MFCCs (Mel Frequency Cepstral
Coefficients) and their differential forms are extracted from the speech
samples as acoustic features. The DTW stage returns the three most likely
speakers, and these candidates are then passed to the GMM training process. A
dedicated similarity measure, the KL distance, is applied to find the best
match with the target speaker. Experimental results show that the
text-independent recognition rate of the cascaded system reaches 90 percent.
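
The cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The DTW routine is the classic textbook recurrence, not the improved variant the paper refers to. The second-stage measure is a closed-form KL divergence between two single diagonal Gaussians, used here as a simplified stand-in for comparing full GMMs. The function names (dtw_distance, shortlist, kl_diag_gauss) and the template dictionary layout are assumptions made for illustration only.

```python
import math

def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    Assumption: textbook recurrence with Euclidean local cost,
    not the paper's improved DTW.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

def shortlist(test_features, templates, k=3):
    """Stage 1: return the k speaker names with the smallest DTW cost.

    `templates` is a hypothetical dict mapping speaker name to a
    reference feature sequence.
    """
    ranked = sorted(templates,
                    key=lambda name: dtw_distance(test_features, templates[name]))
    return ranked[:k]

def kl_diag_gauss(mu0, var0, mu1, var1):
    """KL(N0 || N1) for diagonal Gaussians.

    Assumption: a single Gaussian per speaker stands in for a full GMM,
    for which KL has no closed form.
    """
    return 0.5 * sum(
        math.log(v1 / v0) + (v0 + (m0 - m1) ** 2) / v1 - 1.0
        for m0, v0, m1, v1 in zip(mu0, var0, mu1, var1)
    )
```

In a cascade of this shape, only the names returned by shortlist would be scored in the second stage, and the candidate whose model has the smallest divergence from the target speaker's model would be selected.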
Author: Shuoshuo Chen, Junbo Zhao, Ruiqi Yang
Journal Code: jptkomputergg130011