Volume 8, Issue 1 (2008) | MJEE 2008, 8(1): 13-27

Gharavian D, Ahadi S M. Emotional Speech Recognition and Emotion Identification in Farsi Language. MJEE. 8 (1): 13-27
URL: http://mjee.modares.ac.ir/article-17-11954-en.html
1- P. O. Box: 16756-1719, Tehran, Iran
2- Amirkabir University of Technology
Abstract:
Emotion in speech carries information beyond what the text alone conveys. However, it also causes problems for speech recognition. In a previous study, we described the substantial changes in speech parameters caused by emotion. To improve the recognition rate for emotional speech, the effects of emotion on speech parameters should therefore be evaluated first, and recognition accuracy should then be improved by applying suitable parameters. Our earlier research evaluated the changes in speech parameters, namely formant frequencies and pitch frequency, due to anger and grief in Farsi. In this work, building on those results, we try to improve emotional speech recognition accuracy using baseline models. We show that adding parameters such as formant and pitch frequencies to the speech feature vector can improve recognition accuracy. The amount of improvement depends on the parameter type, the number of mixture components, and the emotional condition. Proper identification of the emotional condition can also help improve speech recognition accuracy. To recognize the emotional condition of speech, formant and pitch frequencies were used successfully in two different approaches, namely a decision tree and a Gaussian mixture model (GMM).
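As a rough illustration of likelihood-based emotion identification from pitch and formant features, the sketch below fits one diagonal-covariance Gaussian per emotion (the simplest case of a GMM, with a single mixture component) and labels an utterance by the highest total log-likelihood. It is not the authors' implementation: the function names, the feature layout [pitch, F1, F2], and all numeric values are hypothetical and chosen only for demonstration.

```python
import math

def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # A small floor keeps every variance strictly positive.
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances

def log_likelihood(frame, model):
    """Log-density of one frame under a diagonal-covariance Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, means, variances)
    )

def identify_emotion(frames, models):
    """Return the label whose model gives the highest summed log-likelihood."""
    scores = {
        label: sum(log_likelihood(f, model) for f in frames)
        for label, model in models.items()
    }
    return max(scores, key=scores.get)

# Hypothetical training frames: [pitch_hz, F1_hz, F2_hz]. Not data from the paper.
training = {
    "neutral": [[120.0, 500.0, 1500.0], [125.0, 510.0, 1480.0], [118.0, 495.0, 1520.0]],
    "anger": [[210.0, 560.0, 1600.0], [220.0, 575.0, 1620.0], [205.0, 550.0, 1590.0]],
}
models = {label: fit_diag_gaussian(frames) for label, frames in training.items()}

# Hypothetical test utterance with raised pitch.
utterance = [[215.0, 565.0, 1610.0], [208.0, 555.0, 1600.0]]
predicted = identify_emotion(utterance, models)
```

A full GMM would use several weighted components per emotion, typically trained with expectation-maximization; the single-component case is used here only to keep the sketch short and dependency-free.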

Received: 2010/11/21 | Accepted: 2008/12/24 | Published: 2010/11/21
