Feature mapping using deep belief networks for robust speech recognition

Gholamipoor, Mojtaba; Nasersharif, Babak

Tarbiat Modares University Press

The Modares Journal of Electrical Engineering

Volume 14, Issue 3 (2014) MJEE 2014, 14(3): 24-30

Gholamipoor M, Nasersharif B. Feature mapping using deep belief networks for robust speech recognition. MJEE 2014; 14(3): 24-30
URL: http://mjee.modares.ac.ir/article-17-3327-en.html

Mojtaba Gholamipoor¹, Babak Nasersharif²

1- MSc in Artificial Intelligence, School of Computer Engineering, K.N. Toosi University of Technology
2- Assistant Professor, Department of Computer Engineering, K.N. Toosi University of Technology

Abstract:

The performance of automatic speech recognition (ASR) systems degrades in noisy conditions because of the mismatch between training and test environments, and many methods have been proposed to reduce this mismatch. In recent years, deep neural networks (DNNs) have been widely used in ASR systems, including for robust speech recognition and feature extraction. In this paper, we propose using a deep belief network (DBN) as a post-processing method for de-noising Mel-frequency cepstral coefficients (MFCCs). In addition, we use a deep belief network to extract tandem features (posterior probabilities of phone occurrence) from the de-noised MFCCs obtained in the previous stage, yielding more robust and discriminative features. The final robust feature vector consists of the de-noised MFCCs concatenated with these tandem features. Evaluation results on the Aurora2 database show that the proposed feature vector performs better than similar and conventional techniques, increasing recognition accuracy by 28% on average compared with MFCCs.
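
The data flow described in the abstract (noisy MFCCs, then de-noised MFCCs, then phone posteriors, then concatenation) can be illustrated with a minimal sketch. This is not the authors' implementation: the two networks below are plain feed-forward stacks with random weights standing in for DBN-pretrained and fine-tuned models, and the layer sizes, the 39-dimensional MFCC vector, the 40 phone classes, and all function names are assumptions made only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtract the row maximum for numerical stability.
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def init_layers(sizes, rng):
    """Random (W, b) pairs; placeholders for trained DBN weights (assumption)."""
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, layers, output):
    """Sigmoid hidden layers followed by a linear or softmax output layer."""
    h = x
    for W, b in layers[:-1]:
        h = sigmoid(h @ W + b)
    W, b = layers[-1]
    out = h @ W + b
    return softmax(out) if output == "softmax" else out

def robust_features(noisy_mfcc, denoiser, classifier):
    # Stage 1: map noisy MFCCs to de-noised MFCCs (regression output).
    denoised = forward(noisy_mfcc, denoiser, output="linear")
    # Stage 2: phone posteriors (tandem features) from the de-noised MFCCs.
    tandem = forward(denoised, classifier, output="softmax")
    # Final vector: de-noised MFCCs concatenated with the tandem features.
    return np.concatenate([denoised, tandem], axis=1)

rng = np.random.default_rng(0)
N_MFCC, N_PHONES, N_HIDDEN = 39, 40, 128  # hypothetical sizes
denoiser = init_layers([N_MFCC, N_HIDDEN, N_HIDDEN, N_MFCC], rng)
classifier = init_layers([N_MFCC, N_HIDDEN, N_HIDDEN, N_PHONES], rng)

noisy = rng.normal(size=(100, N_MFCC))  # 100 synthetic frames, not real speech
features = robust_features(noisy, denoiser, classifier)
print(features.shape)  # (100, 79)
```

Each output row holds the 39 de-noised coefficients followed by 40 posterior values that sum to one. In a real system both networks would be trained, the first on pairs of noisy and clean features and the second on phone labels, before the concatenated vectors are passed to the recognizer.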

Keywords: MFCC, Tandem feature, DBN, Robustness, Speech recognition

Received: 2016/04/20 | Accepted: 2014/11/22 | Published: 2016/07/26

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.