
Robust speech features and acoustic models for speech recognition.

Organization: Nanyang Technological University, Singapore

Details

Title : Robust speech features and acoustic models for speech recognition.
Researcher : Xiao, Xiong.
Keywords : DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition.
Organization : Nanyang Technological University, Singapore
Collaborators : -
Year of publication : 2009 (B.E. 2552)
Citation : Xiao, X. (2009). Robust speech features and acoustic models for speech recognition. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/20733
Source : -
Expertise : -
Relation : -
Coverage : -
Abstract/Description :

This thesis examines techniques to improve the robustness of automatic speech recognition (ASR) systems against noise distortions. The study is important because the performance of ASR systems degrades dramatically in adverse environments, which greatly limits the deployment of speech recognition applications in realistic environments. To this end, we examine a feature compensation approach and a discriminative model training approach to improve the robustness of speech recognition systems. The degradation of recognition performance is mainly due to the statistical mismatch between acoustic models trained on clean speech and noisy test speech features. To reduce this feature-model mismatch, we propose to normalize the temporal structure of both training and test speech features. The temporal structure of speech features is represented by the power spectral density (PSD) functions of the feature trajectories. We normalize the temporal structure by applying equalizing filters to the feature trajectories; the proposed filter is called the temporal structure normalization (TSN) filter. Compared to other temporal filters used in speech recognition, the advantage of the TSN filter is its adaptability to changing environments. The TSN filter can also be viewed as a feature normalization technique that normalizes the PSD function of features, whereas other normalization methods, such as histogram equalization (HEQ), normalize the probability density function (p.d.f.) of features. Experiments show that the TSN filter outperforms other state-of-the-art temporal filters on both the small-vocabulary Aurora-2 task and the large-vocabulary Aurora-4 task.
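
The idea of equalizing the PSD of a feature trajectory can be illustrated with a short sketch. This is not the thesis implementation: the abstract does not give the filter design, so the averaged-periodogram PSD estimate, the inverse-FFT filter design with a Hann window, and all function names and parameter values (`welch_psd`, `design_equalizing_filter`, `apply_filter`, `seg_len`, `hop`, `n_taps`) are assumptions chosen for illustration. The sketch maps one observed trajectory toward a reference PSD, whereas the abstract describes normalizing both training and test features.

```python
import numpy as np

def welch_psd(x, seg_len=32, hop=16):
    """Averaged-periodogram PSD estimate of a 1-D feature trajectory."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the trajectory mean
    window = np.hanning(seg_len)
    starts = range(0, len(x) - seg_len + 1, hop)
    segs = np.stack([x[s:s + seg_len] * window for s in starts])
    power = np.abs(np.fft.rfft(segs, axis=1)) ** 2
    return power.mean(axis=0) / np.sum(window ** 2)

def design_equalizing_filter(psd_ref, psd_obs, n_taps=21, eps=1e-10):
    """Symmetric FIR filter whose magnitude response approximates
    sqrt(psd_ref / psd_obs). Assumes n_taps is odd and not larger than
    the segment length used for the PSD estimates."""
    mag = np.sqrt(psd_ref / (psd_obs + eps))
    h = np.fft.irfft(mag)                 # real, even (zero-phase) response
    h = np.roll(h, n_taps // 2)[:n_taps]  # centre on the middle tap, truncate
    return h * np.hanning(n_taps)         # taper to reduce truncation ripple

def apply_filter(x, h):
    """Filter a trajectory along time, keeping its original length."""
    return np.convolve(x, h, mode="same")
```

In use, `psd_ref` would be estimated from reference (for example, clean training) trajectories of one feature dimension, `psd_obs` from the trajectory to be normalized, and the resulting filter applied to that trajectory. Because the filter is re-designed from the observed PSD, it changes with the environment, which is the adaptability the abstract attributes to the TSN filter.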
