
Towards an improved modeling of the glottal source in statistical parametric speech synthesis

Organization: Edinburgh Research Archive, United Kingdom

Details

Title : Towards an improved modeling of the glottal source in statistical parametric speech synthesis
Researchers : Cabral, Joao P., Renals, Steve, Richmond, Korin, Yamagishi, Junichi
Keywords : speech technology
Organization : Edinburgh Research Archive, United Kingdom
Collaborators : -
Year of publication : 2550 B.E. (2007 CE)
Reference : J. Cabral, S. Renals, K. Richmond, and J. Yamagishi. Towards an improved modeling of the glottal source in statistical parametric speech synthesis. In Proc. of the 6th ISCA Workshop on Speech Synthesis, Bonn, Germany, 2007. http://hdl.handle.net/1842/2003
Source : -
Expertise : -
Relation : -
Coverage : -
Abstract/Description :

This paper proposes the use of the Liljencrants-Fant model (LF-model) to represent the glottal source signal in HMM-based speech synthesis systems. These systems generally use a pulse train to model the periodicity of the excitation signal of voiced speech. However, this model produces a strong and uniform harmonic structure throughout the spectrum of the excitation, which makes the synthetic speech sound buzzy. Mixed-band excitation and phase manipulation reduce this effect, but they can degrade the speech quality if the noise component is not weighted carefully. In contrast, the LF-waveform has a spectrum that decays at higher frequencies, which is more similar to the real glottal source excitation signal. We conducted a perceptual experiment to test the hypothesis that the LF-model can perform as well as or better than the pulse train in an HMM-based speech synthesizer. In the synthesis, we used the mean values of the LF-parameters, calculated from measurements of the recorded speech. The result of this study is important not only for improving the speech quality of this type of system, but also because the LF-model can be used to model many characteristics of the glottal source, such as voice quality, which are important for voice transformation and generation of expressive speech.
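
The spectral contrast described in the abstract can be illustrated with a minimal sketch. It is not the authors' method and it does not implement the LF-model: as a stand-in, it uses a simple smooth glottal-style pulse (cosine opening and closing phases) and takes its first difference to approximate a flow derivative. The sampling rate (16 kHz), fundamental frequency (100 Hz), phase fractions, and all function names are assumptions chosen only for illustration. The sketch compares the average level of the high harmonics with that of the low harmonics for an impulse train and for the smooth pulse.

```python
import cmath
import math


def harmonic_magnitudes(period):
    """DFT magnitudes of one period; entry k-1 is the k-th harmonic of F0."""
    n_samples = len(period)
    mags = []
    for k in range(1, n_samples // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n, x in enumerate(period))
        mags.append(abs(acc))
    return mags


def impulse_period(n_samples):
    """One period of a pulse train: a single unit impulse."""
    return [1.0] + [0.0] * (n_samples - 1)


def smooth_pulse_derivative(n_samples, open_frac=0.4, close_frac=0.16):
    """First difference of a smooth pulse (illustrative stand-in, NOT the LF-model)."""
    n_open = int(open_frac * n_samples)
    n_close = int(close_frac * n_samples)
    flow = []
    for n in range(n_samples):
        if n < n_open:
            # Opening phase: raised-cosine rise from 0 to 1.
            flow.append(0.5 * (1.0 - math.cos(math.pi * n / n_open)))
        elif n < n_open + n_close:
            # Closing phase: quarter-cosine fall from 1 to 0.
            flow.append(math.cos(math.pi * (n - n_open) / (2.0 * n_close)))
        else:
            # Closed phase: no flow.
            flow.append(0.0)
    # Circular first difference, since the signal is periodic.
    return [flow[n] - flow[n - 1] for n in range(n_samples)]


def high_to_low_ratio(mags, n_low=10, first_high=40):
    """Mean magnitude of harmonics >= first_high over mean of the first n_low."""
    low = sum(mags[:n_low]) / n_low
    high_band = mags[first_high - 1:]
    high = sum(high_band) / len(high_band)
    return high / low


# Assumed values: 16 kHz sampling and F0 = 100 Hz give 160 samples per period.
N = 160
impulse_ratio = high_to_low_ratio(harmonic_magnitudes(impulse_period(N)))
smooth_ratio = high_to_low_ratio(harmonic_magnitudes(smooth_pulse_derivative(N)))

print(f"impulse train, high/low harmonic ratio: {impulse_ratio:.3f}")
print(f"smooth pulse,  high/low harmonic ratio: {smooth_ratio:.3f}")
```

Under these assumptions, the impulse train keeps every harmonic at the same level, while the smooth pulse carries much less energy in its upper harmonics. This is the qualitative difference the abstract attributes to the pulse train and the LF-waveform; the actual LF-model has its own parametric form, which this sketch does not reproduce.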

Bibliography :
Cabral, Joao P., Renals, Steve, Richmond, Korin, Yamagishi, Junichi. (2550). Towards an improved modeling of the glottal source in statistical parametric speech synthesis.
    Bangkok : Edinburgh Research Archive, United Kingdom.
Cabral, Joao P., Renals, Steve, Richmond, Korin, Yamagishi, Junichi. 2550. "Towards an improved modeling of the glottal source in statistical parametric speech synthesis".
    Bangkok : Edinburgh Research Archive, United Kingdom.
Cabral, Joao P., Renals, Steve, Richmond, Korin, Yamagishi, Junichi. "Towards an improved modeling of the glottal source in statistical parametric speech synthesis."
    Bangkok : Edinburgh Research Archive, United Kingdom, 2550. Print.
Cabral, Joao P., Renals, Steve, Richmond, Korin, Yamagishi, Junichi. Towards an improved modeling of the glottal source in statistical parametric speech synthesis. Bangkok : Edinburgh Research Archive, United Kingdom; 2550.