
Solving the Syllable Segmentation Problem in Connected Thai-Digit Speech Recognition using Multiple-Instance Learning

Organization: Thai Thesis Database

Details

Title : Solving the Syllable Segmentation Problem in Connected Thai-Digit Speech Recognition using Multiple-Instance Learning
Researcher : ผกาเกษ วัตถุยา
Keywords : -
Organization : Thai Thesis Database
Collaborators : -
Year of publication : 2546 BE (2003 CE)
Reference : http://www.thaithesis.org/detail.php?id=1162546000012
Source : -
Expertise : -
Relation : -
Coverage : -
Abstract/Description :

This thesis studies multiple-instance learning, a new machine learning framework for learning from ambiguous examples. We first proposed a novel method, the multiple-instance neural network with strong boundary criteria (MINN-SBC), for learning from real-valued multiple-instance data. The proposed method was evaluated on the MUSK data sets, a standard benchmark, and compared with existing multiple-instance learning algorithms in the literature. Its accuracies are 91.3% and 90.2% on MUSK1 and MUSK2, respectively, which are better than those of the other neural network approaches tested in our experiments. These results show that MINN-SBC succeeded in learning from ambiguous examples. Next, we focused on automatic connected Thai-digit speech recognition. The major cause of errors in automatic speech recognition is inaccurate automatic speech segmentation. Most automatic speech segmentation methods rely on parameter thresholds to split the speech data into syllabic segments. Because speech data is highly variable, a single appropriate set of segmentation thresholds is hard to find. In this thesis, we tried three different sets of thresholds in the segmentation process. For each set of thresholds, we then built a conventional speech recognition framework using a feed-forward neural network. We also explored several classifier combination schemes. The recognition rates of these frameworks serve as the baseline results. However, these frameworks do not address the syllable segmentation problem itself. Multiple-instance learning can address it by treating the selection of syllabic segments as a learning problem, which avoids having to decide on segmentation thresholds.
By examining all alternative syllabic segments of all utterances produced by the multiple sets of thresholds, the multiple-instance learner can select the syllabic segment that best supports inferring the acoustic features that represent the observed classifications. We therefore proposed applying MINN-SBC to this application. A further benefit of the proposed method is that manually segmented data is not needed to train the acoustic models. Finally, we focused on adapting multiple-instance learning to multi-class problems, because the traditional multiple-instance learning framework handles binary classification, whereas speech recognition is a multi-class classification problem. This thesis is the first endeavor to extend the multiple-instance learning framework to multi-class problems. We first redefined the multiple-instance learning problem as a multi-class problem and then proposed a new method, extended from MINN-SBC, called the multi-class multiple-instance neural network with strong boundary criteria (MMINN-SBC). We then applied MMINN-SBC to the syllable segmentation problem in connected Thai-digit speech recognition. The experimental results show that this problem is better treated as a multiple-instance learning problem than as a classical supervised learning problem. Compared with the baseline results, both MINN-SBC and MMINN-SBC significantly increase recognition rates, but the training time of MMINN-SBC is much less than that of MINN-SBC.
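
The idea of pooling candidate segments from several threshold sets into one "bag" can be illustrated with a minimal sketch. It is not the thesis's MINN-SBC method: the energy-based segmenter, the function names, the toy energy values, and the instance scores are all assumptions made for illustration, and the max rule shown is the standard multiple-instance assumption (a bag is positive if at least one instance is), which may differ from the strong boundary criteria used in the thesis.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def segment_by_energy(energy: Sequence[float], threshold: float,
                      min_len: int = 2) -> List[Segment]:
    """Return runs of consecutive frames whose energy exceeds the threshold.

    Hypothetical stand-in for a threshold-based syllable segmenter.
    """
    segments: List[Segment] = []
    start = None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i                      # a run begins
        elif e <= threshold and start is not None:
            if i - start >= min_len:       # keep only runs that are long enough
                segments.append((start, i))
            start = None
    if start is not None and len(energy) - start >= min_len:
        segments.append((start, len(energy)))
    return segments


def build_bag(energy: Sequence[float],
              thresholds: Sequence[float]) -> List[Segment]:
    """Pool the distinct candidate segments from every threshold into one bag."""
    bag: List[Segment] = []
    for t in thresholds:
        for seg in segment_by_energy(energy, t):
            if seg not in bag:
                bag.append(seg)
    return bag


def bag_score(instance_scores: Sequence[float]) -> float:
    """Standard MIL assumption: a bag is as positive as its best instance."""
    return max(instance_scores)


def best_instance(bag: Sequence[Segment],
                  instance_scores: Sequence[float]) -> Segment:
    """Return the candidate segment with the highest instance score."""
    idx = max(range(len(bag)), key=lambda i: instance_scores[i])
    return bag[idx]


# Toy per-frame energies and three threshold settings (illustrative values).
energy = [0.1, 0.6, 0.9, 0.8, 0.3, 0.7, 0.9, 0.2]
bag = build_bag(energy, thresholds=[0.2, 0.5, 0.75])

# Made-up scores that a trained instance-level classifier might assign.
scores = [0.2, 0.9, 0.4, 0.6]
print(bag, bag_score(scores), best_instance(bag, scores))
```

In this sketch no single threshold has to be chosen in advance: every threshold contributes candidates, and the learner's scores decide which segment represents the bag.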

Bibliography :
ผกาเกษ วัตถุยา. (2546). Solving the Syllable Segmentation Problem in Connected Thai-Digit Speech Recognition using Multiple-Instance Learning.
    Bangkok : Thai Thesis Database.