Hierarchical Reinforcement Learning for Spoken Dialogue Systems

Organization: Edinburgh Research Archive, United Kingdom

Details

Title	:	Hierarchical Reinforcement Learning for Spoken Dialogue Systems
Researcher	:	Cuayáhuitl, Heriberto
Keywords	:	Spoken dialogue systems, Semi-automatic dialogue strategy design, Hierarchical control, Prior expert knowledge, Semi-Markov decision processes, Hierarchical reinforcement learning
Organization	:	Edinburgh Research Archive, United Kingdom
Collaborator	:	Renals, Steve
Year of publication	:	2552 BE (2009 CE)
Reference	:	http://hdl.handle.net/1842/2750
Source	:	-
Expertise	:	-
Relation	:	Cuayáhuitl, H., Renals, S., Lemon, O., and Shimodaira, H. (2006). Reinforcement learning of dialogue strategies using hierarchical abstract machines. In IEEE Workshop on Spoken Language Technology (SLT), pp. 182–185, Palm Beach, Aruba.
Cuayáhuitl, H., Renals, S., Lemon, O., and Shimodaira, H. (2007). Hierarchical dialogue optimization using Semi-Markov decision processes. In INTERSPEECH, pp. 2693–2696, Antwerp, Belgium.
Levin, E., Pieraccini, R., and Eckert, W. (2000). A stochastic model of human machine interaction for learning dialog strategies. IEEE Transactions on Speech and Audio Processing, 8(1):11–23.
Dietterich, T. (2000). An overview of MAXQ hierarchical reinforcement learning. In Symposium on Abstraction, Reformulation, and Approximation (SARA), pp. 26–44, Horseshoe Bay, TX, USA.
Parr, R. and Russell, S. (1997). Reinforcement learning with hierarchies of machines. In Neural Information Processing Systems Conference (NIPS), pp. 1043–1049, Denver, CO, USA.
Coverage	:	-
Abstract/Description	:	Institute for Communicating and Collaborative Systems.
This thesis focuses on the problem of scalable optimization of dialogue behaviour in speech-based conversational systems using reinforcement learning. Most previous investigations in dialogue strategy learning have proposed flat reinforcement learning methods, which are more suitable for small-scale spoken dialogue systems. This research formulates the problem in terms of Semi-Markov Decision Processes (SMDPs), and proposes two hierarchical reinforcement learning methods to optimize sub-dialogues rather than full dialogues. The first method uses a hierarchy of SMDPs, where every SMDP ignores irrelevant state variables and actions in order to optimize a sub-dialogue. The second method extends the first one by constraining every SMDP in the hierarchy with prior expert knowledge. The latter method proposes a learning algorithm called 'HAM+HSMQ-Learning', which combines two existing algorithms in the literature of hierarchical reinforcement learning. Whilst the first method generates fully-learnt behaviour, the second one generates semi-learnt behaviour. In addition, this research proposes a heuristic dialogue simulation environment for automatic dialogue strategy learning. Experiments were performed on simulated and real environments based on a travel planning spoken dialogue system. Experimental results provided evidence to support the following claims: First, both methods scale well at the cost of near-optimal solutions, resulting in slightly longer dialogues than the optimal solutions. Second, dialogue strategies learnt with coherent user behaviour and conservative recognition error rates can outperform a reasonable hand-coded strategy. Third, semi-learnt dialogue behaviours are a better alternative (because of their higher overall performance) than hand-coded or fully-learnt dialogue behaviours. Last, hierarchical reinforcement learning dialogue agents are feasible and promising for the (semi) automatic design of adaptive behaviours in larger-scale spoken dialogue systems.
This research makes the following contributions to spoken dialogue systems which learn their dialogue behaviour. First, the Semi-Markov Decision Process (SMDP) model was proposed to learn spoken dialogue strategies in a scalable way. Second, the concept of 'partially specified dialogue strategies' was proposed for simultaneously integrating hand-coded and learnt spoken dialogue behaviours into a single learning framework. Third, an evaluation with real users of hierarchical reinforcement learning dialogue agents was essential to validate their effectiveness in a realistic environment.
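The SMDP formulation mentioned in the abstract can be illustrated with the generic SMDP Q-learning update, in which an action (for example a whole sub-dialogue) may last several time steps before control returns to the parent. The sketch below is a minimal illustration of that textbook update only; it is not the thesis's HAM+HSMQ-Learning algorithm, and the function name, the state and action labels, and the reward values are hypothetical.

```python
from collections import defaultdict


def smdp_q_update(q, state, action, rewards, next_state, next_actions,
                  alpha=0.1, gamma=0.9):
    """Apply one SMDP Q-learning update for a temporally extended action.

    `rewards` holds the per-step rewards observed while `action` ran,
    so its length is the duration tau of the action.
    """
    tau = len(rewards)
    # Discounted sum of the rewards collected during the tau steps.
    cumulative = sum((gamma ** k) * r for k, r in enumerate(rewards))
    # Best estimated value in the resulting state (0.0 if it is terminal,
    # i.e. no further actions are available).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    # The bootstrap term is discounted by gamma**tau, not by gamma.
    target = cumulative + (gamma ** tau) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: a three-step sub-dialogue that ends the episode.
q_table = defaultdict(float)
smdp_q_update(q_table, "root", "get_destination", [-1.0, -1.0, 10.0],
              "done", [], alpha=0.5, gamma=0.5)
```

The only difference from one-step Q-learning is that the reward is accumulated over the action's duration and the bootstrap term is discounted by the same number of steps; in a hierarchical setting, one such table would be kept per sub-task.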
Bibliography	:
APA: Cuayáhuitl, Heriberto. (2552). Hierarchical Reinforcement Learning for Spoken Dialogue Systems. Bangkok: Edinburgh Research Archive, United Kingdom.
Chicago: Cuayáhuitl, Heriberto. 2552. "Hierarchical Reinforcement Learning for Spoken Dialogue Systems". Bangkok: Edinburgh Research Archive, United Kingdom.
MLA: Cuayáhuitl, Heriberto. "Hierarchical Reinforcement Learning for Spoken Dialogue Systems." Bangkok: Edinburgh Research Archive, United Kingdom, 2552. Print.
Vancouver: Cuayáhuitl, Heriberto. Hierarchical Reinforcement Learning for Spoken Dialogue Systems. Bangkok: Edinburgh Research Archive, United Kingdom; 2552.