| Title | : | Hierarchical Reinforcement Learning for Spoken Dialogue Systems |
| Researcher | : | Cuayáhuitl, Heriberto |
| Keywords | : | Spoken dialogue systems, Semi-automatic dialogue strategy design, Hierarchical control, Prior expert knowledge, Semi-Markov decision processes, Hierarchical reinforcement learning |
| Organization | : | Edinburgh Research Archive, United Kingdom |
| Collaborator | : | Renals, Steve |
| Year of publication | : | 2552 B.E. (2009 CE) |
| Reference | : | http://hdl.handle.net/1842/2750 |
| Source | : | - |
| Expertise | : | - |
| Relation | : | Cuayáhuitl, H., Renals, S., Lemon, O., and Shimodaira, H. (2006). Reinforcement learning of dialogue strategies using hierarchical abstract machines. In IEEE Workshop on Spoken Language Technology (SLT), pp. 182–185, Palm Beach, Aruba. Cuayáhuitl, H., Renals, S., Lemon, O., and Shimodaira, H. (2007). Hierarchical dialogue optimization using Semi-Markov decision processes. In INTERSPEECH, pp. 2693–2696, Antwerp, Belgium. Levin, E., Pieraccini, R., and Eckert, W. (2000). A stochastic model of human-machine interaction for learning dialog strategies. IEEE Transactions on Speech and Audio Processing, 8(1):11–23. Dietterich, T. (2000). An overview of MAXQ hierarchical reinforcement learning. In Symposium on Abstraction, Reformulation, and Approximation (SARA), pp. 26–44, Horseshoe Bay, TX, USA. Parr, R. and Russell, S. (1997). Reinforcement learning with hierarchies of machines. In Neural Information Processing Systems Conference (NIPS), pp. 1043–1049, Denver, CO, USA. |
| Scope of content | : | - |
| Abstract/Description | : | Institute for Communicating and Collaborative Systems. This thesis addresses the scalable optimization of dialogue behaviour in speech-based conversational systems using reinforcement learning. Most previous work on dialogue strategy learning has proposed flat reinforcement learning methods, which are better suited to small-scale spoken dialogue systems. This research formulates the problem in terms of Semi-Markov Decision Processes (SMDPs) and proposes two hierarchical reinforcement learning methods that optimize sub-dialogues rather than full dialogues. The first method uses a hierarchy of SMDPs, in which every SMDP ignores irrelevant state variables and actions in order to optimize a sub-dialogue. The second method extends the first by constraining every SMDP in the hierarchy with prior expert knowledge; it introduces a learning algorithm called 'HAM+HSMQ-Learning', which combines two existing algorithms from the hierarchical reinforcement learning literature. Whilst the first method generates fully-learnt behaviour, the second generates semi-learnt behaviour. In addition, this research proposes a heuristic dialogue simulation environment for automatic dialogue strategy learning. Experiments were performed in simulated and real environments based on a travel planning spoken dialogue system. The experimental results support the following claims. First, both methods scale well at the cost of near-optimal solutions, resulting in slightly longer dialogues than the optimal solutions. Second, dialogue strategies learnt with coherent user behaviour and conservative recognition error rates can outperform a reasonable hand-coded strategy. Third, semi-learnt dialogue behaviours are a better alternative than hand-coded or fully-learnt dialogue behaviours because of their higher overall performance. Last, hierarchical reinforcement learning dialogue agents are feasible and promising for the (semi-)automatic design of adaptive behaviours in larger-scale spoken dialogue systems. This research makes the following contributions to spoken dialogue systems that learn their dialogue behaviour. First, the Semi-Markov Decision Process (SMDP) model was proposed to learn spoken dialogue strategies in a scalable way. Second, the concept of 'partially specified dialogue strategies' was proposed for integrating hand-coded and learnt spoken dialogue behaviours simultaneously into a single learning framework. Third, an evaluation of hierarchical reinforcement learning dialogue agents with real users was essential to validate their effectiveness in a realistic environment. |
| Bibliography | : |
Cuayáhuitl, Heriberto. (2552). Hierarchical Reinforcement Learning for Spoken Dialogue Systems. Bangkok: Edinburgh Research Archive, United Kingdom.
Cuayáhuitl, Heriberto. 2552. "Hierarchical Reinforcement Learning for Spoken Dialogue Systems". Bangkok: Edinburgh Research Archive, United Kingdom.
Cuayáhuitl, Heriberto. "Hierarchical Reinforcement Learning for Spoken Dialogue Systems." Bangkok: Edinburgh Research Archive, United Kingdom, 2552. Print.
Cuayáhuitl, Heriberto. Hierarchical Reinforcement Learning for Spoken Dialogue Systems. Bangkok: Edinburgh Research Archive, United Kingdom; 2552.
|
