Evaluation of Real-Time Deep Learning Turn-Taking Models for Multiple Dialogue Scenarios

Cited by: 22
Authors
Lala, Divesh [1]
Inoue, Koji [1]
Kawahara, Tatsuya [1]
Affiliation
[1] Kyoto University, Graduate School of Informatics, Kyoto, Japan
Keywords
dialogue systems; turn-taking; evaluation methods; deep learning; neural networks
DOI
10.1145/3242969.3242994
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The task of identifying when to take a conversational turn is an important function of spoken dialogue systems. The turn-taking system should also ideally be able to handle many types of dialogue, from structured conversation to spontaneous and unstructured discourse. Our goal is to determine how much a generalized model trained on many types of dialogue scenarios would improve on a model trained only for a specific scenario. To achieve this goal we created a large corpus of Wizard-of-Oz conversation data which consisted of several different types of dialogue sessions, and then compared a generalized model with scenario-specific models. For our evaluation we go further than simply reporting conventional metrics, which we show are not informative enough to evaluate turn-taking in a real-time system. Instead, we process results using a performance curve of latency and false cut-in rate, and further improve our model's real-time performance using a finite-state turn-taking machine. Our results show that the generalized model greatly outperformed the individual model for attentive listening scenarios but was worse in job interview scenarios. This implies that a model based on a large corpus is better suited to conversation which is more user-initiated and unstructured. We also propose that our method of evaluation leads to more informative performance metrics in a real-time system.
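The latency versus false cut-in trade-off mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 100 ms frame length, the per-frame probability input, and the choice to skip utterances where the threshold is never reached are all assumptions made for illustration. The idea is that sweeping a decision threshold over a model's end-of-turn probabilities yields one (false cut-in rate, mean latency) point per threshold.

```python
from typing import List, Optional, Sequence, Tuple

FRAME_MS = 100  # assumed frame length in milliseconds (hypothetical)

# Each utterance: (per-frame end-of-turn probabilities, index of the first
# frame after the user has actually finished speaking). Hypothetical format.
Utterance = Tuple[Sequence[float], int]


def first_crossing(probs: Sequence[float], threshold: float) -> Optional[int]:
    """Return the first frame index whose probability reaches the threshold."""
    for i, p in enumerate(probs):
        if p >= threshold:
            return i
    return None


def latency_cutin_point(
    utterances: List[Utterance], threshold: float, frame_ms: int = FRAME_MS
) -> Tuple[float, Optional[float]]:
    """Return (false cut-in rate, mean latency in ms) for one threshold."""
    cut_ins = 0
    latencies = []
    for probs, end_frame in utterances:
        idx = first_crossing(probs, threshold)
        if idx is None:
            continue  # system never takes the turn; skipped in this sketch
        if idx < end_frame:
            cut_ins += 1  # turn taken while the user was still speaking
        else:
            latencies.append((idx - end_frame) * frame_ms)
    rate = cut_ins / len(utterances) if utterances else 0.0
    mean_latency = sum(latencies) / len(latencies) if latencies else None
    return rate, mean_latency


def performance_curve(utterances: List[Utterance], thresholds: Sequence[float]):
    """Sweep thresholds to trace the latency / false cut-in curve."""
    return [(t, *latency_cutin_point(utterances, t)) for t in thresholds]


if __name__ == "__main__":
    data = [([0.1, 0.2, 0.6, 0.9], 2), ([0.1, 0.7, 0.8, 0.9], 3)]
    for t, rate, latency in performance_curve(data, [0.5, 0.85]):
        print(f"threshold={t}: false cut-in rate={rate}, mean latency={latency} ms")
```

A lower threshold makes the system respond sooner but cut in more often; a higher threshold does the opposite, which is why a single accuracy or F1 score hides the behaviour that matters in a live system.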
Pages: 78-86
Page count: 9