Machine Comprehension of Spoken Content: TOEFL Listening Test and Spoken SQuAD

Cited by: 7
Authors
Lee, Chia-Hsuan [1]
Lee, Hung-yi [2]
Wu, Szu-Lin [3]
Liu, Chi-Liang [4]
Fang, Wei [5]
Hsu, Juei-Yang [3]
Tseng, Bo-Hsiang [6]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Networking & Multimedia, Taipei 10617, Taiwan
[2] Natl Taiwan Univ, Dept Elect Engn, Taipei 10617, Taiwan
[3] Natl Taiwan Univ, Grad Inst Elect Engn, Taipei 10617, Taiwan
[4] Natl Taiwan Univ, Grad Inst Commun, Taipei 10617, Taiwan
[5] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA
[6] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
Keywords
Speech question answering; TOEFL; SQuAD; attention model; deep learning; SPEECH RECOGNITION ERRORS; DYNAMIC MEMORY NETWORKS; QUESTION; IMPACT;
DOI
10.1109/TASLP.2019.2913499
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
A user can scan through text easily, but this is not the case for spoken content, because it cannot be directly displayed on-screen. As a result, accessing large collections of spoken content is much more difficult and time-consuming than doing so for text content. It would therefore be helpful to develop machines that understand spoken content. In this paper, we propose two new tasks for machine comprehension of spoken content. The first is a listening comprehension test for TOEFL, a challenging academic English examination for English learners who are not native English speakers. We show that the proposed model outperforms the naive approaches and other neural network based models by exploiting the hierarchical structures of natural languages and the selective power of the attention mechanism. For the second listening comprehension task - Spoken SQuAD - we find that speech recognition errors severely impair machine comprehension; we propose the use of subword units to mitigate the impact of these errors.
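The intuition behind subword units can be illustrated with a small sketch. The abstract does not say which subword units the paper uses, so the example below assumes character trigrams as a stand-in; the function names (`char_ngrams`, `word_match`, `subword_match`) are hypothetical and are not taken from the paper. It shows only that a misrecognized word, which fails an exact word-level match, can still overlap heavily with the reference at the subword level.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, with boundary markers.

    Assumption: character n-grams stand in for the paper's subword units.
    """
    padded = f"<{word.lower()}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_match(ref, hyp):
    """Word-level comparison: all or nothing."""
    return 1.0 if ref.lower() == hyp.lower() else 0.0


def subword_match(ref, hyp, n=3):
    """Subword-level comparison: partial credit for shared n-grams."""
    return jaccard(char_ngrams(ref, n), char_ngrams(hyp, n))


if __name__ == "__main__":
    reference = "comprehension"
    recognized = "comprehensions"  # a plausible recognition error
    print(word_match(reference, recognized))
    print(subword_match(reference, recognized))
```

In this toy case the word-level score drops to zero while the trigram overlap stays high, which is the kind of robustness to recognition errors that motivates subword representations.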
Pages: 1469-1480
Page count: 12