Speech recognition for medical conversations

Cited by: 0

Authors:

Chiu, Chung-Cheng ^{[1]}

Tripathi, Anshuman

Chou, Katherine

Co, Chris

Jaitly, Navdeep

Jaunzeikare, Diana

Kannan, Anjuli

Nguyen, Patrick

Sak, Hasim

Sankar, Ananth ^{[1,2]}

Tansuwan, Justin

Wan, Nathan ^{[1]}

Wu, Yonghui

Zhang, Xuedong ^{[1]}

Affiliations:

[1] Google, Mountain View, CA USA

[2] LinkedIn, Mountain View, CA USA

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

medical transcription; conversational transcription; end-to-end attention models; CTC

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we document our experiences with developing speech recognition for medical transcription, a system that automatically transcribes doctor-patient conversations. Towards this goal, we built a system along two different methodological lines: a Connectionist Temporal Classification (CTC) phoneme-based model and a Listen, Attend and Spell (LAS) grapheme-based model. To train these models we used a corpus of anonymized conversations representing approximately 14,000 hours of speech. Because of noisy transcripts and alignments in the corpus, a significant amount of effort was invested in data cleaning. We describe a two-stage strategy we followed for segmenting the data. The data cleanup and the development of a matched language model were essential to the success of the CTC-based models. The LAS-based models, however, were found to be resilient to alignment and transcript noise and did not require the use of language models. CTC models achieved a word error rate of 20.1%, and the LAS models achieved 18.3%. Our analysis shows that both models perform well on important medical utterances and therefore can be practical for transcribing medical conversations.
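The word error rates quoted in the abstract are conventionally defined as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' scoring procedure, which this record does not describe. It assumes whitespace tokenization with no text normalization; the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly
    (no casing or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one dropped word out of six reference words.
example_wer = word_error_rate(
    "the patient has a mild fever",
    "the patient has mild fever",
)
```

Because insertions count as errors, a rate computed this way can exceed 1.0 when the hypothesis is much longer than the reference.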


Pages: 2972-2976

Page count: 5
