Real-Time ASR from Meetings

Authors
Garner, Philip N. [1]
Dines, John [1]
Hain, Thomas [2]
El Hannani, Asmaa [2]
Karafiát, Martin [3]
Korchagin, Danil [1]
Lincoln, Mike [4]
Wan, Vincent [2]
Zhang, Le [4]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Univ Sheffield, Speech & Hearing Res Grp, Sheffield, S Yorkshire, England
[3] Brno Univ Technol, Speech Processing Grp, Brno, Czech Republic
[4] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Scotland
Keywords
real-time speech recognition; meeting ASR; beam-forming; speech meta-data
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The AMI(DA) system is a meeting room speech recognition system that has been developed and evaluated in the context of the NIST Rich Transcription (RT) evaluations. Recently, the "Distant Access" requirements of the AMIDA project have required the system to operate in real time. A further, more difficult requirement is that the system fit into a live meeting transcription scenario. We describe an infrastructure that has allowed the AMI(DA) system to evolve into one that fulfils these extra requirements. We emphasise the components that address the live and real-time aspects.
Pages: 2067 / +
Page count: 2