Context-aware RNNLM Rescoring for Conversational Speech Recognition

Cited by: 1

Authors:

Wei, Kun [1]
Guo, Pengcheng [1]
Lv, Hang [1]
Tu, Zhen [2]
Xie, Lei [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Grp ASLP NPU, Xian, Peoples R China

[2] Zhuiyi Technol, Shenzhen, Peoples R China

Source:

2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2021

Keywords:

conversational speech recognition; recurrent neural network language model; lattice-rescoring; language model adaptation

DOI:

10.1109/ISCSLP49672.2021.9362109

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Conversational speech recognition is regarded as a challenging task because of its free-style speaking and long-term contextual dependencies. Prior work has modeled long-range context through RNNLM rescoring and reported improved performance. To further exploit information that persists throughout a conversation, such as topic or speaker turn, we extend the rescoring procedure to a new context-aware manner. For RNNLM training, we capture contextual dependencies by concatenating adjacent sentences with various tag words that carry, for example, speaker or intention information. For lattice rescoring, the lattices of adjacent sentences are likewise connected with the first-pass decoded result by tag words. We also adopt a selective concatenation strategy based on tf-idf, which uses contextual similarity to improve transcription performance. Results on four different conversation test sets show that our approach yields up to 13.1% and 6% relative character error rate (CER) reduction compared with first-pass decoding and common lattice rescoring, respectively.
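The selective concatenation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag word `<turn>`, the threshold value, the cosine similarity measure, and the function names are all assumptions made for illustration, and the idf statistics are computed over just the two sentences being compared, whereas a real system would presumably estimate them on a larger corpus.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one sparse tf-idf vector (a dict) per tokenized sentence."""
    n = len(sentences)
    # Document frequency: in how many sentences each token appears.
    df = Counter(tok for sent in sentences for tok in set(sent))
    vectors = []
    for sent in sentences:
        tf = Counter(sent)
        vectors.append({
            # Smoothed idf so that tokens shared by all sentences keep weight.
            tok: (count / len(sent)) * (math.log((1 + n) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_context_input(previous, current, tag="<turn>", threshold=0.1):
    """Prepend `previous` plus a tag word to `current` only if they are similar.

    `tag` and `threshold` are hypothetical values chosen for this sketch.
    """
    prev_vec, cur_vec = tfidf_vectors([previous, current])
    if cosine(prev_vec, cur_vec) >= threshold:
        return previous + [tag] + current
    return current
```

Under these assumptions, a previous sentence that shares vocabulary with the current one is joined to it through the tag word, while an unrelated previous sentence is dropped and the current sentence is passed to the language model on its own.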

Pages: 5

50 items in total

[1] Controllable Context-aware Conversational Speech Synthesis
Cong, Jian
Yang, Shan
Hu, Na
Li, Guangzhi
Xie, Lei
Su, Dan
INTERSPEECH 2021, 2021: 4658-4662
[2] A PRUNED RNNLM LATTICE-RESCORING ALGORITHM FOR AUTOMATIC SPEECH RECOGNITION
Xu, Hainan
Chen, Tongfei
Gao, Dongji
Wang, Yiming
Li, Ke
Goel, Nagendra
Carmiel, Yishay
Povey, Daniel
Khudanpur, Sanjeev
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5929-5933
[3] Towards Contrastive Context-Aware Conversational Emotion Recognition
Zhang, Hanqing
Song, Dawei
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13(4): 1879-1891
[4] VISUAL FEATURES FOR CONTEXT-AWARE SPEECH RECOGNITION
Gupta, Abhinav
Miao, Yajie
Neves, Leonardo
Metze, Florian
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017: 5020-5024
[5] CONTEXT-AWARE TRANSFORMER TRANSDUCER FOR SPEECH RECOGNITION
Chang, Feng-Ju
Liu, Jing
Radfar, Martin
Mouchtaris, Athanasios
Omologo, Maurizio
Rastrow, Ariya
Kunzmann, Siegfried
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021: 503-510
[6] CONTEXT-AWARE ATTENTION MECHANISM FOR SPEECH EMOTION RECOGNITION
Ramet, Gaetan
Garner, Philip N.
Baeriswyl, Michael
Lazaridis, Alexandros
2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018: 126-131
[7] Context-Aware Conversational Developer Assistants
Bradley, Nick C.
Fritz, Thomas
Holmes, Reid
PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2018: 993-1003
[8] Context-Aware Speech Recognition Using Prompts for Language Learners
Cheng, Jian
INTERSPEECH 2024, 2024: 4009-4013
[9] An Architecture for the Design of Context-Aware Conversational Agents
Griol, David
Sanchez-Pi, Nayat
Carbo, Javier
Molina, Jose M.
ADVANCES IN PRACTICAL APPLICATIONS OF AGENTS AND MULTIAGENT SYSTEMS, 2010, 70: 41-46
[10] A Context-Aware Conversational Agent in the Rehabilitation Domain
Mavropoulos, Thanassis
Meditskos, Georgios
Symeonidis, Spyridon
Kamateri, Eleni
Rousi, Maria
Tzimikas, Dimitris
Papageorgiou, Lefteris
Eleftheriadis, Christos
Adamopoulos, George
Vrochidis, Stefanos
Kompatsiaris, Ioannis
FUTURE INTERNET, 2019, 11(11)
