THE FOSAFER SYSTEM FOR THE ICASSP2024 IN-CAR MULTI-CHANNEL AUTOMATIC SPEECH RECOGNITION CHALLENGE

Authors
Huang, Shangkun [1]
Du, Yuxuan [1]
Wang, Yankai [1]
Deng, Jing [1]
Zheng, Rong [1]
Affiliations
[1] Beijing Fosafer Informat Technol Co Ltd, Beijing, Peoples R China
Keywords
Robust automatic speech recognition; self-supervised learning representation; speech enhancement; speaker diarization
DOI
10.1109/ICASSPW62465.2024.10625781
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper presents Fosafer's submissions to the ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge (ICMC-ASR), which include both Automatic Speech Recognition (ASR) and Automatic Speech Diarization and Recognition (ASDR) systems. In Track 1, a robust ASR system with data augmentation, self-supervised learning representation (SSLR), and speech enhancement (SE) achieved second place. In Track 2, a system that exploited several different speaker diarization algorithms achieved fifth place.
Pages: 5-6
Number of pages: 2