Application for Real-time Personalized Speaker Extraction

Cited by: 0
Authors
Ronssin, Damien [1]
Cernak, Milos [1]
Affiliations
[1] Logitech Europe SA, CH-1015 Lausanne, Switzerland
Source
Keywords
speaker extraction; personalized speech enhancement; real-time audio processing; speech separation;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
This short paper demonstrates a desktop audio processing application that, after a short enrollment phase, isolates the voice of a specific speaker in real time from possibly noisy audio input. The machine learning model embedded in the application suppresses all sounds other than the target voice in the incoming audio stream, including distracting voices from other speakers. In the context of a growing need for video-collaboration solutions, personalized speech enhancement enables the use of such technologies in more challenging acoustic environments, i.e., in the presence of nearby distractor speech. In this situation, classical speech enhancement systems typically fail because they do not filter out any speech, hence the need for personalized methods. The presented application is an all-in-one solution for personalized speech enhancement: it lets the user enroll and then apply the effect seamlessly in one-to-one or one-to-many online meetings.
Pages: 1955-1956
Number of pages: 2