Application for Real-time Personalized Speaker Extraction

Cited by: 0

Authors:

Ronssin, Damien [1]

Cernak, Milos [1]

Affiliations:

[1] Logitech Europe SA, CH-1015 Lausanne, Switzerland

Source:

INTERSPEECH 2022 | 2022

Keywords:

speaker extraction; personalized speech enhancement; real-time audio processing; speech separation;

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This short paper demonstrates an audio processing desktop application that, after a short enrollment phase, isolates in real time the voice of a specific speaker from possibly noisy audio input. The machine learning model embedded in this application suppresses all sounds other than the target voice in the incoming audio stream, including disturbing distractor voices. In the context of a growing need for video-collaboration solutions, personalized speech enhancement enables the use of such technologies in more challenging acoustic environments, i.e., in the presence of nearby distractor speech. In this situation, classical speech enhancement systems typically fail because they do not filter out any speech, hence the need for personalized methods. The presented application is an all-in-one solution for personalized speech enhancement: it allows the user to enroll and then apply the effect seamlessly in one-to-one or one-to-many online meetings.


Pages: 1955-1956

Page count: 2
