TRAED: Speech audio editing using imperfect transcripts

Cited by: 0

Authors:

Masoodian, Masood ^{[1]}

Rogers, Bill ^{[1]}

Ware, David ^{[1]}

McKoy, Sam ^{[1]}

Affiliations:

[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand

Source:

12TH INTERNATIONAL MULTI-MEDIA MODELLING CONFERENCE PROCEEDINGS | 2006

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Although digital recording of speech is widespread, and an increasing range of applications allow recording and inclusion of speech data in documents, editing and retrieval of speech audio remains generally a challenging task. We have previously developed a speech audio editing and browsing application which utilizes imperfect transcripts of speech as a mechanism for text-based editing and retrieval of speech audio documents. This paper presents a second prototype, called TRAED, which enhances the functionality provided by our earlier prototype, and further facilitates the task of speech audio editing and access.

Pages: 454-459

Page count: 6
