TRAED: Speech audio editing using imperfect transcripts

Times cited: 0
Authors
Masoodian, Masood [1 ]
Rogers, Bill [1 ]
Ware, David [1 ]
McKoy, Sam [1 ]
Affiliations
[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although digital recording of speech is widespread, and an increasing range of applications allow recording and inclusion of speech data in documents, editing and retrieval of speech audio generally remains a challenging task. We have previously developed a speech audio editing and browsing application which utilizes imperfect transcripts of speech as a mechanism for text-based editing and retrieval of speech audio documents. This paper presents a second prototype, called TRAED, which enhances the functionality provided by our earlier prototype, and further facilitates the task of speech audio editing and access.
Pages: 454-459
Number of pages: 6
Related papers
50 in total
  • [1] A GAN Speech Inpainting Model for Audio Editing Software
    Zhao, Haixin
    INTERSPEECH 2023, 2023, : 5127 - 5131
  • [2] Improving Sentence-level Alignment of Speech with Imperfect Transcripts using Utterance Concatenation and VAD
    Moldovan, Alexandru
    Stan, Adriana
    Giurgiu, Mircea
    2016 IEEE 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING (ICCP), 2016, : 171 - 174
  • [3] Human detection of political speech deepfakes across transcripts, audio, and video
    Groh, Matthew
    Sankaranarayanan, Aruna
    Singh, Nikhil
    Kim, Dong Young
    Lippman, Andrew
    Picard, Rosalind
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [4] Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts
    Gao, Dongji
    Wiesner, Matthew
    Xu, Hainan
    Garcia, Leibny Paola
    Povey, Daniel
    Khudanpur, Sanjeev
    INTERSPEECH 2023, 2023, : 924 - 928
  • [5] Cheating with imperfect transcripts
    Placeway, P
    Lafferty, J
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2115 - 2118
  • [6] AUDIO CONTROL AND EDITING USING TIME CODE
    THIRKELL, GJ
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 1984, 32 (11): : 920 - 920
  • [7] Enhancing Audio Speech using Visual Speech Features
    Almajai, Ibrahim
    Milner, Ben
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1915 - 1918
  • [8] Integrating imperfect transcripts into speech recognition systems for building high-quality corpora
    Lecouteux, Benjamin
    Linares, Georges
    Oger, Stanislas
    COMPUTER SPEECH AND LANGUAGE, 2012, 26 (02): : 67 - 89
  • [9] Speech driven video editing via an audio-conditioned diffusion model
    Bigioi, Dan
    Basak, Shubhajit
    Stypulkowski, Michal
    Zieba, Maciej
    Jordan, Hugh
    Mcdonnell, Rachel
    Corcoran, Peter
    IMAGE AND VISION COMPUTING, 2024, 142
  • [10] TED Talk Recommender Using Speech Transcripts
    Oh, Jaehoon
    Lee, Injung
    Seonwoo, Yeon
    Sung, Simin
    Kwon, Ilbong
    Lee, Jae-Gil
    2018 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2018, : 598 - 600