TRAED: Speech audio editing using imperfect transcripts

Cited by: 0
Authors
Masoodian, Masood [1 ]
Rogers, Bill [1 ]
Ware, David [1 ]
McKoy, Sam [1 ]
Affiliation
[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
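Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes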
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although digital recording of speech is widespread, and an increasing range of applications allow recording and inclusion of speech data in documents, editing and retrieval of speech audio generally remains a challenging task. We have previously developed a speech audio editing and browsing application which utilizes imperfect transcripts of speech as a mechanism for text-based editing and retrieval of speech audio documents. This paper presents a second prototype, called TRAED, which enhances the functionality provided by our earlier prototype, and further facilitates the task of speech audio editing and access.
Pages: 454-459
Number of pages: 6