VTalk: A system for generating text-to-audio-visual speech

Cited by: 0
Authors
Kalra, Prem [1]
Kapoor, Ashish [1]
Kumar Goyal, Udit [1]
Affiliation
[1] Dept. of Computer Science and Eng., Indian Institute of Technology, New Delhi 110 016, India
DOI: not available
Pages: 307-314
Related papers
  • [1] VTalk: A system for generating text-to-audio-visual speech
    Kalra, P
    Kapoor, A
    Goyal, UK
    IETE TECHNICAL REVIEW, 2001, 18 (04) : 307 - 314
  • [2] Generating Intelligible Audio Speech From Visual Speech
    Le Cornu, Thomas
    Milner, Ben
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (09) : 1447 - 1457
  • [3] A Real-Time Text to Audio-Visual Speech Synthesis System
    Wang, Lijuan
    Qian, Xiaojun
    Ma, Lei
    Qian, Yao
    Chen, Yining
    Soong, Frank
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 2338 - +
  • [4] An audio-visual speech recognition system for testing new audio-visual databases
    Pao, Tsang-Long
    Liao, Wen-Yuan
    VISAPP 2006: PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOL 2, 2006, : 192 - +
  • [5] Using Audio Books for Training a Text-to-Speech System
    Chalamandaris, Aimilios
    Tsiakoulis, Pirros
    Karabetsos, Sotiris
    Raptis, Spyros
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 3076 - 3080
  • [6] Humanoid Audio-Visual Avatar With Emotive Text-to-Speech Synthesis
    Tang, Hao
    Fu, Yun
    Tu, Jilin
    Hasegawa-Johnson, Mark
    Huang, Thomas S.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2008, 10 (06) : 969 - 981
  • [7] Generating Audio-Visual Slideshows from Text Articles Using Word Concreteness
    Leake, Mackenzie
    Shin, Hijung Valentina
    Kim, Joy O.
    Agrawala, Maneesh
    PROCEEDINGS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI'20), 2020,
  • [8] Lips Detection for Audio-Visual Speech Recognition System
    Chin, Siew Wen
    Ang, Li-Minn
    Seng, Kah Phooi
    2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATIONS SYSTEMS (ISPACS 2008), 2008, : 311 - 314
  • [9] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
    Choi, Jeongsoo
    Park, Se Jin
    Kim, Minsu
    Ro, Yong Man
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 27315 - 27327
  • [10] Building audio-visual phonetically annotated Arabic corpus for expressive text to speech
    Abdo, Omnia
    Abdou, Sherif
    Fashal, Mervat
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3767 - 3771