A prototype of Audio-Visual Broadcast Transcription System

Cited by: 0

Authors:

Chaloupka, Josef [1]

Affiliation:

[1] Tech Univ Liberec, Inst Informat Technol & Elect, Liberec, Czech Republic

Source:

2019 42ND INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP) | 2019

Keywords:

automatic broadcast transcription; LVCSR; machine vision;

DOI:

10.1109/tsp.2019.8769103

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper focuses on the use of methods and algorithms from speech processing and recognition and from machine vision in the design of a system for automatic audio-visual broadcast transcription. The resulting audio-visual system was designed and built mainly for transcribing huge video databases of TV recordings. Visual signal processing and recognition is usually several times more computationally demanding than audio signal processing and recognition. Therefore, all applied machine vision methods and algorithms were chosen with regard to low computing time as well as the highest possible recognition rate. The proposed broadcast transcription system was extended with several modules: visual signal segmentation, TV channel identification, face detection and identification, and Optical Character Recognition (OCR).

Pages: 543-547

Page count: 5

50 related records in total

[1] Design of Audio-Visual TV Broadcast News Transcription System Prototype
Chaloupka, Josef
53RD INTERNATIONAL SYMPOSIUM ELMAR-2011, 2011, : 209 - 212
[2] Logo Detection and Identification in System for Audio-Visual Broadcast Transcription
Palecek, Karel
Chaloupka, Josef
2021 44TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2021, : 357 - 360
[3] Optical Character Recognition for Audio-Visual Broadcast Transcription System
Chaloupka, Josef
Palecek, Karel
Cerva, Petr
Zdansky, Jindrich
2020 11TH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2020), 2020, : 229 - 232
[4] Audio-visual speaker recognition for video broadcast news
Maison, B
Neti, C
Senior, A
JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2001, 29 (1-2): : 71 - 79
[5] Audio-Visual Speaker Recognition for Video Broadcast News
Benoît Maison
Chalapathy Neti
Andrew Senior
Journal of VLSI signal processing systems for signal, image and video technology, 2001, 29 : 71 - 79
[6] Semi-supervised Cross-domain Visual Feature Learning for Audio-Visual Broadcast Speech Transcription
Su, Rongfeng
Liu, Xunying
Wang, Lan
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3509 - 3513
[7] NEW AUDIO-VISUAL SYSTEM
Unknown
EDUCATIONAL TECHNOLOGY, 1967, 7 (16) : 11 - 13
[8] An audio-visual speech recognition system for testing new audio-visual databases
Pao, Tsang-Long
Liao, Wen-Yuan
VISAPP 2006: PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOL 2, 2006, : 192 - +
[9] Audio-visual vibraphone transcription in real time
Tavares, Tiago F.
Odowichuck, Gabrielle
Zehtabi, Sonmaz
Tzanetakis, George
2012 IEEE 14TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2012, : 215 - 220
[10] ADVANCES IN ONLINE AUDIO-VISUAL MEETING TRANSCRIPTION
Yoshioka, Takuya
Abramovski, Igor
Aksoylar, Cem
Chen, Zhuo
David, Moshe
Dimitriadis, Dimitrios
Gong, Yifan
Gurvich, Ilya
Huang, Xuedong
Huang, Yan
Hurvitz, Aviv
Jiang, Li
Koubi, Sharon
Krupka, Eyal
Leichter, Ido
Liu, Changliang
Parthasarathy, Partha
Vinnikov, Alon
Wu, Lingfeng
Xiao, Xiong
Xiong, Wayne
Wang, Huaming
Wang, Zhenghao
Zhang, Jun
Zhao, Yong
Zhou, Tianyan
2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 276 - 283
