Reducing the cost of metadata generation by using video/audio indexing and natural language processing techniques

Cited by: 0

Authors:

Kuwano, Hidetaka [1]

Matsuo, Yoshihiro [2]

Kawazoe, Katsuhiko [2]

Affiliations:

[1] NTT Cyber Solutions Laboratories, Yokosuka-shi, 239-0947, Japan

[2] NTT Cyber Solutions Laboratories

Source:

NTT Technical Review | 2004 / Vol. 2 / No. 8

Keywords:

Character recognition - Costs - Image segmentation - Indexing (of information) - Natural language processing systems - Network protocols - Semantics - Speech recognition - Television broadcasting - User interfaces

DOI:

Not available

Abstract:

Reducing the cost of generating metadata will allow more broadcast content to be transmitted with advanced viewing options. In this article, we describe SceneCabinet, a system that automatically extracts scene-based semantic metadata from video content. It extracts meaningful video segments and their associated textual information, such as the title, synopsis, and keywords, by applying natural language processing to the results of speech recognition and on-screen text recognition. It can also import video program scripts and use them for automatic keyword extraction. SceneCabinet provides an intuitive user interface that includes a video browser with key images detected automatically from scene changes, on-screen text, camerawork, speech, and music information. Experiments showed that SceneCabinet can significantly reduce metadata generation costs.

Pages: 68-74
