Wikipedia Based News Video Topic Modeling for Information Extraction

Cited by: 0
Authors
Roy, Sujoy [1]
Mak, Mun-Thye [1]
Wan, Kong Wah [1]
Affiliation
[1] ASTAR, Inst Infocomm Res, Singapore, Singapore
Source
Keywords
Video Topic Modeling; Non-linear Search; Wikipedia
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Determining the topic of a news video story (NVS) from its audio-visual footage is an important part of metadata generation. In this paper we propose a news story topic modeling approach that uses online knowledge resources such as Wikipedia to model the topic of a news story. An NVS is modeled as a distribution over several Wikipedia pages related to the story. The mapping of the NVS to the table of contents (TOC) of a Wikipedia page is also determined. The specific advantages of this topic modeling approach are: (1) the topic is interpretable as a weighted distribution over a set of semantically meaningful story title phrases instead of just a collection of words; (2) it facilitates organizing news video stories as a taxonomy that captures several perspectives on the story; (3) the taxonomy facilitates exploration and non-linear search. Performance evaluations from an information extraction perspective validate the efficacy of the proposed topic modeling approach compared to TF-IDF and LDA based approaches on a large news video corpus.
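The idea of representing a story as a weighted distribution over Wikipedia page titles can be illustrated with a minimal sketch. The abstract does not specify how the weights are computed, so the TF-IDF cosine scoring below is an assumption made purely for illustration and is not the authors' method; the page texts, the story transcript, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `topic_distribution`) are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Build smoothed TF-IDF vectors for a dict of {title: text}.

    Returns (vectors, idf) so the same idf table can weight a new story.
    """
    tokens = {title: tokenize(text) for title, text in docs.items()}
    n = len(docs)
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = {
        title: {t: c * idf[t] for t, c in Counter(toks).items()}
        for title, toks in tokens.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_distribution(story_text, pages):
    """Score the story against each page and normalize scores to sum to 1."""
    page_vecs, idf = tfidf_vectors(pages)
    counts = Counter(tokenize(story_text))
    # Terms that never occur in any page carry no evidence and are dropped.
    story_vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scores = {title: cosine(story_vec, vec) for title, vec in page_vecs.items()}
    total = sum(scores.values())
    if total == 0:
        return {title: 0.0 for title in scores}
    return {title: s / total for title, s in scores.items()}


# Hypothetical stand-ins for Wikipedia page text (assumption, toy data).
pages = {
    "Earthquake and tsunami": "earthquake tsunami japan coast magnitude wave damage",
    "Nuclear disaster": "nuclear reactor meltdown radiation japan plant evacuation",
    "Football tournament": "football tournament teams goal final match",
}
# Hypothetical speech transcript of a news video story (assumption, toy data).
story = (
    "a powerful earthquake triggered a tsunami that hit the japan coast "
    "and damaged a nuclear plant"
)

dist = topic_distribution(story, pages)
for title, weight in sorted(dist.items(), key=lambda kv: -kv[1]):
    print(f"{title}: {weight:.3f}")
```

The resulting topic is a set of weighted page titles rather than a bag of words, which is the interpretability property the abstract describes. A real system would need actual Wikipedia text, a transcript from speech recognition or captions, and whatever scoring the paper actually uses.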
Pages: 411-420
Page count: 10