Story Segmentation and Topic Classification of Broadcast News via a Topic-Based Segmental Model and a Genetic Algorithm

Cited by: 13
Authors
Wu, Chung-Hsien [1]
Hsieh, Chia-Hsin [2]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan
[2] Inst Informat Ind, Kaohsiung 806, Taiwan
Keywords
Genetic algorithm (GA); segmental model; story segmentation; topic classification; SPEECH; TEXT;
DOI
10.1109/TASL.2009.2021304
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a two-stage approach to story segmentation and topic classification of broadcast news. The two-stage paradigm adopts a decision tree and a maximum entropy model to identify the potential story boundaries in the broadcast news within a sliding window. The problem for story segmentation is thus transformed to the determination of a boundary position sequence from the potential boundary regions. A genetic algorithm is then applied to determine the chromosome, which corresponds to the final boundary position sequence. A topic-based segmental model is proposed to define the fitness function applied in the genetic algorithm. The syllable- and word-based story segmentation schemes are adopted to evaluate the proposed approach. Experimental results indicate that a miss probability of 0.1587 and a false alarm probability of 0.0859 are achieved for story segmentation on the collected broadcast news corpus. On the TDT-3 Mandarin audio corpus, a miss probability of 0.1232 and a false alarm probability of 0.1298 are achieved. Moreover, an outside classification accuracy of 74.55% is obtained for topic classification on the collected broadcast news, while an inside classification accuracy of 88.82% is achieved on the TDT-2 Mandarin audio corpus.
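The second stage described in the abstract can be illustrated with a minimal sketch: a genetic algorithm whose chromosome holds one boundary position per candidate region, scored by a segment-level fitness function. This is not the authors' implementation. The toy data, the region ranges, the majority-topic fitness (a crude stand-in for the topic-based segmental model, which the abstract does not specify), and every function name and parameter below are assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical toy input: one topic label per sentence. A real system would
# work on speech recognition output scored by a topic model.
SENTENCES = ["sports", "sports", "sports", "weather", "weather",
             "weather", "weather", "finance", "finance", "finance"]

# Hypothetical candidate boundary regions (inclusive index ranges), standing in
# for the output of the first stage. A boundary at p means a story starts at p.
REGIONS = [(2, 4), (6, 8)]


def segments(boundaries, n):
    """Turn boundary positions into half-open (start, end) segments."""
    edges = [0] + sorted(boundaries) + [n]
    return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]


def fitness(chromosome, sentences=SENTENCES):
    """Assumed toy fitness: sentences that match their segment's majority topic."""
    score = 0
    for start, end in segments(chromosome, len(sentences)):
        if end > start:
            score += Counter(sentences[start:end]).most_common(1)[0][1]
    return score


def random_chromosome(rng):
    """One gene per region: a boundary position drawn inside that region."""
    return [rng.randint(lo, hi) for lo, hi in REGIONS]


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(chromosome, rng, rate=0.2):
    """Resample a gene inside its own region with probability `rate`."""
    return [rng.randint(lo, hi) if rng.random() < rate else gene
            for gene, (lo, hi) in zip(chromosome, REGIONS)]


def run_ga(pop_size=12, generations=30, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = run_ga()
    print(best, fitness(best))
```

In this toy setup the unique best chromosome is `[3, 7]`, which places both boundaries exactly at the topic changes and scores 10. Because the search is stochastic, the sketch makes no promise about which chromosome a given run returns; the elitist selection only guarantees that the best fitness never decreases between generations.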
Pages: 1612-1623
Page count: 12