Chinese spoken document summarization using probabilistic latent topical information

Cited by: 0
Authors
Chen, Berlin [1]
Yeh, Yao-Ming [1]
Huang, Yao-Min [1]
Chen, Yi-Ting [1]
Affiliation
[1] National Taiwan Normal University, Graduate Institute of Computer Science and Information Engineering, Taipei, Taiwan
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
The purpose of extractive summarization is to automatically select a number of indicative sentences, passages, or paragraphs from the original document according to a target summarization ratio and then sequence them to form a concise summary. In this paper, we propose the use of probabilistic latent topical information for extractive summarization of spoken documents. Various modeling structures and learning approaches were extensively investigated. In addition, the summarization capabilities were verified by comparison with the conventional vector space model and latent semantic indexing model, as well as the hidden Markov model (HMM). The experiments were performed on Chinese broadcast news collected in Taiwan. Noticeable performance gains were obtained.
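The selection step described in the abstract can be illustrated with a minimal sketch of the conventional vector space baseline, not the probabilistic latent topic approach the paper proposes. Everything here is an assumption made for illustration: the function names (`tokenize`, `cosine`, `summarize`), whitespace tokenization (Chinese text would need a word segmenter), TF-IDF weighting with sentence-level document frequencies, and a summarization ratio measured in sentences rather than words.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: whitespace tokens; real Chinese transcripts need segmentation.
    return sentence.lower().split()


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(sentences, ratio=0.3):
    """Pick the sentences most similar to the whole document, in source order."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Inverse document frequency, treating each sentence as a "document".
    df = Counter(t for toks in tokens for t in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}

    def tfidf(toks):
        return {t: c * idf[t] for t, c in Counter(toks).items()}

    doc_vec = tfidf([t for toks in tokens for t in toks])
    scores = [cosine(tfidf(toks), doc_vec) for toks in tokens]

    # Assumption: the ratio is a fraction of sentences; keep at least one.
    budget = max(1, int(ratio * n))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:budget]

    # Re-sequence the selected sentences in their original order.
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, ratio=0.5)` on a four-sentence document returns two sentences in their original order. A latent topic model would replace the cosine score with a probability of the document given each sentence, leaving the select-then-sequence step unchanged.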
Pages: 969-972
Page count: 4