Spoken document summarization using topic-related corpus and semantic dependency grammar

Cited by: 0
Authors
Hsieh, CH [1 ]
Huang, CL [1 ]
Wu, CH [1 ]
Affiliation
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This study presents a spoken document summarization scheme that uses a topic-related corpus and semantic dependency grammars. The summarization score combines speech recognition confidence, word significance, word trigrams, semantic dependency grammar (SDG), and probabilistic context-free grammar (PCFG). In addition, a topic-related corpus consisting of keywords and articles is used to estimate the word significance score with latent semantic indexing (LSI). Semantic relations between words are determined by the SDG using HowNet and the Sinica Treebank. A dynamic programming algorithm is applied to decide the summarization ratio and to search for the best summary according to the summarization scores. Experimental results indicate that the proposed approach effectively extracts important words with semantic dependency and yields a promising speech summary.
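The dynamic programming search mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the summary length `m` is fixed in advance (the paper also decides the summarization ratio), that the confidence and significance terms have already been merged into one hypothetical per-word value in `word_score`, and that a hypothetical `link_score` function stands in for the trigram, SDG, and PCFG terms between adjacent summary words.

```python
from typing import Callable, List, Sequence, Tuple


def select_words(
    words: Sequence[str],
    word_score: Sequence[float],
    link_score: Callable[[str, str], float],
    m: int,
) -> Tuple[List[str], float]:
    """Pick m words, in their original order, maximizing the total score.

    Total score = sum of per-word scores plus link scores between
    consecutive selected words. All scoring inputs are illustrative
    placeholders, not the scores defined in the paper.
    """
    n = len(words)
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= len(words)")

    neg_inf = float("-inf")
    # best[k][j]: best score of a (k + 1)-word summary whose last word is j.
    best = [[neg_inf] * n for _ in range(m)]
    # back[k][j]: index of the previous selected word on that best path.
    back = [[-1] * n for _ in range(m)]

    for j in range(n):
        best[0][j] = word_score[j]

    for k in range(1, m):
        for j in range(k, n):
            for i in range(k - 1, j):
                cand = best[k - 1][i] + link_score(words[i], words[j]) + word_score[j]
                if cand > best[k][j]:
                    best[k][j] = cand
                    back[k][j] = i

    # The final word of an m-word summary can only sit at index m - 1 or later.
    end = max(range(m - 1, n), key=lambda j: best[m - 1][j])

    # Follow the back-pointers to recover the selected indices.
    indices = []
    k, j = m - 1, end
    while k >= 0:
        indices.append(j)
        j = back[k][j]
        k -= 1
    indices.reverse()

    return [words[i] for i in indices], best[m - 1][end]
```

Calling `select_words(tokens, scores, link, m)` returns the chosen words and their total score. Running it for several values of `m` and comparing length-normalized scores would be one possible way to choose a summarization ratio, though the abstract does not say how the paper does this.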
Pages: 333-336
Number of pages: 4
Related papers
50 in total
  • [41] Using Latent Semantic Indexing for Morph-based Spoken Document Retrieval
    Turunen, Ville T.
    Kurimo, Mikko
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 341 - 344
  • [42] Unsupervised neural networks for automatic Arabic text summarization using document clustering and topic modeling
    Alami, Nabil
    Meknassi, Mohammed
    En-nahnahi, Noureddine
    El Adlouni, Yassine
    Ammor, Ouafae
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 172
  • [43] Web Document Categorization Using Knowledge Graph and Semantic Textual Topic Detection
    Rinaldi, Antonio M.
    Russo, Cristiano
    Tommasino, Cristian
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS, ICCSA 2021, PT III, 2021, 12951 : 40 - 51
  • [44] Cross-Document Knowledge Discovery Using Semantic Concept Topic Model
    Li, Xin
    Jin, Wei
    2016 15TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2016), 2016, : 108 - 114
  • [45] An efficient single document Arabic text summarization using a combination of statistical and semantic features
    Qaroush, Aziz
    Abu Farha, Ibrahim
    Ghanem, Wasel
    Washaha, Mahdi
    Maali, Eman
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2021, 33 (06) : 677 - 692
  • [46] Personalized Document Summarization Using Non-negative Semantic Feature and Non-negative Semantic Variable
    Park, Sun
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2008, 2008, 5326 : 298 - 305
  • [47] Abstractive Spoken Document Summarization using Hierarchical Model with Multi-stage Attention Diversity Optimization
    Manakul, Potsawee
    Gales, Mark J. F.
    Wang, Linlin
    INTERSPEECH 2020, 2020, : 4248 - 4252
  • [48] Learning Spoken Document Similarity and Recommendation using Supervised Probabilistic Latent Semantic Analysis
    Thambiratnam, K.
    Seide, F.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2840 - 2843
  • [49] Collecting topic-related web pages for link structure analysis by using a potential hub and authority first approach
    Wang, LH
    Lee, TW
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2005, 3518 : 832 - 837
  • [50] Related Document Extraction based on Topic Modeling using Cloud System
    Hwang, Myeong-Ha
    Ha, Suwook
    In, Minkyo
    Lee, Kangchan
    INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2018, 11 (05): : 91 - 100