Query-based summarization of discussion threads

Cited by: 6
Authors
Verberne, Suzan [1]
Krahmer, Emiel [2]
Wubben, Sander [2]
van den Bosch, Antal [3, 4]
Affiliations
[1] Leiden Univ, Leiden Inst Adv Comp Sci, Leiden, Netherlands
[2] Tilburg Univ, Tilburg Sch Humanities, Tilburg, Netherlands
[3] Radboud Univ Nijmegen, Ctr Language Studies, Nijmegen, Netherlands
[4] Meertens Inst, Amsterdam, Netherlands
Keywords
query-based summarization; discussion forums; reference summaries; word embeddings; evaluation; AGREEMENT; NETWORKS; DOCUMENT
DOI
10.1017/S1351324919000123
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we address query-based summarization of discussion threads. New users can profit from the information shared in the forum if they can find the previously posted information. However, discussion threads on a single topic can easily comprise dozens or hundreds of individual posts. Our aim is to summarize forum threads given real web search queries. We created a data set with search queries from a discussion forum's search engine log and the discussion threads that were clicked by the user who entered the query. For 120 thread-query combinations, a reference summary was made by five different human raters. We compared two methods for automatic summarization of the threads: a query-independent method based on post features, and Maximum Marginal Relevance (MMR), a method that takes the query into account. We also compared four different word embedding representations as alternatives to standard word vectors in extractive summarization. We find (1) that the agreement between human summarizers does not improve when a query is provided; (2) that the query-independent post features as well as a centroid-based baseline outperform MMR by a large margin; (3) that combining the post features with query similarity gives a small improvement over the use of post features alone; and (4) that for the word embeddings, a match in domain appears to be more important than corpus size and dimensionality. However, the differences between the models were not reflected by differences in the quality of the summaries created with the help of these models. We conclude that query-based summarization with web queries is challenging because the queries are short, and a click on a result is not a direct indicator of the relevance of the result.
Pages: 3-29
Number of pages: 27
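
Illustrative note on Maximum Marginal Relevance (MMR), which the abstract names as the query-aware method: MMR greedily picks the item that is most similar to the query while being least similar to items already picked. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes plain bag-of-words cosine similarity over whitespace tokens; the function names (`bow`, `cosine`, `mmr_select`) and the default trade-off weight `lam=0.7` are hypothetical choices made for this illustration.

```python
import math
from collections import Counter


def bow(text):
    """Term-frequency vector over lowercased whitespace tokens (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, posts, k=2, lam=0.7):
    """Greedily select up to k post indices by the MMR criterion.

    Each step scores a candidate as
        lam * sim(post, query) - (1 - lam) * max sim(post, selected post)
    and adds the highest-scoring candidate to the selection.
    """
    q = bow(query)
    vecs = [bow(p) for p in posts]
    selected = []
    candidates = list(range(len(posts)))

    def score(i):
        relevance = cosine(vecs[i], q)
        # Redundancy is 0 while nothing has been selected yet.
        redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Setting `lam=1.0` reduces the selection to a pure ranking by query similarity, while lower values penalize posts that repeat what has already been selected.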