Leveraging User-Generated Content for News Search

Cited by: 0
Author
McCreadie, Richard M. C. [1]
Affiliation
[1] Univ Glasgow, Dept Comp Sci, Glasgow G12 8QQ, Lanark, Scotland
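Keywords
News; Blogs; Social Media
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract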
Over the last few years, both the availability and accessibility of current news stories on the Web have dramatically improved [3]. In particular, users can now access news from a variety of sources hosted on the Web, from newswire presences such as the New York Times to integrated news search within Web search engines. Of central interest, however, is the emerging impact that user-generated content (UGC) is having on this online news landscape. Indeed, the emergence of Web 2.0 has turned a static news consumer base into a dynamic news machine, where news stories are summarised and commented upon. In short, value is being added to each news story in the form of additional content. Importantly, while there has been movement in commercial circles to exploit this extra value to enrich online news [5], there has been little research from the academic community on how this can be achieved. The main purpose of this thesis is therefore to research practical techniques for integrating UGC to improve the news search component of the most ubiquitous of Web tools, i.e., the Web search engine. We identify the following three key aspects of news search which might be improved through the application of UGC.
First, the initial task that the news vertical search component of a Web search engine needs to accomplish when confronted with a user query is to decide whether the query is in fact news-related, and hence requires news content to be included. However, queries are sparse in nature, often comprising only one or two tokens. This presents issues when performing query classification, as there are few features with which to distinguish news-related queries. We argue that UGC can help alleviate this ambiguity. Indeed, we hypothesise that there is a strong link between the volume of UGC being posted that mentions a query and the likelihood of that query being news-related within a specific timeframe.
Secondly, we consider the task of real-time event detection. It is imperative for search engines to maintain knowledge of the events of the moment, such that the results displayed are up to date. Traditionally, systems have detected new events through the clustering of newswire articles [1]. However, in the current fast-paced news search environment, where users begin querying for events within a couple of minutes of their occurrence [4], relying on slow newswire reporting is unacceptable. On the other hand, UGC sources such as Twitter provide a natural alternative, as the high post rate and popularity of news topics make such a site an ideal medium from which to monitor emerging events. Indeed, many paid journalists maintain personal blogs and other social media accounts for the reporting of fast-breaking news stories [2].
Lastly, we examine the presentation of results to the user. The presentation of news articles to satisfy news searches is generally accepted. However, with the ever-increasing pace of news reporting worldwide, there is now no guarantee that a trusted news source will have yet published on the story. In these cases, one must look elsewhere for content to satisfy the user. We hypothesise that UGC is ideal for presentation in these cases, as the delay between an event occurring and commentary appearing in UGC sources like Twitter or the Blogosphere is mere minutes. Moreover, some information needs cannot easily be satisfied using newswire articles alone. For example, the correct result for the query 'current news' would be a list of news stories ranked by their importance for the day in question. This is a difficult ranking problem, as 'importance' is greatly dependent upon the perspective of the user. In this case, one solution might be to leverage 'public opinion' as represented in UGC, for example by taking 'the pulse of the Blogosphere'. Indeed, we examined such an approach during TREC 2009.
In conclusion, we have identified multiple areas of the news search process which cannot be satisfied by traditional newswire articles alone. We hypothesise that user-generated content can be leveraged to improve news search, owing to the rich and timely information that UGC provides.
Pages: 919 - 919
Page count: 1