Leveraging User-Generated Content for News Search

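Cited by: 0
Authors
McCreadie, Richard M. C. [1]
Affiliation
[1] Univ Glasgow, Dept Comp Sci, Glasgow G12 8QQ, Lanark, Scotland
Keywords
News; Blogs; Social Media
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract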
Over the last few years, both the availability and accessibility of current news stories on the Web have dramatically improved [3]. In particular, users can now access news from a variety of sources hosted on the Web, from newswire presences such as the New York Times to integrated news search within Web search engines. However, of central interest is the emerging impact that user-generated content (UGC) is having on this online news landscape. Indeed, the emergence of Web 2.0 has turned a static news consumer base into a dynamic news machine, where news stories are summarised and commented upon. In summary, value is being added to each news story in terms of additional content. Importantly, however, while there has been movement in commercial circles to exploit this extra value to enrich online news [5], there has been little research from the academic community on how this can be achieved. Indeed, the main purpose of this thesis is to research practical techniques for the integration of UGC to improve the news search component of the most ubiquitous of Web tools, i.e. the Web search engine.
We identify the following three key aspects of news search which might be improved through the application of UGC. Intuitively, the first task that the news vertical search aspect of a Web search engine needs to accomplish when confronted with a user query is to decide whether the query is in fact news-related, and hence requires news content to be included. However, queries themselves are sparse in nature, often comprising only one or two tokens. This presents issues when performing query classification, as there are few features to distinguish the news-related queries. We contend that UGC can help alleviate this ambiguity. Indeed, we hypothesise that there is a strong link between the volume of UGC being posted that mentions a query and the likelihood of that query being news-related within a specific timeframe.
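As a minimal sketch of the hypothesis above, and not the method developed in the thesis, one could compare how often a query is mentioned in recent UGC against a longer baseline period. The post format (timestamp, text), the function names, the one-hour window and the add-one smoothing are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def mention_count(posts, query, start, end):
    """Count posts in [start, end) whose text contains every query token.

    Matching is a plain lowercase substring test (an assumption for brevity).
    """
    tokens = query.lower().split()
    return sum(
        1
        for ts, text in posts
        if start <= ts < end and all(tok in text.lower() for tok in tokens)
    )

def news_burst_score(posts, query, now, window=timedelta(hours=1), baseline_windows=24):
    """Hypothetical feature: recent mention volume relative to the baseline.

    Values well above 1 suggest the query is being discussed unusually
    often right now, which may indicate that it is news-related.
    """
    recent = mention_count(posts, query, now - window, now)
    baseline_start = now - window * (baseline_windows + 1)
    baseline = mention_count(posts, query, baseline_start, now - window)
    expected = baseline / baseline_windows  # average mentions per window
    return (recent + 1) / (expected + 1)    # add-one smoothing avoids division by zero
```

A score like this would only be one feature among others in a query classifier; the threshold separating news-related from other queries would have to be tuned on labelled data.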
Secondly, we consider the task of real-time event detection. It is imperative for search engines to maintain knowledge of the events of the moment, such that the results displayed are kept up to date. Traditionally, systems have detected new events through the clustering of newswire articles [1]. However, in the current fast-paced news search environment, where users begin querying for events within a couple of minutes of their occurrence [4], relying on slow newswire reporting is unacceptable. On the other hand, UGC sources such as Twitter provide a natural alternative, as the high post rate and the popularity of news topics make such a site an ideal medium from which to monitor emerging events. Indeed, many paid journalists maintain personal blogs and other social media accounts for the reporting of fast-breaking news stories [2].
Lastly, we examine the presentation of results to the user. Presenting news articles to satisfy news searches is generally accepted practice. However, with the ever-increasing pace of news reporting worldwide, there is now no guarantee that a trusted news source will have published on the story yet. In these cases, one must look elsewhere for content to satisfy the user. We hypothesise that UGC is ideal for presentation in these cases, as the delay between an event occurring and commentary appearing in UGC sources like Twitter or the Blogosphere is mere minutes. Moreover, some information needs cannot be easily solved using newswire articles alone. For example, the correct result for the query 'current news' would be a list of news stories ranked by their importance for the day in question. This is a difficult ranking problem, as 'importance' is greatly dependent upon the perspective of the user. In this case, one solution might be to leverage 'public opinion' as represented in UGC, for example by taking 'the pulse of the Blogosphere'. Indeed, we examined such an approach during TREC 2009.
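To illustrate the idea of ranking stories by 'the pulse of the Blogosphere', the following is a minimal sketch under stated assumptions; it is not the approach evaluated at TREC 2009. It scores each headline by the number of posts that share at least two of its longer terms. The names `terms` and `rank_stories_by_ugc`, the term-length filter and the overlap threshold are all hypothetical.

```python
def terms(text):
    """Lowercased words longer than three characters (a crude stop-word filter)."""
    return {w for w in text.lower().split() if len(w) > 3}

def rank_stories_by_ugc(headlines, posts, min_overlap=2):
    """Rank story ids by how many posts appear to discuss each story.

    headlines: dict mapping story id -> headline text
    posts:     list of UGC post texts from the day in question
    A post counts towards a story if it shares at least `min_overlap`
    terms with the headline (an assumed, simplistic matching rule).
    """
    post_terms = [terms(p) for p in posts]
    scores = {
        story_id: sum(1 for pt in post_terms if len(terms(h) & pt) >= min_overlap)
        for story_id, h in headlines.items()
    }
    # Most-discussed stories first; ties are broken by story id for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))
```

Counting mentions treats every post as an equal vote on importance; a more realistic system would likely need to weight sources and account for spam, which this sketch ignores.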
In conclusion, we have identified multiple areas of the news search process which cannot be satisfied by traditional newswire articles alone. We hypothesise that user-generated content can be leveraged to improve news search, owing to the rich and timely information that UGC provides.
Pages: 919 / 919
Page count: 1