Documents, Topics, and Authors: Text Mining of Online News

Cited by: 0
Authors
Sertkan, Mete [1]
Neidhardt, Julia [1]
Werthner, Hannes [1]
Affiliations
[1] TU Wien, Res Unit ECommerce, Vienna, Austria
Keywords
Recommender Systems; Online News; Text Mining; Topic Modelling; Co-occurrence Networks
DOI
10.1109/CBI.2019.00053
Chinese Library Classification (CLC) number
F [Economics]
Discipline classification code
02
Abstract
The goal of recommender systems is, in essence, to help people discover items they might like, i.e., items that fit their preferences, personality, and needs. Depending on the domain, those items can be books, movies, music, hotels, and much more. Typically, recommendations are based on past user interactions (e.g., movies a user saw or hotels a user booked). This work-in-progress paper focuses on news recommender systems. Because of the nature of news (e.g., a constant stream of new items and short item lifetimes), recommendations based on past interactions are especially hard to make. Hence, news recommender systems rely heavily on the actual content of news. While previous work mainly considers one aspect of the content of news articles, in this work we jointly analyse and discuss a given corpus of news articles on three different levels (i.e., document level, topic level, and author level). The overall aim is to provide the basis for a comprehensive news recommender system that reaches beyond accuracy and also considers diversity and serendipity. We demonstrate that relevant information can be extracted from a given corpus and that differences in author, time, and topic can be shown. Furthermore, the author-level analysis shows that documents can be clustered based on the writing style of authors. Finally, our findings show that author-level analysis has the potential to recommend the most diverse items compared to the other approaches.
Pages: 405-413
Number of pages: 9