Automatic Quality Assessment of Content Created Collaboratively by Web Communities: A Case Study of Wikipedia

Cited by: 0
Authors
Dalip, Daniel Hasan [1]
Goncalves, Marcos Andre [1]
Cristo, Marco
Calado, Pavel
Affiliations
[1] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
Keywords
Quality Assessment; Wikipedia; Machine Learning; SVM
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
The old dream of a universal repository containing all human knowledge and culture is becoming possible through the Internet and the Web. Moreover, this is happening with the direct, collaborative participation of people. Wikipedia is a great example. It is an enormous repository of information with free access and editing, created by the community in a collaborative manner. However, this large amount of information, made available democratically and virtually without any control, raises questions about its relative quality. In this work we explore a significant number of quality indicators, some of them proposed by us and used here for the first time, and study their capability to assess the quality of Wikipedia articles. Furthermore, we explore machine learning techniques to combine these quality indicators into a single assessment judgment. Through experiments, we show that the most important quality indicators are the easiest ones to extract, namely, textual features related to length, structure and style. We were also able to determine which indicators did not contribute significantly to the quality assessment. These were, coincidentally, the most complex features, such as those based on link analysis. Finally, we compare our combination method with a state-of-the-art solution and show significant improvements in terms of effective quality prediction.
Pages: 295 - 304
Number of pages: 10
Related papers
  • [1] Quality Assessment of Arabic Web Content: The case of the Arabic Wikipedia
    Yahya, Adnan
    Salhi, Ali
    [J]. 2014 10TH INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION TECHNOLOGY (IIT), 2014, : 36 - 41
  • [2] A general multiview framework for assessing the quality of collaboratively created content on web 2.0
    Dalip, Daniel H.
    Goncalves, Marcos Andre
    Cristo, Marco
    Calado, Pavel
    [J]. JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 2017, 68 (02) : 286 - 308
  • [3] Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation
    Goncalves Magalhaes, Luiz Felipe
    Goncalves, Marcos Andre
    Canuto, Sergio Daniel
    Dalip, Daniel H.
    Cristo, Marco
    Calado, Pavel
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2019, 132 : 226 - 238
  • [4] A deep learning-based quality assessment model of collaboratively edited documents: A case study of Wikipedia
    Wang, Ping
    Li, Xiaodan
    Wu, Renli
    [J]. JOURNAL OF INFORMATION SCIENCE, 2021, 47 (02) : 176 - 191
  • [5] Measuring Quality of Collaboratively Edited Documents: the case of Wikipedia
    Dang, Quang-Vinh
    Ignat, Claudia-Lavinia
    [J]. 2016 IEEE 2ND INTERNATIONAL CONFERENCE ON COLLABORATION AND INTERNET COMPUTING (IEEE CIC), 2016, : 266 - 275
  • [6] Information quality assessment of community-generated content - A user study of Wikipedia
    Yaari, Eti
    Baruchson-Arbib, Shifra
    Bar-Ilan, Judit
    [J]. JOURNAL OF INFORMATION SCIENCE, 2011, 37 (05) : 487 - 498
  • [7] Quality Assessment of Wikipedia Content Using Topic Models
    Santos, Lauro C. J.
    Christofani, Tais
    Silva, Ismael S.
    Dalip, Daniel H.
    [J]. WEBMEDIA 2019: PROCEEDINGS OF THE 25TH BRAZILLIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB, 2019, : 249 - 252
  • [8] Computational Trust in Web Content Quality: A Comparative Evalutation on the Wikipedia Project
    Dondio, Pierpaolo
    Barrett, Stephen
    [J]. INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS, 2007, 31 (02) : 151 - 160
  • [9] Automatic Quality Assessment of Wikipedia Articles-A Systematic Literature Review
    Moas, Pedro Miguel
    Lopes, Carla Teixeira
    [J]. ACM COMPUTING SURVEYS, 2024, 56 (04)
  • [10] Quality assessment and stigmatising content of Wikipedia articles relating to functional disorders
    McGhie-Fraser, Brodie
    Tattan, Mais
    Chaabouni, Asma
    Kustra-Mulder, Aleksandra
    Mamo, Nick
    McLoughlin, Caoimhe
    Muenker, Lina
    Niwa, Saya
    Pampel, Anna Maria
    Petzke, Tara
    Regnath, Franziska
    Rometsch, Caroline
    Smakowski, Abigail
    Saunders, Chloe
    Treufeldt, Hobe
    Weigel, Angelika
    Rosmalen, Judith
    [J]. JOURNAL OF PSYCHOSOMATIC RESEARCH, 2023, 165