Quality Assessment of Arabic Web Content: The case of the Arabic Wikipedia

Cited by: 0
Authors
Yahya, Adnan [1]
Salhi, Ali [1]
Affiliation
[1] Birzeit Univ, Dept Comp Syst Engn, Birzeit, Palestine
Keywords
Document Quality Assessment; Arabic Wikipedia; Web Content Quality; Quality Assessment Parameters
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the huge size and great diversity of Arabic web content, machine assessment of document quality takes on added importance. Users are in dire need of quality ratings for the material returned in response to their queries. Wikipedia, with its large body of metadata, has been the subject of extensive research on document quality assessment. Criteria used include text properties and style parameters, contributor and edit characteristics, and multimedia components. In this paper we report on our ongoing work to adapt existing document assessment approaches to Arabic content, concentrating on the Arabic Wikipedia, and present some of the results. We also try to augment these approaches with features specific to Arabic as well as parameters such as author expertise and social media presence. One of our goals is an aggregate measure that integrates many of the features into a single document quality index. We plan to use Wikipedia article quality assessment results to train general content assessment methods that can be applied to content lacking the major Wikipedia features.
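The abstract does not say how the single document quality index is computed. As a purely illustrative sketch, one simple way to aggregate several features is a weighted average of capped, normalized values. The feature names, caps, and weights below are hypothetical placeholders chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ArticleFeatures:
    """Hypothetical per-article features (not the paper's actual feature set)."""
    word_count: int
    reference_count: int
    distinct_editors: int
    image_count: int


# Assumed saturation caps: raw values at or above the cap normalize to 1.0.
CAPS = {
    "word_count": 5000,
    "reference_count": 50,
    "distinct_editors": 100,
    "image_count": 10,
}

# Assumed weights; a real system would likely learn these from labeled articles.
WEIGHTS = {
    "word_count": 0.4,
    "reference_count": 0.3,
    "distinct_editors": 0.2,
    "image_count": 0.1,
}


def quality_index(features: ArticleFeatures) -> float:
    """Return a weighted average of capped, normalized features in [0, 1]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        raw = max(getattr(features, name), 0)
        normalized = min(raw / CAPS[name], 1.0)
        total += weight * normalized
    return total / sum(WEIGHTS.values())


if __name__ == "__main__":
    stub = ArticleFeatures(word_count=300, reference_count=1,
                           distinct_editors=2, image_count=0)
    rich = ArticleFeatures(word_count=6000, reference_count=40,
                           distinct_editors=80, image_count=6)
    print(f"stub article index: {quality_index(stub):.3f}")
    print(f"rich article index: {quality_index(rich):.3f}")
```

A linear combination is only one possible design; it is shown here because it keeps each feature's contribution easy to inspect.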
Pages: 36-41
Page count: 6
Related papers
50 in total
  • [1] Conceptual Search for Arabic Web Content
    Al-Zoghby, Aya M.
    Shaalan, Khaled
    [J]. COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II, 2015, 9042 : 405 - 416
  • [2] Automatic Quality Assessment of Content Created Collaboratively by Web Communities: A Case Study of Wikipedia
    Dalip, Daniel Hasan
    Goncalves, Marcos Andre
    Cristo, Marco
    Calado, Pavel
    [J]. JCDL 09: PROCEEDINGS OF THE 2009 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES, 2009, : 295 - 304
  • [3] Applying authorship analysis to Arabic web content
    Abbasi, A
    Chen, HC
    [J]. INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS, 2005, 3495 : 183 - 197
  • [4] Web design for dyslexics: Accessibility of Arabic content
    Al-Wabil, Areej
    Zaphiris, Panayiotis
    Wilson, Stephanie
    [J]. COMPUTERS HELPING PEOPLE WITH SPECIAL NEEDS, PROCEEDINGS, 2006, 4061 : 817 - 822
  • [5] Models for Arabic Document Quality Assessment
    Yahya, Adnan
    Ahmad, Afnan
    Assaf, Alaa
    Khater, Rawan
    Salhi, Ali
    [J]. BUSINESS INFORMATION SYSTEMS WORKSHOPS (BIS 2020), 2020, 394 : 297 - 310
  • [6] Arabic Medical Terms Compilation from Wikipedia
    Vivaldi, Jorge
    Rodriguez, Horacio
    [J]. 2014 THIRD IEEE INTERNATIONAL COLLOQUIUM IN INFORMATION SCIENCE AND TECHNOLOGY (CIST'14), 2014, : 248 - 253
  • [7] Content-based analysis to detect Arabic web spam
    Al-Kabi, Mohammed
    Wahsheh, Heider
    Alsmadi, Izzat
    Al-Shawakfa, Emad
    Wahbeh, Abdullah
    Al-Hmoud, Ahmed
    [J]. JOURNAL OF INFORMATION SCIENCE, 2012, 38 (03) : 284 - 296
  • [8] Document Similarity for Arabic and Cross-Lingual Web Content
    Salhi, Ali
    Yahya, Adnan H.
    [J]. ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, 2018, 782 : 134 - 146
  • [9] Combining Semantic Techniques to Enhance Arabic Web Content Retrieval
    AlAwajy, Anfal M.
    Berri, Jawad
    [J]. 2013 9TH INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION TECHNOLOGY (IIT), 2013,
  • [10] The Arabic ontology - an Arabic wordnet with ontologically clean content
    Jarrar, Mustafa
    [J]. APPLIED ONTOLOGY, 2021, 16 (01) : 1 - 26