Error Link Detection and Correction in Wikipedia

Cited by: 6
Authors
Wang, Chengyu [1]
Zhang, Rong [1]
He, Xiaofeng [1]
Zhou, Aoying [1]
Affiliations
[1] East China Normal University, School of Computer Science and Software Engineering, Shanghai, People's Republic of China
Keywords
error link; Wikipedia; LinkRank; pairwise learning; LARGE-SCALE;
DOI
10.1145/2983323.2983705
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
The hyperlink structure of Wikipedia forms a rich semantic network connecting entities and concepts, making it a valuable source for knowledge harvesting. As crowd-sourced data, Wikipedia faces various data quality issues that significantly impact knowledge systems relying on it as an information source. One such issue occurs when an anchor text in a Wikipage links to the wrong Wikipage, causing the error link problem. While much previous work has focused on leveraging Wikipedia for entity linking, little has been done to detect error links. In this paper, we address the error link problem and propose algorithms to detect and correct error links. We introduce an efficient method to generate candidate error links based on iterative ranking in an Anchor Text Semantic Network, which greatly reduces the problem space. A more accurate pairwise learning model is then used to detect error links in the reduced candidate set while suggesting correct links at the same time. This approach is effective when data sparsity is a challenging issue. Experiments on both English and Chinese Wikipedia illustrate the effectiveness of our approach. We also provide a preliminary analysis of possible causes of error links in English and Chinese Wikipedia.
Pages: 307-316
Page count: 10