Error Link Detection and Correction in Wikipedia

Cited by: 6
Authors
Wang, Chengyu [1]
Zhang, Rong [1]
He, Xiaofeng [1]
Zhou, Aoying [1]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Software Engn, Shanghai, Peoples R China
Keywords
error link; Wikipedia; LinkRank; pairwise learning; LARGE-SCALE;
DOI
10.1145/2983323.2983705
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The hyperlink structure of Wikipedia forms a rich semantic network connecting entities and concepts, making it a valuable source for knowledge harvesting. As crowd-sourced data, however, Wikipedia faces various data quality issues that significantly impact knowledge systems relying on it as an information source. One such issue, the error link problem, occurs when an anchor text in a Wikipage links to the wrong Wikipage. While much previous work has focused on leveraging Wikipedia for entity linking, little has been done to detect error links. In this paper, we address the error link problem and propose algorithms to detect and correct error links. We introduce an efficient method that generates candidate error links based on iterative ranking in an Anchor Text Semantic Network, which greatly reduces the problem space. A more accurate pairwise learning model is then used to detect error links within the reduced candidate set while suggesting correct links at the same time. This approach is effective even when data sparsity is a challenge. Experiments on both English and Chinese Wikipedia illustrate the effectiveness of our approach. We also provide a preliminary analysis of the possible causes of error links in English and Chinese Wikipedia.
Pages: 307-316
Number of pages: 10