Error Link Detection and Correction in Wikipedia

Cited by: 6
Authors
Wang, Chengyu [1]
Zhang, Rong [1]
He, Xiaofeng [1]
Zhou, Aoying [1]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Software Engn, Shanghai, Peoples R China
Keywords
error link; Wikipedia; LinkRank; pairwise learning; LARGE-SCALE;
DOI
10.1145/2983323.2983705
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
The hyperlink structure of Wikipedia forms a rich semantic network connecting entities and concepts, making it a valuable source for knowledge harvesting. Wikipedia, as crowd-sourced data, faces various data quality issues that significantly impact knowledge systems depending on it as an information source. One such issue occurs when an anchor text in a Wikipage links to the wrong Wikipage, causing the error link problem. While much previous work has focused on leveraging Wikipedia for entity linking, little has been done to detect error links. In this paper, we address the error link problem and propose algorithms to detect and correct error links. We introduce an efficient method to generate candidate error links based on iterative ranking in an Anchor Text Semantic Network, which greatly reduces the problem space. A more accurate pairwise learning model is then used to detect error links within the reduced candidate set, while suggesting correct links at the same time. This approach remains effective when data sparsity is a challenging issue. Experiments on both English and Chinese Wikipedia illustrate the effectiveness of our approach. We also provide a preliminary analysis of possible causes of error links in English and Chinese Wikipedia.
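The abstract mentions candidate generation by iterative ranking over a link network but gives no algorithmic detail. The sketch below is therefore not the paper's LinkRank method; it is a generic PageRank-style power iteration, shown only to illustrate what "iterative ranking" over a directed link graph can look like. The function name `iterative_rank`, the damping factor, the iteration count, and the toy page names are all assumptions made for illustration.

```python
from collections import defaultdict


def iterative_rank(edges, damping=0.85, iterations=50):
    """Generic PageRank-style power iteration (illustrative, not the paper's LinkRank).

    edges: iterable of (source_page, target_page) pairs.
    Returns a dict mapping each page to a score; scores sum to 1.
    """
    edges = list(edges)
    nodes = sorted({page for edge in edges for page in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over pages.
    score = {page: 1.0 / len(nodes) for page in nodes}
    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_score = {page: (1.0 - damping) / len(nodes) for page in nodes}
        for src in nodes:
            # A page without outgoing links spreads its score over all pages.
            targets = out_links[src] or nodes
            share = damping * score[src] / len(targets)
            for dst in targets:
                new_score[dst] += share
        score = new_score
    return score


# Hypothetical toy graph: two pages link to "Page_C", which links back to "Page_A".
edges = [("Page_A", "Page_C"), ("Page_B", "Page_C"), ("Page_C", "Page_A")]
scores = iterative_rank(edges)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a pipeline of the kind the abstract describes, scores like these could serve as a cheap first pass for shortlisting suspicious links before a more expensive learned model examines them; how the paper actually builds its Anchor Text Semantic Network and scores links is not stated here.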
Pages: 307-316
Page count: 10