Classification of Protein-Protein Interaction Full-Text Documents Using Text and Citation Network Features

Cited by: 21
Authors
Kolchinsky, Artemy [1, 2]
Abi-Haidar, Alaa [1, 2]
Kaur, Jasleen [1]
Hamed, Ahmed Abdeen [1]
Rocha, Luis M. [1, 2]
Affiliations
[1] Indiana Univ, Sch Informat & Comp, Bloomington, IN 47408 USA
[2] FLAD Computat Biol Collaboratorium, Inst Gulbenkian Ciencia, P-2780156 Oeiras, Portugal
Keywords
Text mining; literature mining; binary classification; protein-protein interaction; citation network; INFORMATION; GENES
DOI
10.1109/TCBB.2010.55
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
We participated (as Team 9) in the Article Classification Task of the BioCreative II.5 Challenge: binary classification of full-text documents relevant for protein-protein interaction. We used two distinct classifiers for the online and offline challenges: 1) the lightweight Variable Trigonometric Threshold (VTT) linear classifier, which we successfully introduced in BioCreative 2 for binary classification of abstracts, and 2) a novel Naive Bayes classifier using features from the citation network of the relevant literature. We supplemented the supplied training data with full-text documents from the MIPS database. The lightweight VTT classifier was very competitive in this new full-text scenario: it was a top-performing submission in this task, taking into account the rank product of four performance measures: the Area Under the interpolated precision and recall Curve, Accuracy, Balanced F-Score, and Matthews Correlation Coefficient. The novel citation network classifier for the biomedical text mining domain, while not a top-performing classifier in the challenge, performed above the central tendency of all submissions, and therefore indicates a promising new avenue to investigate further in bibliome informatics.
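The abstract ranks submissions by a rank product over several performance measures. As a rough illustration only, the Python sketch below computes three of those measures (Accuracy, Balanced F-Score, Matthews Correlation Coefficient) from confusion-matrix counts and then multiplies per-measure ranks. The function names `binary_metrics` and `rank_product`, the example scores, and the plain product of ranks are assumptions made for illustration; the record does not state the exact formula the challenge used, and the area under the interpolated precision/recall curve is left out because it needs ranked classifier output rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, balanced F-score (F1), and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC is conventionally set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


def rank_product(scores):
    """Multiply each submission's rank across measures (rank 1 = best).

    scores maps a submission name to {measure: value}; higher is better.
    Assumption: a plain product of ranks, with ties broken by input order.
    A lower product means a better overall standing.
    """
    measures = sorted(next(iter(scores.values())))
    products = {name: 1 for name in scores}
    for measure in measures:
        ordered = sorted(scores, key=lambda n: scores[n][measure], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            products[name] *= rank
    return products


# Hypothetical scores for three made-up submissions (not challenge results).
example = {
    "a": {"accuracy": 0.9, "f1": 0.8, "mcc": 0.7},
    "b": {"accuracy": 0.8, "f1": 0.9, "mcc": 0.6},
    "c": {"accuracy": 0.7, "f1": 0.7, "mcc": 0.5},
}
overall = rank_product(example)
```

Multiplying ranks rather than averaging raw scores keeps measures with different scales from dominating one another, which is one common reason to prefer a rank-based aggregate.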
Pages: 400-411
Number of pages: 12