Combining instance and feature neighbours for extreme multi-label classification

Cited by: 2
Authors
Feremans, Len [1]
Cule, Boris [2]
Vens, Celine [3, 4]
Goethals, Bart [1, 5]
Affiliations
[1] Univ Antwerp, Dept Comp Sci, Antwerp, Belgium
[2] Univ Antwerp, Dept Accountancy & Finance, Antwerp, Belgium
[3] Katholieke Univ Leuven, Fac Med, Leuven, Belgium
[4] Katholieke Univ Leuven, Imec Res Grp, ITEC, Leuven, Belgium
[5] Monash Univ, Melbourne, Vic, Australia
Keywords
Extreme multi-label classification; Item-based collaborative filtering; k-nearest neighbours; Top-k queries; Information retrieval
DOI
10.1007/s41060-020-00209-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Extreme multi-label classification problems occur in different applications such as prediction of tags or advertisements. We propose a new algorithm that predicts labels using a linear ensemble of labels from instance- and feature-based nearest neighbours. In the feature-based nearest neighbours method, we precompute a matrix containing the similarities between each feature and label. For the instance-based nearest neighbourhood, we create an algorithm that uses an inverted index to compute cosine similarity on sparse datasets efficiently. We extend this baseline with a new top-k query algorithm that combines term-at-a-time and document-at-a-time traversal with tighter pruning based on a partition of the dataset. On ten real-world datasets, we find that our method outperforms state-of-the-art methods such as multi-label k-nearest neighbours, instance-based logistic regression, binary relevance with support vector machines and FastXML on different evaluation metrics. We also find that our algorithm is orders of magnitude faster than these baseline algorithms on sparse datasets and requires less than 20 ms per instance to predict labels for extreme datasets without the need for expensive hardware.
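The inverted-index idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it shows only plain term-at-a-time score accumulation for cosine similarity on sparse vectors, without the pruning or dataset partitioning the paper describes. All function names (`l2_normalise`, `build_inverted_index`, `top_k_cosine`) and the toy data are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def l2_normalise(vec):
    """Scale a sparse vector (dict feature -> weight) to unit L2 norm."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {f: w / norm for f, w in vec.items()} if norm > 0 else {}


def build_inverted_index(instances):
    """Map each feature to the (instance id, normalised weight) pairs that contain it."""
    index = defaultdict(list)
    for i, vec in enumerate(instances):
        for f, w in l2_normalise(vec).items():
            index[f].append((i, w))
    return index


def top_k_cosine(index, query, k):
    """Term-at-a-time traversal: only instances sharing a feature with the query are scored."""
    scores = defaultdict(float)
    for f, qw in l2_normalise(query).items():
        for i, w in index.get(f, ()):
            scores[i] += qw * w  # dot product of unit vectors = cosine similarity
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Toy sparse training set (assumed data, for illustration only).
train = [
    {"a": 1.0, "b": 1.0},
    {"b": 2.0},
    {"c": 3.0},
]
index = build_inverted_index(train)
neighbours = top_k_cosine(index, {"b": 1.0}, k=2)
```

Because the posting lists are visited only for features present in the query, the third training instance (which shares no feature with the query) is never touched; this is the property that makes such an index attractive for sparse data.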
Pages: 215-231
Page count: 17
Related papers
50 in total
  • [1] Combining instance and feature neighbours for extreme multi-label classification
    Len Feremans
    Boris Cule
    Celine Vens
    Bart Goethals
    [J]. International Journal of Data Science and Analytics, 2020, 10 : 215 - 231
  • [2] Combining Instance and Feature neighbors for Efficient Multi-label Classification
    Feremans, Len
    Cule, Boris
    Vens, Celine
    Goethals, Bart
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2017, : 109 - 118
  • [3] Decoupled Instance-label Extreme Multi-label Classification with Skew Coordinate Feature Space
    Song, Jihyeon
    Moon, Bongki
    [J]. 2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 1919 - 1924
  • [4] A Multi-label Classification Algorithm Combining Feature Screening and Label Correlation
    Chen, Xinying
    Liang, Xupeng
    Yi, Weiguo
    Song, Xudong
    Wang, Di
    Zhang, Yina
    [J]. IAENG International Journal of Computer Science, 2023, 50 (04)
  • [5] Towards Multi-label Feature Selection by Instance and Label Selections
    Mansouri, Dou El Kefel
    Benabdeslem, Khalid
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT II, 2021, 12713 : 233 - 244
  • [6] Learning Local Instance Constraint for Multi-label Classification
    Luo, Shang
    Wu, Xiaofeng
    Wang, Bin
    Zhang, Liming
    [J]. IMAGE AND GRAPHICS (ICIG 2017), PT I, 2017, 10666 : 284 - 294
  • [7] Independent Feature and Label Components for Multi-label Classification
    Zhong, Yongjian
    Xu, Chang
    Du, Bo
    Zhang, Lefei
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 827 - 836
  • [8] Exploiting Instance Relationship for Effective Extreme Multi-label Learning
    Li, Feifei
    Liu, Hongyan
    He, Jun
    Du, Xiaoyong
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2018), PT II, 2018, 10828 : 440 - 456
  • [9] Online Biomedical Publication Classification Using Multi-Instance Multi-Label Algorithms with Feature Reduction
    Ren, Dong
    Ma, Long
    Zhang, Yanqing
    Sunderraman, Raj
    Laird, Angela R.
    Turner, Jessica A.
    Fox, Peter T.
    Turner, Matthew D.
    [J]. PROCEEDINGS OF 2015 IEEE 14TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC), 2015, : 234 - 241
  • [10] Extreme Multi-label Classification for Information Retrieval
    Dembczynski, Krzysztof
    Babbar, Rohit
    [J]. ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018), 2018, 10772 : 839 - 840