Text GCN-SW-KNN: a novel collaborative training multi-label classification method for WMS application themes by considering geographic semantics

Cited by: 7
Authors
Wei, Zhengyang [1]
Gui, Zhipeng [1, 2, 3]
Zhang, Min [2, 3]
Yang, Zelong [2, 3]
Mei, Yuao [2]
Wu, Huayi [2, 3]
Liu, Hongbo [4]
Yu, Jing [4]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Peoples R China
[2] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan, Peoples R China
[3] Wuhan Univ, Collaborat Innovat Ctr Geospatial Technol, Wuhan, Peoples R China
[4] Chongqing Geomat & Remote Sensing Ctr, Informat Technol Dept, Chongqing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Web map service; multi-label text classification; semantic distance; text graph convolutional network; collaborative training; ML-KNN; application theme extraction; SEARCH; WEB;
DOI
10.1080/20964471.2021.1877434
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Without explicit descriptions of map application themes, it is difficult for users to discover desired map resources among massive online Web Map Services (WMS). However, metadata-based map application theme extraction is a challenging multi-label text classification task because of limited training samples, mixed vocabularies, and the variable length and arbitrary content of text fields. In this paper, we propose a novel multi-label text classification method, Text GCN-SW-KNN, based on geographic semantics and collaborative training to improve classification accuracy. The semi-supervised collaborative training adopts two base models: a Text Graph Convolutional Network (Text GCN) modified by utilizing the Semantic Web, named Text GCN-SW, and the widely used Multi-Label K-Nearest Neighbor (ML-KNN). Text GCN-SW improves on Text GCN by adjusting the adjacency matrix of the heterogeneous word-document graph with the shortest semantic distances between themes and words in the metadata text. The distances are calculated with the Semantic Web for Earth and Environmental Terminology (SWEET) and WordNet dictionaries. Experiments on both WMS and layer metadata show that the proposed methods achieve higher F1-score and accuracy than state-of-the-art baselines, and demonstrate better stability in repeated experiments and robustness to reduced training data. Text GCN-SW-KNN can be extended to other multi-label text classification scenarios to better support metadata enhancement and geospatial resource discovery in the Earth Science domain.
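The abstract does not give the formula used to adjust the adjacency matrix, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a toy undirected term graph standing in for SWEET/WordNet, measures semantic distance as breadth-first hop count, converts the shortest distance to a closeness score 1 / (1 + d), and scales word-document edges by (1 + alpha * closeness). All names (`hop_distances`, `theme_closeness`, `reweight_word_doc_edges`, `alpha`) and all data are hypothetical.

```python
from collections import deque

def hop_distances(term_graph, source):
    """Breadth-first search: hop count from `source` to each reachable term."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        term = queue.popleft()
        for neighbour in term_graph.get(term, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[term] + 1
                queue.append(neighbour)
    return dist

def theme_closeness(term_graph, themes, words):
    """Closeness 1 / (1 + d) to the nearest theme; 0.0 if no theme is reachable."""
    best = {}
    for theme in themes:
        dist = hop_distances(term_graph, theme)
        for word in words:
            if word in dist:
                best[word] = min(best.get(word, dist[word]), dist[word])
    return {w: (1.0 / (1.0 + best[w]) if w in best else 0.0) for w in words}

def reweight_word_doc_edges(adj, nodes, words, closeness, alpha=1.0):
    """Return a copy of `adj` with word-document edges scaled by semantic closeness.

    Assumed rule (not from the paper): weight *= 1 + alpha * closeness[word].
    """
    index = {name: i for i, name in enumerate(nodes)}
    docs = [n for n in nodes if n not in words]
    out = [row[:] for row in adj]
    for w in words:
        factor = 1.0 + alpha * closeness[w]
        for d in docs:
            i, j = index[w], index[d]
            out[i][j] *= factor
            out[j][i] *= factor  # keep the matrix symmetric
    return out

# Toy ontology: agriculture - crop - wheat; "road" is unconnected.
TERM_GRAPH = {
    "agriculture": ["crop"],
    "crop": ["agriculture", "wheat"],
    "wheat": ["crop"],
    "road": [],
}
NODES = ["wheat", "road", "doc0"]   # two word nodes and one document node
WORDS = ["wheat", "road"]
ADJ = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]

closeness = theme_closeness(TERM_GRAPH, ["agriculture"], WORDS)
adjusted = reweight_word_doc_edges(ADJ, NODES, WORDS, closeness)
```

In this toy setup, the edge between "wheat" and the document is strengthened because "wheat" lies two hops from the theme "agriculture", while the edge for "road" is left unchanged. A real implementation would need the paper's actual distance definition and weighting scheme.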
Pages: 66-89
Number of pages: 24