Unsupervised Concept Representation Learning for Length-Varying Text Similarity

Authors
Zhang, Xuchao [1]
Zong, Bo [1]
Cheng, Wei [1]
Ni, Jingchao [1]
Liu, Yanchi [1]
Chen, Haifeng [1]
Affiliation
[1] NEC Labs Amer, Princeton, NJ 08540 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Measuring document similarity plays an important role in natural language processing tasks. Most existing document similarity approaches suffer from an information gap caused by context and vocabulary mismatches when comparing texts of varying length. In this paper, we propose an unsupervised concept representation learning approach to address these issues. Specifically, we propose a novel Concept Generation Network (CGNet) to learn concept representations from the perspective of the entire text corpus. Moreover, we propose a concept-based document matching method that leverages advances in the recognition of local phrase features and corpus-level concept features. Extensive experiments on real-world datasets demonstrate that the new method achieves a considerable improvement in comparing length-varying texts. In particular, our model achieves an F1 score 6.5% higher than the best baseline model on a concept-project benchmark dataset.
Pages: 5611-5620
Number of pages: 10