Modelling Text Similarity: A Survey

Authors
Mu, Wenchuan [1]
Lim, Kwan Hui [1]
Affiliations
[1] Singapore Univ Technol & Design, Singapore, Singapore
Keywords
Modelling and simulation; Deep learning and embeddings; Algorithms and techniques; Semantic similarity; Kernels
DOI
10.1145/3625007.3627305
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Online social networking services such as Twitter and Instagram have become pervasive platforms for engaging in discussions on a wide array of topics. These platforms cater to both mainstream subjects, like music and movies, as well as more specialized areas, such as politics. With the growing volume of textual data generated on these platforms, the ability to define and identify similar texts becomes crucial for effective investigation and clustering. In this paper, we explore the challenges and significance of text similarity regression models in the context of online social networking services. We delve into the methods and techniques employed to define and find similarities among texts, enabling the extraction of meaningful patterns and insights. Specifically, we categorize text similarity regression models into four distinct types: set-theoretic, sequence-theoretic, real-vector, and end-to-end methods. This categorization is based on the mathematical formalisation of similarity used by each model. Ultimately, our survey aims to provide a comprehensive overview of the interlinkages between independently proposed methods for text similarity. By understanding the strengths and weaknesses of these methods, researchers can make informed decisions when designing novel approaches and algorithms. We hope this survey serves as a valuable resource for advancing the state-of-the-art in addressing the complex problem of text similarity.
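The first three categories named in the abstract can be illustrated with one textbook measure each. The sketch below is not taken from the survey: the choice of Jaccard similarity (set-theoretic), token-level Levenshtein distance (sequence-theoretic), and cosine similarity over term counts (real-vector) is an assumption made for illustration, as are the function names and the whitespace tokenizer. End-to-end methods rely on a trained model and are not sketched here.

```python
from collections import Counter
from math import sqrt


def tokens(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def jaccard(a, b):
    """Set-theoretic view: overlap of the two token sets."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0  # convention assumed here: two empty texts are identical
    return len(sa & sb) / len(sa | sb)


def edit_distance(a, b):
    """Sequence-theoretic view: Levenshtein distance over token sequences."""
    ta, tb = tokens(a), tokens(b)
    prev = list(range(len(tb) + 1))
    for i, x in enumerate(ta, start=1):
        curr = [i]
        for j, y in enumerate(tb, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cosine(a, b):
    """Real-vector view: cosine of the angle between term-count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (sqrt(sum(v * v for v in ca.values()))
            * sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    s, t = "new music video", "new movie video"
    print(jaccard(s, t), edit_distance(s, t), cosine(s, t))
```

Note that the first and third functions return a similarity (higher means more alike), while the second returns a distance (lower means more alike), so the scores are not directly comparable without a normalisation step.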
Pages: 698 - 705
Number of pages: 8
Related papers
50 in total
  • [1] Measurement of Text Similarity: A Survey
    Wang, Jiapeng
    Dong, Yihong
    INFORMATION, 2020, 11 (09) : 1 - 17
  • [2] A survey of Chinese text similarity computation
    Wang, Xiuhong
    Ju, Shiguang
    Wu, Shengli
    INFORMATION RETRIEVAL TECHNOLOGY, 2008, 4993 : 592 - +
  • [3] A survey on the techniques, applications, and performance of short text semantic similarity
    Han, Mengting
    Zhang, Xuan
    Yuan, Xin
    Jiang, Jiahao
    Yun, Wei
    Gao, Chen
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (05):
  • [4] Text mining: identification of similarity of text documents using hybrid similarity model
    K. M. Shiva Prasad
    Iran Journal of Computer Science, 2023, 6 (2) : 123 - 135
  • [5] Similarity between text and RDF
    Schiessl, Marcelo
    Berardi, Rita
    Brascher, Marisa
    LET'S PUT DATA TO USE: DIGITAL SCHOLARSHIP FOR THE NEXT GENERATION, 2014 : 128 - 130
  • [6] Similarity between text and RDF
    Schiessl, Marcelo
    Berardi, Rita
    Bräscher, Marisa
    Information Services and Use, 2014, 34 (3-4) : 325 - 330
  • [7] Text Similarity Calculations Using Text and Syntactical Structures
    Elhadi, Mohamed T.
    2012 7TH INTERNATIONAL CONFERENCE ON COMPUTING AND CONVERGENCE TECHNOLOGY (ICCCT2012), 2012 : 715 - 719
  • [8] Using similarity network analysis to improve text similarity calculations
    Witschard, Daniel
    Kucher, Kostiantyn
    Jusufi, Ilir
    Kerren, Andreas
    Applied Network Science, 2025, 10 (01)
  • [9] Systematic Characterizations of Text Similarity in Full Text Biomedical Publications
    Sun, Zhaohui
    Errami, Mounir
    Long, Tara
    Renard, Chris
    Choradia, Nishant
    Garner, Harold
    PLOS ONE, 2010, 5 (09): : 1 - 6
  • [10] Research on the Text Length's Effect of the Text Similarity Measurement
    Niu, Yan
    Chen, Yongchao
    INFORMATION AND AUTOMATION, 2011, 86 : 112 - 117