Second-Order Text Matching Algorithm for Agricultural Text

Cited by: 0

Authors:

Sun, Xiaoyang [1]

Song, Yunsheng [1,2]

Huang, Jianing [1]

Affiliations:

[1] Shandong Agr Univ, Sch Informat Sci & Engn, Tai An 271018, Peoples R China

[2] Minist Agr & Rural Affairs, Key Lab Huang Huai Hai Smart Agr Technol, Tai An 271018, Peoples R China

Source:

APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 16

Keywords:

natural language processing; deep learning; text matching; agriculture text

DOI:

10.3390/app14167012

Chinese Library Classification:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Text matching advances the deep understanding of text information and, by exploring similar structures in text data, provides the basis for information retrieval, recommendation systems, and natural language processing. Owing to their outstanding performance and their ability to extract text features automatically, methods based on pre-trained models have gradually become the mainstream. However, such models usually suffer from slow retrieval speed and low running efficiency. Moreover, previous text matching algorithms have mainly focused on general (horizontal) domains, and relatively few vertical-domain algorithms exist for agricultural text, which needs further investigation. To address these issues, a second-order text matching algorithm is developed. This paper first collects a large amount of text about typical agricultural crops by using web crawlers, consulting relevant textbooks, and other means, and constructs a database from it. The BM25 algorithm is then used to generate a candidate set, and a BERT model is used to select the optimal match from that candidate set. Experiments show that the Precision@1 of this second-order algorithm reaches 88.34% on the dataset constructed in this paper, and the average time to match a piece of text is only 2.02 s. Compared with the BERT model and the BM25 algorithm, Precision@1 increases by 8.81% and 13.73%, respectively. In terms of the average time required to match a text, the algorithm is 55.2 s faster than the BERT model and only 2 s slower than the BM25 algorithm. It can improve the efficiency and accuracy of agricultural information retrieval, agricultural decision support, agricultural market analysis, etc., and promote the sustainable development of agriculture.
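The two-stage idea in the abstract (BM25 generates a candidate set, then a stronger model picks the best match from it) can be illustrated with a minimal sketch. This is not the authors' implementation. The whitespace tokenizer, the parameters k1 = 1.5 and b = 0.75, the candidate-set size k, the toy corpus, and all function names are assumptions made for illustration. In particular, overlap_scorer is a simple stand-in for the BERT matcher, used so that the example runs with the standard library alone; a real system would plug a BERT-based scorer into the same slot.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def tokenize(text: str) -> List[str]:
    # Whitespace tokenizer for illustration only (assumption); real
    # agricultural text would likely need a proper word segmenter.
    return text.lower().split()


class BM25:
    """Plain Okapi BM25 over a small in-memory corpus."""

    def __init__(self, corpus: Sequence[str], k1: float = 1.5, b: float = 0.75):
        self.docs = [tokenize(d) for d in corpus]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        # Document frequency: number of documents containing each term.
        df = Counter(term for d in self.docs for term in set(d))
        n = len(self.docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query: str, i: int) -> float:
        tf, dl = self.tf[i], len(self.docs[i])
        total = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            norm = tf[term] + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * tf[term] * (self.k1 + 1) / norm
        return total

    def top_k(self, query: str, k: int) -> List[int]:
        # Stage 1: cheap lexical ranking over the whole corpus.
        ranked = sorted(range(len(self.docs)),
                        key=lambda i: self.score(query, i), reverse=True)
        return ranked[:k]


def overlap_scorer(query: str, doc: str) -> float:
    # Hypothetical stand-in for the BERT matcher: Jaccard token overlap.
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if q | d else 0.0


def second_order_match(query: str, corpus: Sequence[str], bm25: BM25,
                       scorer: Callable[[str, str], float] = overlap_scorer,
                       k: int = 3) -> Tuple[int, str]:
    # Stage 2: the expensive scorer only sees the k BM25 candidates.
    candidates = bm25.top_k(query, k)
    best = max(candidates, key=lambda i: scorer(query, corpus[i]))
    return best, corpus[best]


def precision_at_1(predicted: Sequence[int], gold: Sequence[int]) -> float:
    # Fraction of queries whose top-ranked result is the correct one.
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


corpus = [
    "wheat rust is controlled with fungicide sprayed in early spring",
    "maize needs nitrogen fertilizer at the jointing stage",
    "rice seedlings are transplanted after the field is flooded",
    "apple trees are pruned in winter to improve fruit yield",
]
bm25 = BM25(corpus)
idx, text = second_order_match("how to control wheat rust", corpus, bm25, k=2)
print(idx, text)
```

The design point is that the costly scorer runs on only k candidates instead of the whole database, which is consistent with the abstract's report that the combined method is far faster than BERT alone while staying close to BM25 in speed.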

Pages: 20
