Automatic Extraction of Web Page Text Information Based on Network Topology Coincidence Degree

Cited by: 2

Authors:

Shu, Zhinian [1]

Li, Xiaorong [1]

Affiliation:

[1] Chaohu Univ, Coll Informat Engn, Chaohu 238000, Peoples R China

Source:

WIRELESS COMMUNICATIONS & MOBILE COMPUTING | 2022 / Vol. 2022

Keywords:

INTERNET;

DOI:

10.1155/2022/9220661

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In order to effectively solve the above problems, an automatic extraction method of web text information based on network topology coincidence degree is proposed. Search engine, web crawler, and hypertext tag are used to classify web text information, and then, dimensionality reduction is carried out. After processing, the similarity of different features of web page text information is calculated, the similarity is sorted, and the similar text information is extracted according to the correlation based on segment estimation. The experimental results show that the designed method can simplify the complexity of the associated information of the data set and improve the amount of data collection and the success rate of information collection.
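The similarity-then-sort step described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say which similarity measure or feature representation the paper uses, so the sketch assumes cosine similarity over bag-of-words counts as a stand-in. The names `term_vector`, `cosine_similarity`, and `rank_segments`, and the `threshold` parameter, are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words counts for one text segment (assumed feature form)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_segments(reference, segments, threshold=0.2):
    """Score each segment against a reference text, sort by similarity
    (highest first), and keep those at or above the assumed threshold."""
    ref_vec = term_vector(reference)
    scored = [(cosine_similarity(ref_vec, term_vector(s)), s) for s in segments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, s) for score, s in scored if score >= threshold]


if __name__ == "__main__":
    segments = [
        "web page text extraction method",
        "buy cheap shoes online",
        "text extraction from web",
    ]
    for score, segment in rank_segments("web text extraction", segments):
        print(f"{score:.3f}  {segment}")
```

The classification and dimensionality-reduction stages mentioned in the abstract are omitted here, since the record gives no detail on how they are performed.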


Pages: 10
