Automatic Extraction of Web Page Text Information Based on Network Topology Coincidence Degree

Cited by: 2

Authors:

Shu, Zhinian [1]

Li, Xiaorong [1]

Affiliations:

[1] Chaohu Univ, Coll Informat Engn, Chaohu 238000, Peoples R China

Source:

WIRELESS COMMUNICATIONS & MOBILE COMPUTING | 2022 / Vol. 2022

Keywords:

INTERNET;

DOI:

10.1155/2022/9220661

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In order to effectively solve the above problems, an automatic extraction method of web text information based on network topology coincidence degree is proposed. Search engine, web crawler, and hypertext tag are used to classify web text information, and then, dimensionality reduction is carried out. After processing, the similarity of different features of web page text information is calculated, the similarity is sorted, and the similar text information is extracted according to the correlation based on segment estimation. The experimental results show that the designed method can simplify the complexity of the associated information of the data set and improve the amount of data collection and the success rate of information collection.

Citation

Pages: 10

50 entries in total

[1] A Method of Automatic Web Information Extraction Based on Page Clustering
Yang, Tianqi
Qiu, Taofen
2011 9TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA 2011), 2011, : 390 - 393
[2] A novel web page text information extraction method
Wang, Chongjun
Wei, Peng
PROCEEDINGS OF 2019 IEEE 3RD INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2019), 2019, : 2213 - 2218
[3] Automatic Summarization and Keyword Extraction from Web Page or Text File
You, Xiangdong
2019 IEEE 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION ENGINEERING TECHNOLOGY (CCET), 2019, : 154 - 158
[4] Automatic Web Information Extraction Based on Rules
Hu, Fanghuai
Ruan, Tong
Shao, Zhiqing
Ding, Jun
WEB INFORMATION SYSTEMS ENGINEERING - WISE 2011, 2011, 6997 : 265 - 272
[5] Text-Based Web Page Classification with Use of Visual Information
Bartik, Vladimir
2010 INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM 2010), 2010, : 416 - 420
[6] Web text information extraction based on wrapper model
Wang, Jingpu
Lin, Yaping
Zhou, Shunxian
2005 International Symposium on Computer Science and Technology, Proceedings, 2005, : 607 - 612
[7] A Method of Web Information Automatic Extraction Based on XML
Gu, Junhua
Song, Jie
Zhang, Na
Liu, Yanliu
INFORMATION TECHNOLOGY FOR MANUFACTURING SYSTEMS, PTS 1 AND 2, 2010, : 178 - 183
[8] Web Page Information Extraction Service Based on Graph Convolutional Neural Network and Multimodal Data Fusion
Zhang, Mingzhu
Yang, Zhongguo
Ali, Sikandar
Ding, Weilong
2021 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES, ICWS 2021, 2021, : 681 - 687
[9] Automatic extraction and verification of page transitions in a Web application
Kubo, Atsuto
Washizaki, Hironori
Fukazawa, Yoshiaki
14TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, PROCEEDINGS, 2007, : 350 - +
[10] Study of Web Page Information Topic Extraction Technology Based on Vision
Li, Qingshui
Wu, Kai
PROCEEDINGS OF 2010 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, VOL 9 (ICCSIT 2010), 2010, : 781 - 784
