A TEXT FEATURE WORD EXTRACTION METHOD APPLIED TO ENTERPRISE COMPETITIVE INTELLIGENCE SYSTEM

Cited by: 0
Authors
Zhang, Zhiwei [1]
Zhang, Haining [1]
Zhu, Guangliang [1]
Affiliations
[1] Suzhou Univ, Sch Informat & Engn, Suzhou, Peoples R China
Source
UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE | 2023 / Vol. 85 / No. 04
Keywords
feature word extraction; information extraction; part-of-speech tagging; text classification; competitive intelligence; natural language processing;
DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Subject classification codes
0808 ; 0809 ;
Abstract
To acquire industry feature words, professionals currently collect and analyze feature word sets from individual industry sites according to their experience, then merge the sets from different sites into an industry feature word set. This manual method involves a large workload, makes the accuracy of text classification difficult to ensure, and requires the whole process to be repeated whenever feature words need adjusting. To solve these problems, this study first proposes a universal feature word extraction scheme and a system framework based on the actual requirements of the enterprise competitive intelligence system. It then elaborates the key problems involved in the process of feature word extraction. Finally, it improves the traditional feature word weight on the basis of previous results and introduces the Sogou lexicon to correct the word frequency and part of speech. Experiments with the KNN classifier, verified using the vector space model (VSM), achieved a good classification effect and a high average F1 value.
Pages: 221 - 234
Number of pages: 14
Related papers
50 in total
  • [21] A Novel Text Feature Weight Calculation Method Applied to Power Field
    Du, Haizhou
    Chen, Zhengbo
    Li, Qifen
    Yang, Yongwen
    2016 IEEE TRUSTCOM/BIGDATASE/ISPA, 2016, : 1700 - 1705
  • [22] A Method of Feature Selection Based on Word2Vec in Text Categorization
    Tian, Wenfeng
    Li, Jun
    Li, Hongguang
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 9452 - 9455
  • [23] A Document Feature Extraction Method Based on Concept-word List
    Zhu, Zheng-yu
    He, Jie
    Dong, Shu-jia
    Yu, Chun-lei
    MANUFACTURING SYSTEMS AND INDUSTRY APPLICATIONS, 2011, 267 : 386 - 392
  • [24] A Multiscale Feature Extraction Method for Text-independent Speaker Recognition
    Chen Zhigao
    Li Peng
    Xiao Runqiu
    Li Ta
    Wang Wenchao
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (11) : 3266 - 3271
  • [25] A Text Feature Based Automatic Keyword Extraction Method for Single Documents
    Campos, Ricardo
    Mangaravite, Vitor
    Pasquali, Arian
    Jorge, Alipio Mario
    Nunes, Celia
    Jatowt, Adam
    ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018), 2018, 10772 : 684 - 691
  • [26] A Feature Extraction Method Using Base Phrase and keyword In Chinese Text
    Li, Xin-fu
    Zhao, Lei-lei
    Wu, Li-hong
    2008 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2008, : 680 - +
  • [27] A novel method for spoken text feature extraction in semantic video retrieval
    Cao, Juan
    Li, Jintao
    Zhang, Yongdong
    Tang, Sheng
    Advances in Multimedia Information Processing - PCM 2006, Proceedings, 2006, 4261 : 270 - 278
  • [28] An Innovative Method of Feature Extraction for Text Classification Using PART Classifier
    Dhar, Ankita
    Dash, Niladri Sekhar
    Roy, Kaushik
    INFORMATION, COMMUNICATION AND COMPUTING TECHNOLOGY, ICICCT 2018, 2019, 835 : 131 - 138
  • [29] A feature extraction method for personal identification system
    Takimoto, H
    Mitsukura, Y
    Fukumi, M
    Akamatsu, N
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS, 2003, 2773 : 601 - 608
  • [30] Japanese word sense disambiguation system based on deep feature extraction
    Lei, Xue-Mei
    Wang, Da-Liang
    Takaaki, Tanaka
    Zeng, Guang-Ping
    Beijing Keji Daxue Xuebao/Journal of University of Science and Technology Beijing, 2010, 32 (02) : 263 - 269