Automatic new word extraction method

Cited by: 0
Authors
Shi, Q
Shen, LQ
Chai, HX
Affiliations
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
New words are difficult to extract automatically in languages whose written text has no word boundaries, such as Chinese and Japanese. In this paper, we present a statistical method for extracting new words from a large corpus with no word boundaries. Based on the Generalized Suffix Tree (GST) data structure, we define the NWP (New Word Pattern) and the SBP (Segmentation Boundary Pattern) to separate input strings into small pieces, and offer a practical and efficient algorithm for obtaining proper words from the GST.
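The abstract does not define NWP or SBP, so the following is only a rough sketch of the general idea of collecting repeated substrings from unsegmented text. It assumes a depth-limited, uncompressed suffix trie built over several input strings as a stand-in for a true GST, and a plain frequency threshold in place of the paper's patterns. The names `Node`, `build_suffix_trie`, `candidates`, `max_len`, `min_count`, and `min_len` are hypothetical and do not come from the paper.

```python
class Node:
    """One trie node; count = occurrences of the substring spelled by the path."""
    def __init__(self):
        self.children = {}
        self.count = 0


def build_suffix_trie(texts, max_len=4):
    """Insert every suffix of every text, truncated to max_len characters.

    Assumption: a depth-limited trie stands in for a compressed
    generalized suffix tree; it is simpler but uses more memory.
    """
    root = Node()
    for text in texts:
        for start in range(len(text)):
            node = root
            for ch in text[start:start + max_len]:
                node = node.children.setdefault(ch, Node())
                node.count += 1
    return root


def candidates(root, min_count=2, min_len=2):
    """Return {substring: count} for substrings seen at least min_count times.

    Assumption: a raw frequency threshold replaces the paper's NWP/SBP
    tests, which the abstract does not specify.
    """
    found = {}
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for ch, child in node.children.items():
            s = prefix + ch
            # An extension can never be more frequent than its prefix,
            # so rare branches are pruned here.
            if child.count >= min_count:
                if len(s) >= min_len:
                    found[s] = child.count
                stack.append((child, s))
    return found


if __name__ == "__main__":
    trie = build_suffix_trie(["abcab", "xabc"], max_len=3)
    print(candidates(trie))
```

A real system would likely add boundary evidence, for example how many distinct characters follow a candidate, before accepting it as a word; that step is left out here because the source gives no details.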
Pages: 865-868
Number of pages: 4