Automatic new word extraction method

Cited by: 0
Authors
Shi, Q
Shen, LQ
Chai, HX
Institution
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
New words are difficult to extract automatically in languages whose written text has no word boundaries, such as Chinese and Japanese. In this paper, we present a statistical method for extracting new words from a large corpus without word boundaries. Based on the Generalized Suffix Tree (GST) data structure, we define NWP (New Word Pattern) and SBP (Segmentation Boundary Pattern) to split input strings into small pieces, and we offer a practical and efficient algorithm for obtaining the proper words from the GST.
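The abstract does not define NWP or SBP, so the following is only a rough sketch of the general idea of frequency-based candidate extraction from unsegmented text, not the authors' algorithm. It uses a brute-force substring counter as a stand-in for a generalized suffix tree; the function names, the length and frequency thresholds, the `lexicon` argument, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter


def repeated_substrings(texts, min_len=2, max_len=4, min_count=2):
    """Count every substring of length min_len..max_len across all texts.

    A generalized suffix tree would index the same substrings far more
    compactly; this brute-force Counter is only an illustrative stand-in.
    """
    counts = Counter()
    for text in texts:
        for start in range(len(text)):
            for length in range(min_len, max_len + 1):
                if start + length <= len(text):
                    counts[text[start:start + length]] += 1
    # Keep only substrings that occur at least min_count times in total.
    return {s: c for s, c in counts.items() if c >= min_count}


def new_word_candidates(texts, lexicon, **kwargs):
    """Keep repeated substrings that are absent from a known lexicon (assumed input)."""
    repeated = repeated_substrings(texts, **kwargs)
    return {s: c for s, c in repeated.items() if s not in lexicon}


# Toy corpus with no word boundaries (hypothetical data).
corpus = ["abcxyabc", "zzabcq"]
candidates = new_word_candidates(corpus, lexicon={"ab"}, min_len=2, max_len=3)
print(candidates)
```

A real system would replace the nested loops with a suffix tree or suffix array to keep the work manageable on a large corpus, and would add boundary tests in the spirit of the paper's patterns to filter out fragments that merely overlap true words.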
Pages: 865-868
Page count: 4