Modeless Japanese Input Method Using Multiple Character Sequence Features

Cited by: 4
Authors
Ikegami, Yukino [1]
Sakurai, Yoshitaka [1]
Tsuruta, Setsuo [1]
Affiliations
[1] Tokyo Denki Univ, Inzai, Japan
Keywords
modeless Japanese input; multiple character sequence features; multilingual text; n-gram
DOI
10.1109/SITIS.2012.93
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Subject classification codes
0808; 0809
Abstract
With the rapid growth of globalization, users increasingly need to write multilingual texts. However, with conventional Japanese input methods, Japanese PC users must switch the input mode between Japanese and the Latin alphabet, which is cumbersome. Existing solutions that rely on a dictionary are hard to maintain because new words are coined frequently every year. This paper proposes a modeless Japanese input method that switches the input mode automatically without using a dictionary. Using a model called "multiple character sequence features", the method decides whether or not to convert alphabetic input into Kana. The multiple character sequence features consist of character surface features and character type features, both based on n-grams. These features are learned by a Support Vector Machine from corpora, in particular corpora containing a large number of words in current use on the Web. In the evaluation, the F-measure for both chat texts and news texts was over 90% (mostly over 99%).
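The abstract names two n-gram feature families, character surface features and character type features, without giving their exact definition. The sketch below is one plausible reading and not the authors' implementation: the type labels (`L`, `U`, `D`, `H`, `K`, `C`, `O`), the function names `char_type` and `extract_features`, and the n-gram range are all assumptions made for illustration. It only builds the feature counts; in the paper such features are learned by a Support Vector Machine, which is omitted here.

```python
from collections import Counter


def char_type(ch: str) -> str:
    """Map one character to a coarse type label (hypothetical label set)."""
    if ch.isascii() and ch.isalpha():
        return "U" if ch.isupper() else "L"   # Latin upper / lower case
    if ch.isascii() and ch.isdigit():
        return "D"                            # ASCII digit
    if "\u3040" <= ch <= "\u309f":
        return "H"                            # Hiragana block
    if "\u30a0" <= ch <= "\u30ff":
        return "K"                            # Katakana block
    if "\u4e00" <= ch <= "\u9fff":
        return "C"                            # CJK unified ideographs
    return "O"                                # anything else


def ngrams(seq, n):
    """Return all contiguous n-grams of seq, each joined into one string."""
    return ["".join(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def extract_features(token: str, max_n: int = 3) -> Counter:
    """Count surface n-grams and type n-grams for n = 1..max_n (assumed range)."""
    chars = list(token)
    types = [char_type(c) for c in chars]
    feats = Counter()
    for n in range(1, max_n + 1):
        for g in ngrams(chars, n):
            feats[f"surf{n}:{g}"] += 1        # character surface feature
        for g in ngrams(types, n):
            feats[f"type{n}:{g}"] += 1        # character type feature
    return feats
```

For example, `extract_features("iPhone", max_n=2)` maps the token to the type sequence `LULLLL`, so the type bigram `LL` is counted three times while each surface bigram such as `Ph` is counted once. A sparse count vector of this kind could then be passed to any linear classifier.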
Pages: 613-618
Page count: 6