Machine Learning Methods for Word Prediction in Brazilian Portuguese

Cited by: 1
Authors
Palazuelos-Cagigas, Sira E. [1]
Martin-Sanchez, Jose L. [1]
Macias-Guarasa, Javier [1]
Garcia-Garcia, Juan C. [1]
Cavalieri, Daniel C. [2]
Bastos-Filho, Teodiano F. [2]
Sarcinelli-Filho, Mario [2]
Affiliations
[1] Univ Alcala de Henares, Polytech Sch, Dept Elect, Madrid, Spain
[2] Univ Fed Espirito Santo, Dept Elect Engn, Vitoria, ES, Brazil
Source
EVERYDAY TECHNOLOGY FOR INDEPENDENCE AND CARE | 2011 / Vol. 29
Keywords
Machine Learning Methods; Word Prediction in Portuguese; Statistical POS Models; Artificial Neural Networks; Support Vector Machines (SVM); Logistic Regression Models; Algorithm Fusion; Text Editor; Communicator
DOI
10.3233/978-1-60750-814-4-424
Chinese Library Classification number
R49 [Rehabilitation Medicine]
Discipline classification code
100215
Abstract
Objective: Computers have become essential in the lives of many people with disabilities. One common activity that computers can assist is text generation. People who cannot accurately control their extremities (due to cerebral palsy, for example) may use computers as writing tools, and those who have difficulty speaking may use a computer to communicate. In both cases, generating text is a necessary activity that can be physically demanding and extremely slow. Word prediction methods are commonly used to assist in this task. The objective of our work is to improve the quality of a word prediction system for Brazilian Portuguese, in order to reduce the effort and time needed to write texts.
Main content: The selection of the predicted words is partly based on a two-step process. First, the possible parts of speech (POS) of the next word are predicted from the POS of the previous words. Second, the list of predicted words is generated from these predicted POS and the information contained in the lexicons. In this paper we present prediction algorithms based on machine learning methods adapted to POS prediction (the first step of the process). Specifically, this work describes the use of artificial neural networks, support vector machines and regularized logistic models to predict word POS in Brazilian Portuguese, based on the POS of the 1, 2, 3 or 4 previous words. We also briefly describe a meta-learning strategy for algorithm selection and a fusion algorithm to combine them.
Results: These methods increase word prediction quality, saving a maximum of 38.26% of the keystrokes needed to write the text (a relative improvement of 9.85% with respect to the unigram method), and correctly predicting 79.95% of the words in the experiments performed (with a maximum hit rate of 28.5%).
Conclusions: Besides presenting evidence that such methods can be adapted to predict word POS, the work also shows that they are robust, consistent and easy to incorporate into a general word prediction system. In future work, the Portuguese prediction system will be included in PredWin, a freely available text editor and communicator already working for Spanish. The whole word prediction system will also be adapted, trained and evaluated in other languages, such as English, and included in PredWin if good results are obtained.
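The two-step process described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: it swaps the neural network, SVM and logistic regression classifiers for a plain count-based POS transition table, and the toy corpus, tag names and function names (`train`, `predict_pos`, `predict_words`) are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

# Toy tagged corpus (hypothetical data, for illustration only).
TAGGED = [
    [("o", "DET"), ("gato", "NOUN"), ("dorme", "VERB")],
    [("a", "DET"), ("casa", "NOUN"), ("caiu", "VERB")],
    [("o", "DET"), ("gato", "NOUN"), ("come", "VERB")],
]


def train(sentences, n_prev=1):
    """Count POS transitions and per-POS word frequencies."""
    transitions = defaultdict(Counter)  # previous POS tuple -> next POS counts
    lexicon = defaultdict(Counter)      # POS -> word counts
    for sent in sentences:
        # Pad the tag sequence so the first word also has a full context.
        tags = ["<s>"] * n_prev + [pos for _, pos in sent]
        for i, (word, pos) in enumerate(sent):
            context = tuple(tags[i:i + n_prev])
            transitions[context][pos] += 1
            lexicon[pos][word] += 1
    return transitions, lexicon


def predict_pos(transitions, context):
    """Step 1: estimate P(next POS | POS of the previous words)."""
    counts = transitions.get(tuple(context), Counter())
    total = sum(counts.values())
    return {pos: c / total for pos, c in counts.items()} if total else {}


def predict_words(transitions, lexicon, context, prefix="", k=3):
    """Step 2: rank lexicon words by P(POS | context) * P(word | POS)."""
    scores = Counter()
    for pos, p_pos in predict_pos(transitions, context).items():
        pos_total = sum(lexicon[pos].values())
        for word, count in lexicon[pos].items():
            # Keep only words compatible with what the user has typed so far.
            if word.startswith(prefix):
                scores[word] += p_pos * count / pos_total
    return [word for word, _ in scores.most_common(k)]


transitions, lexicon = train(TAGGED, n_prev=1)
print(predict_pos(transitions, ["DET"]))
print(predict_words(transitions, lexicon, ["DET"], k=2))
```

In a setup like the one the abstract describes, `predict_pos` would be replaced by a trained classifier over the POS of the 1 to 4 previous words, while the second step would stay a lookup against the lexicon.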
Pages: 424-431
Page count: 8