Machine Learning Methods for Word Prediction in Brazilian Portuguese

Cited by: 1

Authors:

Palazuelos-Cagigas, Sira E. [1]

Martin-Sanchez, Jose L. [1]

Macias-Guarasa, Javier [1]

Garcia-Garcia, Juan C. [1]

Cavalieri, Daniel C. [2]

Bastos-Filho, Teodiano F. [2]

Sarcinelli-Filho, Mario [2]

Affiliations:

[1] Univ Alcala de Henares, Polytech Sch, Dept Elect, Madrid, Spain

[2] Univ Fed Espirito Santo, Dept Elect Engn, Vitoria, ES, Brazil

Source:

EVERYDAY TECHNOLOGY FOR INDEPENDENCE AND CARE | 2011 / Vol. 29

Keywords:

Machine Learning Methods; Word Prediction in Portuguese; Statistical POS Models; Artificial Neural Networks; Support Vector Machines (SVM); Logistic Regression Models; Algorithm Fusion; Text Editor; Communicator

DOI:

10.3233/978-1-60750-814-4-424

Chinese Library Classification number:

R49 [Rehabilitation Medicine]

Discipline classification code:

100215

Abstract:

Objective: Computers have become essential in the lives of many people with disabilities. A common activity that can be computer assisted is text generation. People who cannot accurately control their extremities (due to cerebral palsy, etc.) may use computers as writing tools, and those who have difficulty speaking may use a computer to communicate. In both cases, generating text is a necessary activity that can be physically demanding and extremely slow. Word prediction methods are commonly used to assist in this task. The objective of our work is to improve the quality of a word prediction system for Brazilian Portuguese, in order to reduce the effort and time needed to write texts.
Main content: The selection of the predicted words is partly based on a two-step process. First, the possible parts of speech (POS) of the next word are predicted from the POS of the previous words. Second, the list of predicted words is generated from these predicted POS and the information contained in the lexicons. In this paper we present prediction algorithms based on machine learning methods adapted to POS prediction (the first step of the process). Specifically, this work describes the use of artificial neural networks, support vector machines and regularized logistic models to predict word POS in Brazilian Portuguese, based on the POS of the 1, 2, 3 or 4 previous words. We also briefly describe a meta-learning strategy for algorithm selection and a fusion algorithm to combine them.
Results: These methods increase word prediction quality, saving up to 38.26% of the keystrokes needed to write the text (a relative improvement of 9.85% with respect to the unigram method), and correctly predicting 79.95% of the words in the experiments performed (with a maximum hit rate of 28.5%).
Conclusions: Besides presenting evidence that such methods can be adapted to predict word POS, it is also shown that they are robust, consistent and easy to incorporate into a general word prediction system. In future work, the Portuguese prediction system will be included in PredWin, a freely available text editor and communicator already working for Spanish. The whole word prediction system will also be adapted, trained and evaluated in other languages, such as English, and included in PredWin if good results are obtained.
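
The two-step process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the neural network, SVM and logistic regression models for a plain count-based model over a single previous POS tag, and the toy corpus, the tag names and the function names `train` and `predict_words` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (word, POS) pairs; not data from the paper.
TAGGED = [
    [("o", "DET"), ("menino", "NOUN"), ("come", "VERB"), ("o", "DET"), ("bolo", "NOUN")],
    [("a", "DET"), ("menina", "NOUN"), ("bebe", "VERB"), ("a", "DET"), ("agua", "NOUN")],
    [("o", "DET"), ("gato", "NOUN"), ("bebe", "VERB"), ("o", "DET"), ("leite", "NOUN")],
]


def train(sentences):
    """Count POS transitions and build a POS-indexed lexicon."""
    pos_after = defaultdict(Counter)  # previous POS -> counts of the next POS
    lexicon = defaultdict(Counter)    # POS -> counts of words seen with that POS
    for sentence in sentences:
        prev = "<s>"                  # sentence-start marker (assumed convention)
        for word, pos in sentence:
            pos_after[prev][pos] += 1
            lexicon[pos][word] += 1
            prev = pos
    return pos_after, lexicon


def predict_words(pos_after, lexicon, prev_pos, prefix="", n=3):
    """Return up to n candidate words given the POS of the previous word."""
    # Step 1: estimate P(next POS | previous POS) from transition counts.
    transitions = pos_after.get(prev_pos, Counter())
    total = sum(transitions.values())
    if total == 0:
        return []
    scores = Counter()
    for pos, count in transitions.items():
        p_pos = count / total
        words = lexicon.get(pos, Counter())
        pos_total = sum(words.values())
        # Step 2: score lexicon words of that POS, filtered by the typed prefix.
        for word, word_count in words.items():
            if word.startswith(prefix):
                scores[word] += p_pos * word_count / pos_total
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:n]]


pos_after, lexicon = train(TAGGED)
print(predict_words(pos_after, lexicon, "NOUN"))             # ['bebe', 'come']
print(predict_words(pos_after, lexicon, "DET", prefix="m"))  # ['menina', 'menino']
```

In this sketch only step 1 would change if a learned classifier replaced the transition counts, which is consistent with the abstract's statement that the machine learning methods target the POS prediction step.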


Pages: 424-431

Page count: 8
