Purpose: The COVID-19 pandemic has disrupted many health systems and caused congestion in some hospital settings. In-person consultation increases the risk of infection for staff and patients and consumes a significant amount of health system resources. Researchers around the world are mobilizing to develop effective detection methods to control the spread of the virus and manage the disease. Artificial intelligence (AI) algorithms offer promising alternatives not only for monitoring patient care but also for assisting in faster and more accurate diagnosis, prevention, and evaluation of COVID-19. Several investigations have found that the virus has a major impact on voice production because it impairs the respiratory system. During phonation, COVID-19 patients show more asynchronous, asymmetrical, and limited vocal fold oscillations.

Methods: In this paper, we investigate the effectiveness of the major machine learning algorithms in detecting COVID-19 through voice analysis. To do so, we consider five classes: three COVID-19-positive classes graded by disease severity (26 asymptomatic, 36 mild, and 20 moderate subjects), one class of 23 recovered patients, and one class of 38 healthy persons. The recordings are taken from the Coswara dataset, a crowdsourcing project of the Indian Institute of Science (IIS) that aims to build a diagnostic tool for COVID-19 using audio recordings. After data collection, we extracted the mel-frequency cepstral coefficients (MFCC) and the pitch from the frames of the cough recordings. These acoustic features are fed to a decision tree (DT), k-nearest neighbors (kNN) with k = 3, 5, and 7, a support vector machine (SVM), naive Bayes (NB), quadratic discriminant analysis (QDA), and a deep neural network (DNN), either directly or after feature selection by mRMR, RELIEF, or PCA.

Results: The 3NN with all the features produced the best classification results.
The 3NN classifier identifies the five classes with an accuracy of about 94.9% and an F1-score of 94.7%.

Conclusion: The proposed method distinguishes healthy from sick individuals and specifies the status of the disease: asymptomatic, mild, moderate, or recovered. Because the classes in the dataset are substantially imbalanced, the F1-score is one of the most reliable measures of the algorithm’s performance, and a score of about 94.7% indicates how accurate and precise this screening approach is.

© 2023, The Author(s), under exclusive licence to The Brazilian Society of Biomedical Engineering.
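As an illustration of the classification and scoring steps summarized above, the following is a minimal sketch of a 3-nearest-neighbor classifier evaluated with accuracy and a macro-averaged F1-score. It is not the authors' implementation. The class sizes are the ones reported in the abstract; everything else is an assumption made for the example: the feature vectors are synthetic Gaussian clusters standing in for MFCC and pitch features, the dimension of 14 is hypothetical, the distance is Euclidean, the F1-score is macro-averaged, and leave-one-out validation is used only because the abstract does not state the evaluation protocol.

```python
import math
import random
from collections import Counter

# Class sizes as reported in the abstract (143 subjects in total).
CLASS_SIZES = {"asymptomatic": 26, "mild": 36, "moderate": 20,
               "recovered": 23, "healthy": 38}


def make_synthetic_features(n_features=14, spread=1.0, seed=0):
    """Draw one Gaussian cluster per class (assumption: a stand-in for
    real MFCC + pitch vectors, which are not reproduced here)."""
    rng = random.Random(seed)
    X, y = [], []
    for idx, (label, size) in enumerate(CLASS_SIZES.items()):
        center = [3.0 * idx] * n_features
        for _ in range(size):
            X.append([c + rng.gauss(0.0, spread) for c in center])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean).
    Ties go to the label seen first, i.e. the closest neighbor's label."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_knn(X, y, k=3):
    """Predict each sample from all the others (assumed protocol)."""
    preds = []
    for i in range(len(X)):
        preds.append(knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k))
    return preds


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so a small class counts as much
    as a large one."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    X, y = make_synthetic_features()
    preds = leave_one_out_knn(X, y, k=3)
    print(f"accuracy: {accuracy(y, preds):.3f}")
    print(f"macro F1: {macro_f1(y, preds):.3f}")
```

The reason for reporting an F1-score alongside accuracy shows up in a toy case: with nine samples of one class and one of another, a classifier that always predicts the majority class reaches 90% accuracy, while its macro-averaged F1-score is below 50% because the minority class is never recovered.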