Machine learning (ML)-based intrusion detection systems (IDSs) are crucial in safeguarding computer networks against malicious activities. However, building an optimal (accurate and high-performance) ML-based IDS, which combines a data set, a feature selector, and a classifier, is challenging. This article presents a comprehensive empirical analysis aimed at enhancing the effectiveness of IDSs by examining these three critical components: 1) data sets; 2) feature selection based on regression models; and 3) classification based on linear support vector machines (LSVMs). We begin by evaluating six data sets commonly used in IDS research, identifying their strengths, limitations, and suitability for real-world scenarios. Next, we explore regression-based feature selectors to identify the features most relevant to intrusion detection, with the goal of improving both the accuracy and the efficiency of the IDSs. Then, we examine various LSVM classifiers, comparing their performance and highlighting their strengths and weaknesses. By combining these components, this study aims to provide a holistic understanding of the relationship between data sets, regression-based feature selectors, and LSVM classifiers, thus aiding researchers and practitioners in designing more effective and robust IDSs. The empirical analysis employs rigorous evaluation metrics and a comprehensive experimental setup to support reliable and unbiased results. The insights gained from our investigation can help guide future research and development efforts toward more efficient and reliable ML-based IDSs.