The emergence of technologies that produce massive amounts of data, such as gene sequencing and distributed sensor systems, has created an urgent need for statistical methodologies and computational tools that can handle Big Data. Linear regression, arguably among the most commonly used inferential tools in statistics, was not designed with Big Data applications in mind. Modern applications, in which the number of predictors is often in the thousands and may even be in the millions, render the traditional approach to regression analysis useless. Thus, before any regression analysis can be done, it is necessary to begin with "variable selection". A good variable selection algorithm must select only the variables that are useful for predicting the outcome, and it must do so in a computationally efficient way. The empirical Bayes approach provides a sound statistical framework for developing variable selection methods: it allows one to "borrow strength" across predictors, increasing the power to detect significant ones while controlling the false discovery rate. Although the modeling approach is Bayesian, the crucial step in the empirical Bayes approach replaces the potentially cumbersome and slow integration via Monte Carlo simulation with a simple approximation to the posterior distribution of the regression coefficients.

This article is categorized under:
Statistical Models > Bayesian Models
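As a concrete illustration of that last step, the following is a minimal sketch of one standard empirical Bayes formulation; the spike-and-slab prior, the mixing weight $\pi$, the slab variance $\tau^2$, and the two-groups posterior formula are illustrative assumptions, not necessarily the exact model developed in the article. The sparse regression model can be written as

$$y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n), \qquad \beta_j \overset{iid}{\sim} (1-\pi)\,\delta_0 + \pi\, N(0, \tau^2), \quad j = 1, \dots, p.$$

The empirical Bayes step estimates the hyperparameters by maximizing the marginal likelihood of the data,

$$(\hat{\pi}, \hat{\tau}^2) = \arg\max_{\pi,\,\tau^2} \int p(y \mid \beta, \sigma^2)\, p(\beta \mid \pi, \tau^2)\, d\beta,$$

and then approximates the posterior inclusion probability of each predictor in closed form, rather than by Monte Carlo integration,

$$P(\beta_j \neq 0 \mid y) \approx \frac{\hat{\pi}\, f_1(\hat{\beta}_j)}{(1-\hat{\pi})\, f_0(\hat{\beta}_j) + \hat{\pi}\, f_1(\hat{\beta}_j)},$$

where $\hat{\beta}_j$ is a preliminary (e.g., least-squares or marginal) estimate of the $j$-th coefficient, and $f_0$ and $f_1$ are its marginal densities under the spike and slab components, respectively. A predictor is then selected when this probability exceeds a threshold chosen to control the false discovery rate.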