Advertising aims to influence consumer preferences, appraisals, action tendencies, and behavior in order to increase sales. These are all components of emotion. In the past, they have been measured through self-report or panel discussions. While informative, these approaches are difficult to scale to large numbers of consumers, fail to capture moment-to-moment changes in appraisals that may be predictive of sales, and depend on verbal mediation. We used webcam technology to sample non-verbal responses to television commercials from four product categories in six different countries. For each participant, head pose, head motion, and the more frequent facial expressions, such as smiling, surprise, and disgust, were automatically measured at each video frame and aggregated across subjects. Dynamic features from the aggregated series were input to a simple linear ensemble classifier with 10-fold cross-validation to predict product sales. Sales were predicted with ROC AUC = 0.75, 95% CI [0.727, 0.773], and predictions for unseen categories were consistent for all but one product group (ROC AUC varied between 0.74 and 0.83, except for Confections at 0.61). Predictions for unseen countries showed a similar pattern: ROC AUC varied between 0.71 and 0.89, with the exception of Russia at 0.53. Compared with previous attempts, our approach yielded higher overall performance and greater generalization across unmodeled factors such as country or category. These findings support the feasibility, efficiency, and predictive validity of sales predictions from large-scale sampling of viewers' moment-to-moment responses to commercial media. © 2017 Elsevier B.V. All rights reserved.