Unsupervised domain adaptation transfers rich information from a labeled source domain to an unlabeled target domain, with the aim of improving classification performance on the target domain. Recently, optimal transport based methods have received wide attention in this literature. However, in many existing methods the feature learning phase is separate from the classification phase, so the two phases cannot benefit from each other. In this paper, we propose a new method to tackle this problem. Specifically, we first design a new classification rule that computes the optimal transport plan between the test samples and the prototypes of the training set, and assigns each test sample the label with the maximum class probability. The classification rule is derived from a Bayesian perspective. To further deal with the dataset bias problem, we then propose to learn a discriminative and shared embedding space, which is optimized alternately with the optimal transport plan. In particular, the target samples with their pseudo labels are combined with the source domain in a batch-wise manner to update the discriminative subspace progressively. The method is thus essentially a joint graph matching and graph embedding method under the optimal transport framework. Extensive experiments are conducted on several benchmark datasets, including Office-Home, ImageCLEF-DA, Adaptiope, Office-10, and Office-31, on which the method achieves accuracies of 72.2%, 91.2%, 68.5%, 93.0%, and 93.9%, respectively. These results validate the effectiveness of the proposed method.
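
To make the classification rule concrete, the following is a minimal sketch of prototype-based classification via an optimal transport plan. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the cost function, the marginals, or the solver, so this sketch assumes a squared Euclidean cost, uniform marginals over test samples and classes, class means as prototypes, and an entropy-regularized plan computed with Sinkhorn iterations. The function names `class_prototypes`, `sinkhorn_plan`, and `ot_classify`, and the parameters `reg` and `n_iter`, are hypothetical.

```python
import numpy as np


def class_prototypes(X, y, n_classes):
    """Assumed prototype definition: the mean feature vector of each class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])


def sinkhorn_plan(cost, a, b, reg=0.1, n_iter=200):
    """Entropy-regularized transport plan via Sinkhorn iterations.

    cost: (n, k) cost matrix; a: (n,) row marginal; b: (k,) column marginal.
    """
    K = np.exp(-cost / reg)  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)  # rescale to match the column marginal
        u = a / (K @ v)    # rescale to match the row marginal
    return u[:, None] * K * v[None, :]


def ot_classify(X_test, prototypes, reg=0.1):
    """Label each test sample by the class receiving most of its transported mass."""
    # Assumed cost: squared Euclidean distance, scaled to [0, 1] for stability.
    diff = X_test[:, None, :] - prototypes[None, :, :]
    cost = (diff ** 2).sum(axis=-1)
    cost = cost / max(cost.max(), 1e-12)

    n, k = cost.shape
    a = np.full(n, 1.0 / n)  # assumed uniform mass over test samples
    b = np.full(k, 1.0 / k)  # assumed uniform mass over classes
    plan = sinkhorn_plan(cost, a, b, reg=reg)

    # Row-normalize the plan so each row reads as class probabilities.
    probs = plan / plan.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs
```

Calling `ot_classify(X_target, class_prototypes(X_source, y_source, n_classes))` returns predicted labels together with the row-normalized plan. In an alternating scheme such as the one described above, those predictions could serve as the pseudo labels used when updating the embedding. Note that the uniform class marginal assumed here implicitly treats the target classes as balanced.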