With the rapid development of deep learning technology, deep neural network (DNN)-based mobile applications have been adopted in various areas. However, since mobile devices are poorly suited to running DNN applications due to their limited computational resources, several computation offloading approaches have been introduced to address this issue; in particular, it has been reported that carefully partitioning a DNN model, so that input samples are partially processed on the mobile device and the edge server executes the remaining computation, can be effective in improving runtime performance. In addition, to improve communication efficiency in the offloading scenario, there have also been studies that reduce the data transmitted between a mobile device and the edge server by leveraging model compression. However, these existing approaches share a fundamental limitation: their performance ultimately depends on the architecture of the original DNN model. To overcome this, we propose a novel neural architecture search (NAS) method that takes computation offloading into account. On top of existing NAS approaches, we additionally introduce a resource selection mask and a channel selection mask. The resource selection mask partitions the operations of the target model between the mobile device and the edge server; the channel selection mask allows only selected channels to be transmitted to the edge server without degrading task performance (e.g., accuracy). Based on these two masks, we introduce a new loss function for the NAS procedure that accounts for end-to-end inference time as well as task performance, the original objective of NAS. In the evaluation, the proposed method is compared to existing approaches; the experimental results show that our method outperforms both previous NAS methods and pruning-based model partitioning approaches.
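As a rough illustration of how the two masks could enter the training objective, the following is a minimal PyTorch-style sketch (PyTorch itself is an assumption; the text does not specify a framework). All names here (`OffloadAwareLoss`, `device_cost`, `edge_cost`, `bandwidth`, the trade-off weight `lam`) are hypothetical, and the latency model, per-operation compute costs weighted by a soft resource mask plus transmission time proportional to the number of selected channels, is a simplified stand-in for the end-to-end inference time described above, not the paper's actual formulation.

```python
import torch
import torch.nn as nn


class OffloadAwareLoss(nn.Module):
    """Hypothetical composite objective: task loss plus a latency
    penalty estimated from soft resource/channel selection masks."""

    def __init__(self, lam: float = 0.1):
        super().__init__()
        self.task_loss = nn.CrossEntropyLoss()
        self.lam = lam  # assumed trade-off weight between accuracy and latency

    def forward(self, logits, targets, resource_mask, channel_mask,
                device_cost, edge_cost, bandwidth):
        # Soft resource mask in [0, 1]: ~1 keeps an op on the mobile
        # device, ~0 offloads it to the edge server.
        device_time = (resource_mask * device_cost).sum()
        edge_time = ((1.0 - resource_mask) * edge_cost).sum()
        # Transmission time grows with the number of channels selected
        # for transfer to the edge server.
        tx_time = channel_mask.sum() / bandwidth
        latency = device_time + edge_time + tx_time
        return self.task_loss(logits, targets) + self.lam * latency


# Toy usage: 4 candidate operations, 16 transmittable channels.
loss_fn = OffloadAwareLoss(lam=0.05)
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
resource_mask = torch.sigmoid(torch.randn(4, requires_grad=True))
channel_mask = torch.sigmoid(torch.randn(16, requires_grad=True))
loss = loss_fn(logits, targets, resource_mask, channel_mask,
               device_cost=torch.tensor([2.0, 3.0, 1.0, 4.0]),
               edge_cost=torch.tensor([0.5, 0.8, 0.3, 1.0]),
               bandwidth=32.0)
loss.backward()  # gradients flow into both selection masks
```

Because both masks are kept soft (sigmoid outputs) during search, the latency penalty stays differentiable and can be optimized jointly with the task loss; a binarization step would presumably follow search to obtain the final device/edge partition.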