With the rapid growth of social media and online news sources, the spread of fake news (FN) has become a significant concern. Classifying FN requires sophisticated approaches that can discern nuances, patterns of misinformation, and the evolving strategies of disinformation spreaders. Contextualizing news texts also enables an understanding of the news that goes beyond surface analysis and, thus, more accurate classification. This study proposes approaches based on Attention-based Deep Multiple Instance Learning (ADMIL) for Fake News Detection (FND), which represent texts through contextual embeddings. The proposed approaches leverage the learning capabilities of the MIL model through an integrated attention mechanism that adaptively focuses on significant instances and dynamically adjusts attention weights. The ADMIL-based detection models are built on embeddings provided by state-of-the-art contextual Neural Language Models (cNLMs) such as DeBERTa, SGPT, Flair, and the GPT3.5-based ADA-002. These embeddings support the learning process by enabling the model to extract deeper meaning from the data. This study also presents a novel approach, called Conv-ADMIL, that integrates the feature extraction layers of CNNs into ADMIL to improve classification ability by improving extraction resolution. Experimental studies were conducted on two comprehensive FN datasets, LIAR and McIntire, to evaluate the effectiveness of the proposed approaches. On the McIntire dataset, the Conv-ADMIL classifier using DeBERTa embeddings achieved an F1-score of 97%, while on the LIAR dataset, using ADA-002 embeddings, it attained a performance of over 93%. These findings demonstrate that integrating the Conv-ADMIL model with the DeBERTa and ADA-002 cNLMs yields superior performance across various metrics.
As pioneering research toward new models for this domain, this study lays a foundation for future research and confirms that the proposed approaches are practical for FND tasks.
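To make the attention mechanism mentioned above concrete, the following is a minimal sketch of attention-based MIL pooling. It assumes the commonly used formulation in which each instance embedding is scored through a tanh projection and the scores are normalized with a softmax over the bag; the exact architecture used in this study may differ. The function name `attention_mil_pool`, the parameters `V` and `w`, and the random toy data are all hypothetical and serve only as illustration.

```python
import math
import random


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance embeddings into a single bag embedding.

    instances: list of K vectors of length D (e.g. contextual embeddings
               of the sentences of one news article; an assumption here).
    V: L x D projection matrix (hypothetical learned parameter).
    w: scoring vector of length L (hypothetical learned parameter).
    Returns (bag_embedding, attention_weights).
    """
    # Score each instance: w^T tanh(V h_k).
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))

    # Softmax over the instances, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag embedding: attention-weighted sum of the instance embeddings.
    dim = len(instances[0])
    bag = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(dim)]
    return bag, weights


# Toy bag: K = 5 instances with D = 4 features, random stand-ins for
# real cNLM embeddings and trained parameters.
random.seed(0)
D, L, K = 4, 3, 5
bag_instances = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
V = [[random.gauss(0, 1) for _ in range(D)] for _ in range(L)]
w = [random.gauss(0, 1) for _ in range(L)]

bag_embedding, attn = attention_mil_pool(bag_instances, V, w)
print("attention weights:", [round(a, 3) for a in attn])
```

In a full model, the bag embedding would be passed to a classifier that predicts fake versus real, and the attention weights indicate which instances contributed most to that decision.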