STEM textbooks rely heavily on Non-textual Components (NTCs) such as charts, geometric figures, and equations to demonstrate complex underlying concepts. However, the accessibility of STEM subjects for Blind and Visually Impaired (BVIP) students is a primary concern, especially in developing countries such as India. BVIP students use assistive technologies (ATs) such as optical character recognition (OCR) and screen readers for reading and writing. While parsing, such ATs skip NTCs and rely mainly on alternative text to describe these visual components. Making NTCs accessible requires integrating effective, automated document layout parsing frameworks, which extract data from the non-textual components of digital documents, with existing ATs. A major obstacle, however, is the absence of an adequately annotated textbook dataset on which layout recognition and other vision-based frameworks can be trained. To improve the accessibility and automated parsing of such books, we introduce NCERT5K-IITRPR, a new dataset of National Council of Educational Research and Training (NCERT) school books. The dataset covers twenty-three annotated books spanning more than 5000 pages from the eighth to the twelfth standard. The labeled objects in NCERT books are structurally different from the objects in existing document layout analysis (DLA) datasets and span diverse label categories. We benchmark the NCERT5K-IITRPR dataset with multiple object detection methods. A systematic analysis of the detectors shows the label complexity of the dataset and the necessity of fine-tuning on it. We hope that our dataset helps improve the accessibility of NCERT books for BVIP students.