We present the Contextual Bandits based Online Markdown Pricing (COMP) model, which maximizes gross margin and clears significant inventory in a real-world, non-stationary e-commerce environment. The COMP model handles challenges such as data scarcity and dynamic variables, and provides optimal markdown prices across multiple styles with finite inventories. Using a suite of Contextual Bandits algorithms (LinUCB, Vowpal Wabbit (VW), Contextual Thompson Sampling, and Bayes UCB), the COMP model formulates a dual-objective function that cumulatively optimizes margin and Inventory Reduction Rate (IRR) over the entire markdown period. A key contribution of this paper is our formulation of the markdown pricing problem as a cumulative model that pairs a global objective with a global inventory constraint, akin to a knapsack problem. A second contribution is the objective function itself, which integrates both margin and IRR. The COMP model addresses practical challenges in online markdown pricing, including limited data availability, competitor prices, fluctuating seasonality, inventory changes, demand fluctuations, personalized pricing for customer segments, inter-item effects, and learning from real-time feedback. By learning from feedback, the model continuously adjusts product prices based on customer behavior, improving pricing decisions over time. The COMP model can also optimize pricing for various real-world use cases while satisfying the inventory constraints. It is a deployable system that can be scaled across diverse product ranges using cloud technologies. In our evaluation, the VW online cover solution yields a 17.24% increase in sales units and a 6.14% improvement in margin, while taking customer and competitor effects into account leads to an 18.65% increase in sales units.
Our results show that the COMP model outperforms alternative approaches, including non-cumulative Contextual Bandit (CB) models, Reinforcement Learning (RL) models, and classical optimization models. Overall, the COMP model provides a comprehensive solution for online markdown pricing that addresses practical challenges and offers improved pricing strategies for enhanced profitability.
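To make the idea of a contextual bandit with a blended margin and IRR reward more concrete, the following is a minimal sketch and not the COMP implementation. It uses a disjoint LinUCB learner with one arm per markdown level. The reward weighting, the normalization of margin, the definition of IRR as units sold divided by inventory on hand, the context features, and the toy demand model are all assumptions made for illustration. The sketch also scores each period separately, so it does not reproduce the cumulative objective or the global inventory constraint described above.

```python
import math
import random


class LinUCBArm:
    """One candidate markdown level with its own ridge-regression reward model."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha  # exploration strength (hypothetical default)
        # Inverse of the ridge prior A = I, kept up to date incrementally.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _A_inv_times(self, v):
        return [sum(a * vi for a, vi in zip(row, v)) for row in self.A_inv]

    def ucb(self, x):
        """Estimated reward plus an upper-confidence bonus for context x."""
        theta = self._A_inv_times(self.b)
        A_inv_x = self._A_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, A_inv_x)), 0.0))
        return mean + self.alpha * width

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of (A + x x^T)^-1; A_inv is symmetric.
        A_inv_x = self._A_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, A_inv_x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= A_inv_x[i] * A_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def blended_reward(units, price, unit_cost, list_price, inventory, weight=0.5):
    """Assumed per-period reward: weight * normalized margin + (1 - weight) * IRR.

    Requires inventory > 0 and list_price > unit_cost.
    """
    margin_norm = units * (price - unit_cost) / (inventory * (list_price - unit_cost))
    irr = units / inventory  # share of on-hand inventory cleared this period
    return weight * margin_norm + (1.0 - weight) * irr


def run_toy_markdown(periods=20, inventory=500, seed=0):
    """Simulate a short markdown season and return the unsold inventory."""
    rng = random.Random(seed)
    list_price, unit_cost = 100.0, 40.0
    discounts = [0.1, 0.3, 0.5]  # hypothetical markdown levels (arms)
    arms = [LinUCBArm(dim=2) for _ in discounts]
    start_inventory = inventory
    for _ in range(periods):
        if inventory == 0:
            break
        # Context: bias term plus the fraction of inventory still unsold.
        x = [1.0, inventory / start_inventory]
        k = max(range(len(arms)), key=lambda i: arms[i].ucb(x))
        price = list_price * (1.0 - discounts[k])
        # Toy demand: deeper discounts sell more units, with noise.
        demand = int(rng.uniform(10, 30) * (1.0 + 2.0 * discounts[k]))
        units = min(demand, inventory)  # sales can never exceed stock on hand
        reward = blended_reward(units, price, unit_cost, list_price, inventory)
        arms[k].update(x, reward)
        inventory -= units
    return inventory
```

Calling `run_toy_markdown()` runs one simulated season. Raising `weight` in `blended_reward` favors margin, while lowering it favors inventory clearance, which is one simple way to express the trade-off between the two objectives.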