In this paper, we study a firm's dynamic pricing problem in the presence of unknown and time-varying heterogeneity in customers' preferences for quality. To address this heterogeneity, the firm offers both a standard product and a premium product. First, we consider a benchmark case in which the transition structure of customer heterogeneity is known. In this case, we analyze the firm's optimal pricing policy and characterize its key structural properties. We then investigate the case in which the market transition structure is unknown and design a simple, practically implementable policy, called the bounded learning policy, which combines two policies that each perform poorly in isolation. Measuring performance by regret (i.e., the revenue loss relative to a clairvoyant who knows the underlying changes in the market), we prove that our bounded learning policy achieves the fastest possible convergence rate of regret in terms of the frequency of market shifts. Thus, our policy performs well without relying on precise knowledge of the market transition structure.