Despite indisputable advances in reinforcement learning (RL) research, a number of cognitive and architectural challenges remain. The primary source of these challenges in the current conception of RL is the way the theory defines states. Whereas states under laboratory conditions are tractable (owing to the Markov property), states in real-world RL are high-dimensional, continuous, and partially observable. Hence, effective learning and generalization can only be guaranteed if the subset of reward-relevant dimensions is correctly identified for each state. Moreover, the computational discrepancy between model-free and model-based RL methods creates a stability-plasticity dilemma: when multiple interacting and competing systems, each implementing a different type of RL method, are at work, it is unclear how optimal decision-making control should be guided. By presenting behavioral results showing that human subjects flexibly define states in a reversal learning paradigm, contrary to the predictions of a simple RL model, we argue that these challenges can be met by infusing the RL framework, as an algorithmic theory of human behavior, with the strengths of the attractor framework at the level of neural implementation. Our position is supported by the hypothesis that 'attractor states', which are stable patterns of self-sustained and reverberating brain activity, are a manifestation of the collective dynamics of neuronal populations in the brain. With its capacity for pattern completion and its ability to link events in temporal order, an attractor network is relatively insensitive to noise, allowing it to cope with the sparse data that are characteristic of high-dimensional, continuous real-world RL.
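To make the contrast concrete, the kind of "simple RL model" typically fit to a reversal learning paradigm is a delta-rule learner with softmax action selection over fixed options. The sketch below is a generic illustration only; the reward probabilities, learning rate, inverse temperature, and reversal point are hypothetical placeholders, not values from our task or model fits. The point of contrast is that such a model keeps the state and action sets fixed throughout, whereas our behavioral argument is that subjects restructure the state representation itself when contingencies reverse.

```python
import numpy as np

# Illustrative sketch of a simple model-free RL agent on a two-armed bandit
# with a single reversal. All parameter values are hypothetical placeholders.
rng = np.random.default_rng(0)

n_trials = 200
reversal = 100                        # trial at which contingencies reverse
alpha = 0.1                           # learning rate of the delta rule
beta = 5.0                            # inverse temperature of the softmax
p_reward = np.array([0.8, 0.2])       # P(reward | option) before reversal

Q = np.zeros(2)                       # action values for the two options
choices, rewards = [], []

for t in range(n_trials):
    if t == reversal:
        p_reward = p_reward[::-1]     # the better option becomes the worse one

    # softmax choice over current action values
    p_choice = np.exp(beta * Q) / np.sum(np.exp(beta * Q))
    a = rng.choice(2, p=p_choice)

    r = float(rng.random() < p_reward[a])   # stochastic binary reward
    Q[a] += alpha * (r - Q[a])              # model-free delta-rule update

    choices.append(a)
    rewards.append(r)

print("mean reward before reversal:", np.mean(rewards[:reversal]))
print("mean reward after reversal :", np.mean(rewards[reversal:]))
```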
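The pattern-completion and noise-tolerance properties we appeal to can likewise be illustrated with a textbook Hopfield-type attractor network. The following sketch is again only an illustration under assumed settings (network size, number of stored patterns, and noise level are arbitrary choices, not taken from any model reported here): a partially corrupted cue relaxes back into the stored attractor state.

```python
import numpy as np

# Minimal sketch of pattern completion in a Hopfield-type attractor network.
# Network size, stored patterns, and noise level are arbitrary illustrations.
rng = np.random.default_rng(1)

n_units = 100
n_patterns = 3

# random bipolar (+1/-1) patterns stored as attractor states
patterns = rng.choice([-1.0, 1.0], size=(n_patterns, n_units))

# Hebbian outer-product weight matrix, no self-connections
W = patterns.T @ patterns / n_units
np.fill_diagonal(W, 0.0)

# degrade one stored pattern by flipping 20% of its units (noisy, partial cue)
cue = patterns[0].copy()
flip = rng.choice(n_units, size=n_units // 5, replace=False)
cue[flip] *= -1

# asynchronous updates let the state relax into the nearest attractor
state = cue.copy()
for _ in range(5):                      # a few sweeps over all units
    for i in rng.permutation(n_units):
        state[i] = 1.0 if W[i] @ state >= 0 else -1.0

overlap_before = np.mean(cue == patterns[0])
overlap_after = np.mean(state == patterns[0])
print(f"overlap with stored pattern: cue {overlap_before:.2f} "
      f"-> settled {overlap_after:.2f}")
```

In this toy setting the degraded cue typically settles back onto the stored pattern, which is the sense in which attractor dynamics are insensitive to noise and can complete sparse or partial input.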