Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism