Is Model-Free Learning Nearly Optimal for Non-Stationary RL?