Reinforcement Learning: An Introduction (Exercise Solutions)