Imagine teaching a dog to fetch a ball. At first, the dog has no idea what “fetch” means. But over time, with a lot of practice and some treats as rewards, the dog learns that fetching the ball leads to a treat. This back-and-forth learning through rewards is similar to how a reinforcement learning agent learns to make decisions in complex environments.
In this section, we’ll break down the basics of reinforcement learning.
Did You Know? Rewards can be positive (for good actions) or negative (for mistakes). This helps guide the agent to learn better choices over time!
Think back to the challenge. What did you do when you first started?
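Reinforcement learning follows a basic cycle: the agent observes the current state of its environment, chooses an action, receives a reward (or penalty), and uses that feedback to make a better choice next time.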
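To make the cycle concrete, here is a minimal Python sketch of one agent-environment loop. The `Corridor` environment, its reward values, and the `run_episode` helper are all made up for illustration (they are not part of this lesson's demo), and the learning update itself is left out so the loop stays easy to read.

```python
class Corridor:
    """Hypothetical toy environment: positions 0..4, with the goal at 4."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls clamp the position
        self.position = max(0, min(4, self.position + action))
        done = self.position == 4
        reward = 10 if done else -1  # illustrative values, chosen for this sketch
        return self.position, reward, done


def run_episode(policy, max_steps=50):
    """Run one episode and return the total reward the agent collected."""
    env = Corridor()
    state, total = env.position, 0
    for _ in range(max_steps):
        action = policy(state)                   # 1. the agent chooses an action
        state, reward, done = env.step(action)   # 2. the environment responds
        total += reward                          # 3. the agent receives the reward
        # 4. a learning agent would update its policy here (omitted)
        if done:
            break
    return total


print(run_episode(lambda state: +1))  # always move right
print(run_episode(lambda state: -1))  # always move left
```

A policy that always moves right reaches the goal in four steps and scores 7, while one that always moves left never gets there and scores -50. A learning agent's job is to discover the first kind of behaviour from that reward signal alone.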
Watch a Demo: See how an RL agent learns to solve an obstacle course.
1. Self-Driving Cars
Self-driving cars use reinforcement learning to make decisions on the road. RL algorithms help cars "learn" how to drive by giving them rewards for safe driving actions (like staying in the lane, slowing down at red lights) and penalties for risky ones (like veering off the road). Over time, they improve their ability to make safe, smart decisions.
2. Robotic Control
Robots are often trained with reinforcement learning to perform tasks efficiently. They are rewarded for successfully completing a task and penalised for mistakes (like colliding with obstacles or taking a long route). With training, robots learn to be faster and more accurate.
You are a warehouse robot (blue) tasked with moving a package. Reach the goal (green) in the minimum number of steps without colliding with the obstacles (red).
Reward System:
-1 for each step.
-10 for bumping into an obstacle.
+100 for reaching the goal.
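As a rough sketch, this reward system can be turned into a score for a whole run. The `episode_return` function below is a hypothetical helper, and it assumes that every move costs -1 and that the obstacle penalty and goal bonus are added on top of the step cost. The challenge itself may count them differently.

```python
# Reward values taken from the challenge's reward system
STEP_REWARD = -1
OBSTACLE_PENALTY = -10
GOAL_REWARD = 100


def episode_return(moves, bumps, reached_goal):
    """Total reward for one run (assumes penalties stack on the step cost)."""
    total = moves * STEP_REWARD + bumps * OBSTACLE_PENALTY
    if reached_goal:
        total += GOAL_REWARD
    return total


print(episode_return(moves=8, bumps=0, reached_goal=True))    # clean, short route
print(episode_return(moves=12, bumps=1, reached_goal=True))   # longer route, one bump
print(episode_return(moves=20, bumps=3, reached_goal=False))  # never reached the goal
```

Under these assumptions the three runs score 92, 78, and -50. The numbers show why the shortest collision-free path is the best strategy: every extra step and every bump lowers the total.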