Consider the hot dog. While it might come pretty easily to a human, training a robot to prepare and serve a hot dog is no simple task. The robot has to grab the hot dog, place it on the grill, allow it to cook for the right amount of time, place it in the bun, throw at least one condiment on and gently serve it to the waiting recipient. Researchers at Boston University were recently successful in training a robot to do just that.
By using a trial-and-error approach known as reinforcement learning (RL), the researchers developed artificial intelligence that can draw on prior experience to get better and better at the tasks it needs to complete. This A.I. could then be used to operate two robotic arms. The A.I. was tested in computer simulations before it was used to operate the physical arms.
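To make the trial-and-error idea concrete, here is a minimal sketch of tabular Q-learning on a toy, made-up version of the task. This is not the team's actual algorithm or environment; the three "steps" and the rewards are hypothetical stand-ins.

```python
import random

# Toy environment: 3 sequential steps (hypothetical stand-ins for
# "pick sausage", "place on grill", "place in bun"). Action 0 is the
# correct move at each step; action 1 is a mistake that ends the episode.
N_STEPS = 3

def step(state, action):
    """Return (next_state, reward, done) for the toy task."""
    if action == 0:
        next_state = state + 1
        done = next_state == N_STEPS
        return next_state, (10.0 if done else 1.0), done
    return state, -1.0, True  # mistake: penalize and end the episode

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STEPS)]  # Q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy: mostly exploit what was learned, sometimes explore
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] >= q[state][1] else 1
            next_state, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = train()
policy = [0 if q[s][0] >= q[s][1] else 1 for s in range(N_STEPS)]
print(policy)  # prints [0, 0, 0]: the agent learned the correct action at every step
```

After training, the mistake action carries a negative learned value at every step, which is the "never do that again" behavior the article describes.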
If the robot learned the hard way not to squeeze the bun too hard, it would never do that again in future trials. That’s pretty much how we learn as we go about life trying new things. In this way, the robot became “self-aware.” It had to learn what needed to be done, how it could be done and how to avoid repeating the mistakes it had made in earlier attempts.
The researchers developed a simple language that helped them break down each part of the task into smaller tasks. They could write something like, “Eventually turn on grill, then pick sausage and place on grill” to start a task.
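A sentence in that style can be treated as an ordered list of subtasks. The toy parser below is a guess at the flavor of the idea, not the researchers' actual language (their specifications resemble temporal-logic formulas and are far richer); the grammar here handles only "Eventually A, then B" sentences.

```python
# Hypothetical sketch: parse a task sentence into ordered subtasks and
# check whether a sequence of events satisfies it.

def parse_spec(sentence):
    """Split 'Eventually A, then B, then C' into an ordered subtask list."""
    text = sentence.strip().rstrip(".")
    if text.lower().startswith("eventually "):
        text = text[len("eventually "):]
    return [part.strip() for part in text.split(", then ")]

def satisfies(trace, subtasks):
    """Check that the subtasks occur in order somewhere in the trace."""
    i = 0
    for event in trace:
        if i < len(subtasks) and event == subtasks[i]:
            i += 1
    return i == len(subtasks)

spec = parse_spec("Eventually turn on grill, then pick sausage and place on grill")
print(spec)
# ['turn on grill', 'pick sausage and place on grill']

trace = ["move to grill", "turn on grill", "pick sausage and place on grill"]
print(satisfies(trace, spec))  # True: both subtasks happen, in order
```

The payoff of a structured language like this is that "satisfying the sentence" becomes something a learning algorithm can be rewarded for.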
Zachary Serlin, one of the researchers who worked on this project, tells Inverse that as long as you used predefined terms, the algorithm would understand what to do.
“You have to define things beforehand,” Serlin says. “As long as I know what you mean by ‘grill’ when you say ‘grill,’ then I can learn to do the thing that has ‘grill’ in it.”
Putting prior knowledge into the algorithm helped limit how the A.I. would attempt to achieve its goals. Instead of letting it try whatever it wanted while completing a task, the researchers baked in limits on what it would attempt. Serlin says others who have used RL to develop their A.I. haven’t always done this.
“In ours, because you have this nice sentence structure, you can put in things like, ‘You can’t pick up two things at once.’ That’s prior knowledge that you have,” Serlin says.
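One common way to encode that kind of prior knowledge is to mask out invalid actions before the learner ever tries them. The sketch below is a hypothetical illustration of that idea, not the team's implementation; the action names and the one-item limit are made up.

```python
# Hypothetical sketch: filter the action set using a prior-knowledge rule
# like "you can't pick up two things at once", so the rule never has to
# be discovered by trial and error.

def allowed_actions(actions, held_items, max_held=1):
    """Drop pick-up actions once the arm already holds the maximum."""
    def ok(action):
        if action.startswith("pick ") and len(held_items) >= max_held:
            return False
        return True
    return [a for a in actions if ok(a)]

actions = ["pick sausage", "pick bun", "move to grill", "release"]
print(allowed_actions(actions, held_items=["sausage"]))
# ['move to grill', 'release']
```

Shrinking the action set this way also speeds up learning, since the algorithm wastes no trials on moves that are ruled out from the start.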
Not only did the robot have to learn how to properly prepare and serve the hot dog—to a person waiting with a plate—it had to make sure it wasn’t making mistakes that could make its actions unsafe.
“There is also this level of safety on top of it, which are called control barrier functions. You let the algorithm run normally with this prior knowledge and the sentence that you give it, and… there is a layer that doesn’t let it do unsafe things,” Serlin says. “This allows it to have this level of safety without ever having to try it.”
Instead of the robotic arm trying to pick up the hot dog from the grill and accidentally forcing its grabbers into the grill, this safety layer would ensure that the robot knows it can only get so close to the grill while it tries to complete this part of the task.
Serlin says the team will keep working on this technology to see how they can teach the robot increasingly complex tasks.