When children first learn to crawl, walk, and run, the process is full of trial and error, expressed with frustrated cries and bumped heads.
This tender learning process from early childhood may seem like an innately human experience, but it’s actually incredibly similar to what engineers at the University of California, Berkeley sent their bipedal robot Cassie through in order to teach it to walk.
Dancing and fighting robots, like those made by Boston Dynamics and the parodies they have inspired, have taken the internet by storm in the past few years. But what these videos don’t show are the fine-tuned, choreographed movements often lurking in their code.
Zhongyu Li is a Ph.D. candidate at the University of California, Berkeley studying robotic locomotion. He tells Inverse that while dancing robots might look cool, programming these kinds of movements to work in an uncontrolled environment (like a campus tour or even a disaster zone) would be a logistical nightmare.
“The reason is that building a precise model of something like a Cassie is very challenging because Cassie is [a system] with lots of degrees of freedom,” Li explains, referring to the many independent variables that describe Cassie’s motion. “It is not computationally feasible to compute the entire [movement] model online for the real-time control.”
Perhaps an A.I. needs to be treated like a child to take its first steps. Cassie’s first steps through campus are not only a moment of pride for its doting parents but also an important moment for robotic locomotion, thanks to the reinforcement learning whizzing through Cassie’s “brain.”
What is reinforcement learning?
Although it is not yet a common tool for robotic locomotion, reinforcement learning is a long-standing staple of machine learning in many other applications, including path planning in self-driving cars and the strange visual works of Google DeepMind.
Historically, reinforcement learning can be traced back to the 1960s, when Britain’s “father of artificial intelligence,” Donald Michie, designed a pole-balancing experiment in which a cart learned to keep a pole balanced upright.
This kind of machine learning works by teaching a robot or an algorithm to complete a task by giving it examples and then letting it work through trial and error to replicate, or even improve on, the example. Successful trials are rewarded by “weighting,” or emphasizing, those actions in the algorithm, not necessarily by giving Cassie a present. (It’s hard to give a robot a cookie it can use, after all.)
You can imagine this learning process like the one children undertake when learning that the stovetop is hot or when learning to walk or crawl, says Li.
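To make the trial-and-error idea concrete, here is a minimal sketch of tabular Q-learning, one classic reinforcement learning method. It is an illustration only, not the method used on Cassie: the toy "corridor" world, the reward of 1.0 for reaching the far end, and every name and parameter below are assumptions chosen for the example.

```python
import random

# Hypothetical toy world: a corridor of 6 cells; the learner starts in cell 0
# and is rewarded only when it reaches the last cell.
N_STATES = 6
GOAL = N_STATES - 1
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right


def step(state, action_index):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), GOAL)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q, state):
    """Pick the action currently believed to be best in this state."""
    return 1 if q[state][1] > q[state][0] else 0


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error (assumed hyperparameters)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of each attempt
            # Explore at random sometimes, or whenever nothing is known yet.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = greedy_action(q, state)
            next_state, reward, done = step(state, action)
            # "Weight" the action: nudge its value toward what it led to.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    q_table = train()
    print([greedy_action(q_table, s) for s in range(GOAL)])
```

Early attempts wander at random, much like a toddler's first steps. Once an attempt reaches the goal, the actions that led there are valued more highly and are chosen more often in later attempts.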
In particular, the deep reinforcement learning that Cassie does works by using a neural network (a series of algorithms that mimic human neurons to intelligently find patterns in data) to process and iterate through training examples and trial and error in new environments.
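As a rough picture of a neural network learning by trial and error, the sketch below uses a tiny one-hidden-layer network as a "policy" that turns an observation into an action, and improves it with simple hill climbing (randomly tweak the weights, keep the tweak only if the reward goes up). This is a deliberately simplified stand-in, not the training algorithm from the Berkeley study; the toy task, network size, and all names are assumptions.

```python
import math
import random


def make_policy(n_in, n_hidden, rng):
    """Random starting weights for a one-hidden-layer network."""
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return w1, w2


def act(policy, obs):
    """Forward pass: observation -> tanh hidden units -> action in [-1, 1]."""
    w1, w2 = policy
    hidden = [math.tanh(sum(w * x for w, x in zip(row, obs))) for row in w1]
    return math.tanh(sum(w * h for w, h in zip(w2, hidden)))


def episode_reward(policy):
    """Toy task: steer a 1-D point toward 0; reward is negative distance."""
    total = 0.0
    for start in (-1.0, -0.5, 0.5, 1.0):
        pos = start
        for _ in range(10):
            pos += 0.2 * act(policy, [pos, 1.0])  # second input acts as a bias
            total -= abs(pos)
    return total


def perturb(policy, rng, scale=0.1):
    """Return a copy of the policy with small random changes to every weight."""
    w1, w2 = policy
    return ([[w + rng.gauss(0, scale) for w in row] for row in w1],
            [w + rng.gauss(0, scale) for w in w2])


def train(iterations=200, seed=0):
    """Hill climbing: keep a random tweak only when it earns more reward."""
    rng = random.Random(seed)
    policy = make_policy(2, 4, rng)
    initial = best = episode_reward(policy)
    for _ in range(iterations):
        candidate = perturb(policy, rng)
        reward = episode_reward(candidate)
        if reward > best:
            policy, best = candidate, reward
    return policy, initial, best


if __name__ == "__main__":
    _, before, after = train()
    print(f"reward before: {before:.3f}, after: {after:.3f}")
```

Real deep reinforcement learning systems use far larger networks and gradient-based updates, but the loop is the same in spirit: try something, measure the reward, and keep what works.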
How did reinforcement learning make its way to Cassie?
Many roboticists have tried to build robotic locomotion by simplifying their robot model. Li says this can backfire, especially when that robot is introduced to the real world complete with tricky variables like gravity and friction. Like a baby learning to walk, it might faceplant — but unlike a baby, it won’t learn from its mistakes easily.
Instead, Li and colleagues took a different approach to designing Cassie’s motion in their recent pre-print study, which he says has been accepted for presentation at the International Conference on Robotics and Automation this summer.
The gist of the research: using reinforcement learning instead of strictly modeled or pre-programmed locomotion allows Cassie to respond better to dynamic changes in its environment and adapt to new situations — essential skills for a young robot on the go.
Yunzhu Li (no relation), a Ph.D. candidate at the MIT Computer Science and Artificial Intelligence Laboratory who was not involved in the Cassie project, says that the research into Cassie’s movements moves reinforcement learning forward.
“The work is quite impressive, which achieves the state-of-the-art on using RL to control human-sized bipedal robots,” he tells Inverse. “The controller can adapt to changes in the environment and enable walking at different velocities and heights while also being robust to significant external perturbations.”
However, the movements of Cassie still haven’t caught up to the Boston Dynamics line of robots, Yunzhu Li says.
“Compared to the videos released from Boston Dynamics, the bipedal Cassie robot operates at a low speed with relatively conservative motions on a plane, where the gap between simulation and the real world is relatively small,” Yunzhu Li says.
“If the Cassie robot wants to perform more aggressive motions, more works may be needed to bridge the simulation-to-reality gap and have explicit modeling of the surrounding environment.”
How did Cassie learn to walk?
When it comes to Cassie, Zhongyu Li says that the team trained it first in a simulation using a library of gaits before taking Cassie out into the real world.
Like learning to ride a bike with training wheels before you hit the road on a true two-wheeler, Zhongyu Li explains that teaching Cassie to walk in a simulation first was an important step. Starting in simulation allowed it to learn and iterate on the gait libraries without being (literally) knocked down by minor inconveniences, like uneven floors.
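One way to picture a gait library is as a table of reference motions indexed by a command such as walking speed, with in-between commands blended from neighboring entries. A simulated robot can then be rewarded for staying close to the reference. The sketch below illustrates that idea only: the speeds, joint-angle values, reward formula, and function names are all hypothetical and are not taken from the Berkeley study.

```python
import math
from bisect import bisect_right

# Hypothetical library: walking speed (m/s) -> reference joint angles (radians).
GAIT_LIBRARY = {
    0.0: [0.00, 0.10, -0.20],
    0.5: [0.10, 0.30, -0.40],
    1.0: [0.20, 0.50, -0.60],
}


def reference_gait(speed):
    """Linearly interpolate a reference pose for the commanded speed."""
    speeds = sorted(GAIT_LIBRARY)
    # Clamp the command to the range the library covers.
    speed = min(max(speed, speeds[0]), speeds[-1])
    hi = min(bisect_right(speeds, speed), len(speeds) - 1)
    lo = max(hi - 1, 0)
    if speeds[hi] == speeds[lo]:
        return list(GAIT_LIBRARY[speeds[lo]])
    t = (speed - speeds[lo]) / (speeds[hi] - speeds[lo])
    low_pose, high_pose = GAIT_LIBRARY[speeds[lo]], GAIT_LIBRARY[speeds[hi]]
    return [a + t * (b - a) for a, b in zip(low_pose, high_pose)]


def imitation_reward(actual_pose, speed):
    """Reward in (0, 1]: higher when the pose is closer to the reference."""
    reference = reference_gait(speed)
    error = sum((a - r) ** 2 for a, r in zip(actual_pose, reference))
    return math.exp(-error)


if __name__ == "__main__":
    print(reference_gait(0.25))
    print(imitation_reward([0.0, 0.0, 0.0], 0.25))
```

Because everything here runs in software, a learner can attempt thousands of poses and collect rewards without a single real-world fall, which is the appeal of starting in simulation.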
With this fundamental walking knowledge underfoot, Cassie could then be brought into the real world to put its experience to the test.
The team tested Cassie’s motion, including fast walking, crouched walking, and recovering from obstacles, the last of which involved researchers prodding Cassie with a stick or gently hitting it with a plank of wood.
In their study, the team reports that Cassie performed on par with or better than model-driven robots doing the same task.
Should all robots learn like children?
So, does that mean we should place all our robots in elementary school to be taught how to navigate the real world? Maybe not, says Zhongyu Li, because neural networks are inherently a black box when it comes to how they make their decisions.
“The entire kind of network is unexplainable,” says Zhongyu Li. “Which means that if it fails, you don’t [know] which part makes it fail or not.”
Instead, both Lis separately suggest that a mixed approach that uses both model-based locomotion and reinforcement learning might be a better bet.
“I believe the future of robotic locomotion should be some combination of model-based and reinforcement learning-based methods,” Yunzhu Li says. “Both directions have their pros and cons, and it would be great to achieve the best of both worlds.”
Such a robot — especially a bipedal robot like Cassie — would have an easier time integrating into human environments, Zhongyu Li says, and would be ready to take on its first job delivering packages or even responding to disasters like collapsed buildings.