Google Uses Classic Video Game to Teach A.I. to "Dream"

While dreaming robots might seem like the kind of science fiction straight out of Westworld, researchers at Google DeepMind think artificial intelligence could make it a reality.

Using an UNsupervised REinforcement and Auxiliary Learning tool (UNREAL), DeepMind researchers were able to apply the same situations that cause animals to dream in artificial intelligence. The new research, recently published, also brings DeepMind closer to its goal of creating an A.I. that can teach itself at an enhanced level.

While this is a new application, the researchers used the same deep reinforcement learning methods that it used to master the game Go in January, but instead used the game Labyrinth on Atari as models.

UNREAL also augmented the reinforcement learning by asking the agent to complete two tasks in addition to gameplay completion. In the first task, the agent was asked to control pixels on a screen, teaching it visual inputs to enhance gameplay.

Auxiliary functions allow DeepMind to perform at a higher rate than before.

Google DeepMind

“This is similar to how a baby might learn to control their hands by moving them and observing the movements,” researchers explained in a blog post.

The agent also learned to predict rewards by replaying “rewarding” situations to more accurately understand the visual features that lead to them.

“To learn more efficiently, our agents use an experience replay mechanism to provide additional updates,” researchers explain in the paper. “Just as animals dream about positively or negatively rewarding events more frequently, our agents preferentially replay sequences containing rewarding events.”

And the results paid off. The new tasks helped UNREAL learn 10 times faster than previous DeepMind agents and, on average, the game performed at expert-level at 87 percent in Labyrinth.

This means the deep learning that Google promised in September would make artificial intelligence learn new skills “ten times faster” is getting even more sophisticated.

“We hope that this work will allow us to scale up our agents to ever more complex environments,” researchers said in the post.

Related Tags