Humans and robots possess very disparate perceptions of the world.

Rohan Paul, a postdoctoral researcher at MIT’s Computer Science and Artificial Intelligence Laboratory, explains that while humans are able to take spoken or written commands and reason about what is actually being asked, a robot may not necessarily be trained to do the same.

Tell a robot to catch a fish, and it will catch a fish. Tell a robot to just “go get some food,” however, and it’s probably going to be at a loss as to how exactly to accomplish such a task. Essentially, context matters. Giving a machine instructions won’t do much unless it understands how to infer what the taskmaster is assigning.

Of course, MIT scientists are on the case. Paul and his team recently developed a new A.I. system called ComText, short for “commands in context,” which is able to parse instructions and use contextual information from the past in order to accurately deduce what is being asked and how to fulfill those commands.

“Presently, robots don’t have the ability to reason about what they’ve seen in the past,” says Paul. And that’s where ComText comes in.

The group describes its work in a new paper that was presented at the Twenty-Sixth International Joint Conference on Artificial Intelligence in Australia last month.

ComText allows robots to understand contextual commands such as “Pick up the box I put down.”

Paul describes two key contributions that ComText provides to robots and A.I. systems. “One key contribution of this work is the ability to think about future actions, with the ability to reason about what happened in the past,” he says. This reasoning, using prior information to understand how to move forward, is a central goal of A.I. development.

But the other contribution is the ability to remember facts and information from the past, something that robots designed for direct applications aren’t well equipped to do.

“The robot can’t right now remember factual information,” Paul tells Inverse.

Information is collected as a mash-up of pixels and point clouds, which is used to identify objects at a tangibly descriptive level, but not at a level that might refer to, say, someone’s possession, proximity, or relation to other objects. ComText introduces the ability for knowledge to be inferred and retained over time.

For example, says Paul, “ComText gives a robot the ability to learn that this cup on the table is mine, and remember this fact, and then reason with it.”

A human can later issue an instruction like “give me my cup,” and the robot will know exactly which cup he or she is referring to, based on that context. Or someone might tell a robot to “bring it,” and the robot will know “it” refers to the object on the table, since it was the object being discussed just prior to the command.
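To make the idea concrete, here is a toy Python sketch of remembering a declared fact and later using it to resolve “my cup” and “it.” This is not ComText’s actual implementation, which the article does not detail; the class `ContextMemory`, its methods, and the object IDs are all hypothetical names chosen for illustration, and the phrase matching is deliberately simplistic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ContextMemory:
    """Hypothetical store of declared facts plus the last object discussed."""
    kind_of: Dict[str, str] = field(default_factory=dict)   # object id -> kind
    owner_of: Dict[str, str] = field(default_factory=dict)  # object id -> owner
    last_mentioned: Optional[str] = None

    def observe(self, obj_id: str, kind: str) -> None:
        """Record an object the robot has perceived, e.g. a cup."""
        self.kind_of[obj_id] = kind

    def declare_owner(self, obj_id: str, owner: str) -> None:
        """Remember a stated fact such as 'this cup is mine'."""
        self.owner_of[obj_id] = owner
        self.last_mentioned = obj_id

    def resolve(self, phrase: str, speaker: str) -> Optional[str]:
        """Map a referring phrase to an object id, or None if unresolved."""
        words = phrase.lower().split()
        if words == ["it"]:
            # "it" refers to whatever was discussed most recently.
            return self.last_mentioned
        if len(words) == 2 and words[0] == "my":
            # "my <kind>": find the one object of that kind owned by the speaker.
            matches = [
                obj for obj, kind in self.kind_of.items()
                if kind == words[1] and self.owner_of.get(obj) == speaker
            ]
            if len(matches) == 1:
                self.last_mentioned = matches[0]
                return matches[0]
        return None


memory = ContextMemory()
memory.observe("cup_1", "cup")
memory.observe("cup_2", "cup")
memory.declare_owner("cup_1", "alice")      # "This cup on the table is mine."
print(memory.resolve("my cup", "alice"))    # cup_1
print(memory.resolve("it", "alice"))        # cup_1
print(memory.resolve("my cup", "bob"))      # None
```

A real system would have to ground these facts in noisy sensor data and handle uncertainty, which this sketch ignores; it only shows why a stored fact lets a later, underspecified command pick out a single object.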

This is important not just in helping robots understand human commands better, but also in fulfilling multi-step commands more effectively. If a factory robot is told to “clean up for the day,” it needs to know precisely what those instructions mean. ComText is a step forward in allowing the robot to understand that this means cleaning specific tools, organizing things in a certain way, dusting down a few parts, and so on.

If the robot from the introduction is told to “bring food,” it will know to bring a cake that’s already been baked, and not raw flour, a carton of eggs, sugar, frosting, and other ingredients.

What’s Next for “ComText”?

For Paul and his team, ComText is a step forward in helping robots deduce insights and use facts to make inferences on a larger scale. “What we really want to do is get humans and robots together to build something more complex,” he says. He and his team want to advance ComText as a platform that can ask humans logical questions in order to collect more information that’s relevant to fulfilling tasks.

Just don’t expect to run into ComText in any device soon. “At the moment, we’re not building products,” says Paul. The team still needs to tinker around with the system in order to get it to more smoothly react to human commands and overcome other challenges. But it’s entirely conceivable something like ComText will start to become integrated into applications in the near future. “We’re making good progress.”

Photos via Tom Buehler/MIT CSAIL