Facebook teaches its AI to understand objects in its surroundings

First-person video taught the AI to discern typical kitchen interactions.


A team of researchers from the University of Texas at Austin and Facebook AI Research investigated a way for AI to understand how we interact with our surroundings. The technique, Ego-Topo, breaks video down into activities first and then zones within a kitchen, according to VentureBeat.

Instead of treating the footage as one continuous stream, as AI systems usually do, the team looked at how “visits” to particular areas helped the program understand the activities associated with, say, a sink and even predict what one might do at a sink later.

What did they uncover? — A topological map revealed activity hot zones as well as the common activities performed there, essentially cutting out the filler footage of the first-person videos used. Unlike other methods used in AI video mapping, this technique presented each kitchen “as an action-affording space” instead of just a place where the person happened to be.
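
For readers who want a feel for what such a map looks like as data, here is a minimal Python sketch. It is not the researchers' code: it assumes each visit has already been labeled with a zone and an action, whereas Ego-Topo has to work the zones out from the video itself. The function names and the sample visits are hypothetical, made up for illustration.

```python
from collections import Counter, defaultdict


def build_zone_graph(visits):
    """Build a toy topological map from an ordered list of (zone, action) visits.

    Assumption: zones and actions are given as labels. Returns the action
    counts per zone (the nodes) and the counts of moves between zones (the edges).
    """
    actions = defaultdict(Counter)  # zone -> how often each action happened there
    transitions = Counter()         # (from_zone, to_zone) -> number of moves
    previous_zone = None
    for zone, action in visits:
        actions[zone][action] += 1
        # Only count an edge when the person actually changes zones.
        if previous_zone is not None and previous_zone != zone:
            transitions[(previous_zone, zone)] += 1
        previous_zone = zone
    return actions, transitions


def likely_actions(actions, zone, k=2):
    """Return the k actions seen most often in a zone."""
    return [name for name, _ in actions[zone].most_common(k)]


# Hypothetical visit log for one kitchen session.
visits = [
    ("sink", "wash"), ("counter", "chop"), ("stove", "stir"),
    ("sink", "wash"), ("sink", "fill"), ("counter", "chop"),
]
actions, transitions = build_zone_graph(visits)
print(likely_actions(actions, "sink", k=1))
print(transitions[("sink", "counter")])
```

The nodes capture what tends to happen in each zone, and the edges capture how often someone moves from one zone to another, which is roughly the kind of information a map like this would use to guess what a person will do next at the sink.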

The experiments followed people making specific recipes as well as people just casually using a kitchen over time. The information from the egocentric videos also allowed researchers to see how parts of a kitchen relate to each other and determine similarities between different kitchens.

Why this is a big deal — The researchers believed that both regular and 3D video of a space did a poor job of recognizing actionable objects, using the example of a cutting board’s similarity to a patch of wood flooring. This new paper furthers Facebook’s quest to find real-world applications for AI. The team expects its technique could be used to guide humans (using AR) or robots through an unfamiliar space, helping the latter understand how that space is actually used.
