Mark Zuckerberg Asks, 'Can We Teach A.I. to Read "Alice in Wonderland"?'

It's literally reading children's books.


After taking on the ancient game of Go last month, Facebook’s artificial intelligence team continues to set its own bar higher and higher: On Thursday, Mark Zuckerberg (who’s currently hard at work on his own personal-assistant A.I.) announced that his researchers are teaching A.I. to predict missing words in children’s stories.

“Can we teach A.I. to read Alice in Wonderland?” Zuckerberg asked in a Facebook post.

This A.I. is trained to predict how to fill in the gaps in incomplete sentences, but it would be able to do much more than just ____ in ____ blanks. Computers have had this capacity in juvenile form for a while now, Zuck explains. They’ve been capable of predicting “simple words like ‘on’ or ‘at’ and verbs like ‘run’ or ‘eat,’” but when it comes to nouns, these computers have long struggled.

So, these Facebook researchers set up a test that they could then figure out how to pass with flying colors: “The Children’s Book Test.” (Facebook also released a 1.6 GB .tgz file of all the children’s books that the A.I. studied in the test, available as an instant download.)

The A.I. is also trained to respond to questions about news stories, and these advancements will, Zuck says, go a long way toward developing Facebook Messenger’s chat assistant, nicknamed “M.”

The idea is that other A.I. developers will take up this project, too, and use the same massive 1.6 GB dataset.

From the paper that the researchers released to the public:

The test requires predictions about different types of missing words in children’s books, given both nearby words and a wider context from the book. Humans taking the test predict all types of word with similar levels of accuracy. However, they rely on the wider context to make accurate predictions about named entities or nouns, whereas it is unimportant when predicting higher-frequency verbs or prepositions.

Obviously, neither a human nor a computer can do this completely blind — both subjects need a bit of context. Too much context, though, and the computer gets overwhelmed; too little, and the computer flails in confusion.
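To make the setup concrete, here is a minimal sketch of a fill-in-the-blank question of the kind described above. The function names, the toy passage, and the candidate list are all made up for illustration, and the frequency-counting guesser is a deliberately naive stand-in, not the model from the researchers' paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def make_cloze(sentences, target):
    """Treat every sentence but the last as context and blank out
    `target` in the last sentence (hypothetical helper)."""
    *context, query = sentences
    pattern = rf"\b{re.escape(target)}\b"
    blanked, replaced = re.subn(pattern, "_____", query, count=1)
    if replaced == 0:
        raise ValueError(f"{target!r} does not appear in the final sentence")
    return {"context": " ".join(context), "query": blanked, "answer": target}


def guess_by_frequency(question, candidates):
    """Pick the candidate that occurs most often in the wider context.
    Ties go to whichever candidate is listed first."""
    counts = Counter(tokenize(question["context"]))
    return max(candidates, key=lambda word: counts[word.lower()])


# Toy passage written for this example; it is not taken from the dataset.
passage = [
    "Alice sat by the river with her sister.",
    "A white rabbit ran past Alice.",
    "The rabbit pulled a watch out of its pocket.",
    "Alice followed the rabbit down the hole.",
]

question = make_cloze(passage, "rabbit")
guess = guess_by_frequency(question, ["rabbit", "sister", "river", "watch"])
print(question["query"])
print(guess, guess == question["answer"])
```

The guesser never looks at the sentence with the blank; it leans entirely on the wider context, which is the kind of information the paper says matters most for nouns and named entities.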

"Circled phrases indicate all considered windows; red ones are the ones corresponding to the returned (correct) answer; the blue windows represent the queries."

Zuck wants to one day enslave this A.I. to Facebook Messenger, meaning Facebook will have a chat assistant at least on par with that of Google.

Another test, buried deep in the published paper, gives a glimpse into what could be one of M’s many planned features. The researchers trained the A.I. to learn from a dataset of 93,000 news articles from CNN. Each article in the dataset has a “question derived from a bullet point summary accompanying” it, and the answer to that question is always a named entity — a proper noun. The A.I. did fairly well on that test: under certain conditions, it “greatly surpasses the state-of-the-art.”
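As a rough illustration of that news test, the sketch below turns a bullet-point summary into a question by hiding a named entity, then guesses the answer by checking which entity in the article is surrounded by the most words from the question. Everything here is an assumption made for the example: the function names, the "XXXXX" placeholder, the window size, and the invented article with fictional names. It is a simple word-overlap heuristic, not the method from the paper.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in re.findall(r"[A-Za-z']+", text)]


def question_from_bullet(bullet, entity, placeholder="XXXXX"):
    """Hide the first mention of `entity` in a summary bullet point."""
    if entity not in bullet:
        raise ValueError(f"{entity!r} is not in the bullet point")
    return bullet.replace(entity, placeholder, 1)


def answer_by_window_overlap(article, question, entities, half_width=4):
    """Score each mention of a (single-word) entity by how many of the
    surrounding words also appear in the question; return the best entity.
    Common words such as "the" count too, which keeps the sketch short."""
    tokens = tokenize(article)
    query_words = set(tokenize(question))
    names = {e.lower(): e for e in entities}
    best, best_score = None, -1
    for i, token in enumerate(tokens):
        if token not in names:
            continue
        # Words just before and just after this mention of the entity.
        window = tokens[max(0, i - half_width):i] + tokens[i + 1:i + 1 + half_width]
        score = sum(1 for word in window if word in query_words)
        if score > best_score:
            best, best_score = names[token], score
    return best


# Invented article and summary; the people named here are fictional.
article = (
    "Rivera told reporters the city would appeal the ruling. "
    "Okafor wrote a public letter to residents about the flood."
)
bullet = "Okafor wrote a letter about the flood"

question = question_from_bullet(bullet, "Okafor")
answer = answer_by_window_overlap(article, question, ["Rivera", "Okafor"])
print(question)
print(answer)
```

Because the answer is always one of the entities named in the article, the guesser only has to rank a short list of candidates rather than produce free text.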

The chat assistant could therefore, presumably, answer questions about current news stories. (“What’s the name of the CEO who just publicly resisted the government?” “Tim Cook.”) And when A.I.s are writing news stories and responding to inquiries about news stories, there’ll be absolutely no need for the fine folks at Inverse to even exist anymore. Until then, we’ll approximate our future A.I. overlords’ capacities to the best of our human abilities.