Scientists are tantalizingly close to cracking ancient texts wide open
This is autofill for antiquity.
In the legend of Odysseus, the hero hails from Ithaca, a Greek island in the Ionian Sea. The myth was famously told by the Ancient Greek poet Homer in The Odyssey and first written down sometime in the 8th century B.C.E., according to some estimates. In the story, Odysseus is the king of Ithaca and he is trying to make it home from the Trojan War.
It is one of the seminal texts of world literature, and an exceedingly rare example of an ancient written story that made it to modern times. The reason has nothing to do with production: Many ancient peoples were just as interested in documenting their histories as modern humans (although perhaps not to the extent of the Twitterati). Yet cracking the few, often fragmented pieces of writing we have left from the ancient world is difficult, especially in places devastated by colonialism.
In Odysseus’s time, many people wrote on papyrus, but some would also directly inscribe messages onto metal, stone, and pottery. These engraved texts are technically called inscriptions. As with much of ancient literature, we only have fragments of some inscriptions — many are lost to wear and tear.
But Odysseus’s Ithaca now lends its name to a neural network, essentially a kind of algorithm, that has one purpose: To restore ancient, fragmented inscriptions, and then read them. In a new paper published last week in the journal Nature, the scientists behind the AI reveal how it works to crack archaic codes.
What’s new — The program Ithaca is a deep neural network developed by a team led by epigrapher Thea Sommerschield and AI scientist and DeepMind researcher Yannis Assael (an epigrapher is someone who studies ancient inscriptions). Ithaca is designed to restore fragmented inscriptions and use machine learning to complete them — much like how some email clients now use autofill to guess what you want to write next. When used by a historian, Ithaca has up to 72 percent accuracy in completing a text.
“Just as microscopes and telescopes have extended the range of what scientists can do today, Ithaca similarly augmented and expanded capabilities to study one of the most significant periods of human history,” Assael says in a press briefing to announce the findings.
Assael and his team used an archive of more than 178,500 transcribed, ancient inscriptions to train Ithaca. The dataset is maintained by the Packard Humanities Institute. The inscriptions come from 84 ancient communities around the world and are in several different languages. The inscriptions also range in how old they are — languages change over time, after all. The team refined the dataset by adding rules, like separating different inscriptions by language, time period, and region.
This process gave Ithaca its foundation. When it is fed a new inscription, one it has never seen before, it uses what it learned in training to predict the likeliest missing words or text. These predictions are based on what it already knows about the ancient language, its peoples, and the time period and region the inscription comes from.
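To get a feel for the idea of predicting the likeliest missing text, here is a toy sketch. It is not Ithaca's method: Ithaca is a deep neural network trained on ancient inscriptions, while this example simply counts which character appears between two-character neighbours in a tiny, made-up English corpus. The function names, the `?` gap marker, and the sample sentences are all assumptions chosen for illustration.

```python
from collections import Counter

CONTEXT = 2  # characters of context on each side of the gap (arbitrary choice)

def train(corpus):
    """Count which character appears between each (left, right) context pair."""
    counts = {}
    for text in corpus:
        # "^" and "$" pad the start and end so edge characters have context too
        padded = "^" * CONTEXT + text + "$" * CONTEXT
        for i in range(CONTEXT, len(padded) - CONTEXT):
            key = (padded[i - CONTEXT:i], padded[i + 1:i + 1 + CONTEXT])
            counts.setdefault(key, Counter())[padded[i]] += 1
    return counts

def rank_candidates(counts, fragment, gap="?"):
    """Rank fillers for a single gap, most frequent first, with relative frequency."""
    padded = "^" * CONTEXT + fragment + "$" * CONTEXT
    i = padded.index(gap)
    key = (padded[i - CONTEXT:i], padded[i + 1:i + 1 + CONTEXT])
    seen = counts.get(key, Counter())
    total = sum(seen.values())
    return [(ch, n / total) for ch, n in seen.most_common()]

# Hypothetical stand-in for a training archive of transcribed texts
corpus = [
    "the people decided",
    "the council decided",
    "the council decides",
]
model = train(corpus)

# One damaged character at the end of a word
ranked = rank_candidates(model, "decide?")
print(ranked)
```

The output is a ranked list of guesses rather than a single answer, which mirrors the article's point that the tool works best when a trained historian weighs its suggestions.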
Why it matters — Inscriptions are the relics of past human thought and culture. In a sense, they’re a key to ancient civilizations’ codes. Incomplete inscriptions offer incomplete information on how those ancient societies functioned, evolved, and diminished.
“Using effects and programs like [Ithaca] would be like putting on your glasses, right?” says Jordi Alonso, a Greek classics scholar and Master’s student at Columbia University. Alonso was not involved in the study.
“It helps you see better and understand the world better.”
Ultimately, Ithaca’s success also depends on the people using it — man and machine have to work together. Alone, Ithaca’s accuracy is 62 percent in restoring damaged texts. But that accuracy level shoots up to 72 percent when a trained historian uses the algorithm. Trained historians alone have an estimated 25 percent accuracy in restoring inscriptions.
“If you’re waving around a pair of glasses, and there’s no eyes behind them, it’s not gonna do much,” Alonso says.
In the future, Ithaca may be able to help decipher and complete fragments of ancient text written on parchment or papyrus, too. For example, Alonso says, we have only ever found two complete poems written by the iconic ancient poet Sappho. With Ithaca, classicists may be able to piece together some more.
“One or two more would be completely fantastic,” Alonso says.
What’s next — Ithaca has the potential to crack open more ancient literature far from the Western Classical tradition.
“Ithaca’s architecture design makes it easily applicable to any ancient language, not just Roman Latin, but Mayan, cuneiform... And any written medium,” Sommerschield says.
Thus far, Ithaca has indicated that the Athenian Decrees, the foundational documents of the world’s first democracy, were written around 421 B.C.E. Past conventional thinking had put their date closer to 446 B.C.E. Even a shift of just 25 years helps scholars re-contextualize these documents: Now we know they were written at the same time as Athens prepared to go to war with Sparta, another Ancient Greek state that was governed as a military autocracy.
“We hope that the way we’ve designed [Ithaca] is going to be easy for [other researchers] to use because they just type in the text and then they will get all these visualizations that they can use,” Assael says.
“Selfishly, I want somebody to sic Ithaca on [undeciphered] inscriptions,” Alonso says.
“I know it’s not going to be 100 percent accurate, but it’ll be way more accurate than if we just have some guy staring at these inscriptions and just using his own human brain.”