family tree

Scientists may have just uncovered the origin of an ancient language

A prehistoric 23andMe.

Longji Terraces in Guilin, Guangxi Zhuang Autonomous Region, China.
Zhihong Zhuo/Moment/Getty Images

Every boring email we type or moment of small talk we have at the grocery store is part of a historic and mysterious legacy: the creation of language.

The kind of languages we speak — from Arabic to Mandarin and English — feel like immovable constants in our lives, but in reality, these languages are shifting and transforming at every moment.

While the spread of slang through apps like TikTok or WeChat may seem like a modern phenomenon, new research published in the journal Nature on Wednesday uses genetic, archaeological, and linguistic data to demonstrate that this transformation can be traced back much further — all the way to 2000 B.C.E.

The Transeurasian language family the researchers focused on has connections to modern-day Japanese, Korean, Tungusic, Mongolic, and Turkic.

Today the markets of Istanbul may look quite different from the shops of Seoul, but a new study shows that these cultures may in fact have more in common than meets the eye.


By tracking the transformation of ancient Transeurasian language, this research can help scientists not only better understand how language changes, but how its speakers change along with it.

Martine Robbeets is a linguist from the Max Planck Institute for the Science of Human History and the first author of the new paper. She explains that the significance of this study is that it shows how powerful linguistics can be when used in collaboration with other disciplines including genetics and archaeology.

“I think that the novelty of the research is not so much in applying one single method, but in bringing different methods and different disciplines together,” Robbeets tells Inverse. “[These questions] cannot be answered with linguistics alone.”

What’s new — The spread of Transeurasian languages, jointly known as “Altaic,” and their linguistic ties to one another have been a topic of hot debate for years among linguists who study prehistoric languages. Primarily, Robbeets explains, the debate boils down to whether the languages under the Altaic umbrella are similar because they share a common, ancient ancestor or simply because these cultures interacted and borrowed words from one another.

Based on this latest work as well as 20 years of research on the topic, Robbeets says it’s likely a bit of both.

Prior to this latest research, linguists studying the spread of these historic Transeurasian languages had believed in something called the “pastoralist hypothesis,” Robbeets and colleagues explain in their paper.

Researchers used to believe that the Transeurasian languages spread through the riding of horses across the steppe, but that’s only part of the picture, a new study finds.

Anastasiia Shavshyna/E+/Getty Images

Essentially, this hypothesis proposes that Transeurasian languages spread from west to east via horses and the movement of nomadic peoples across the region. However, based on the evidence collected in this new study, Robbeets and colleagues suggest that this view of history doesn’t quite add up.

This discrepancy comes in part because the migration of these ancient peoples began long before the tradition of horse riding, Robbeets explains.

Instead, Robbeets and colleagues suggest that the language dispersed through the spread of agriculture and farming some 9,000 years ago, long before nomads began riding horses.

“Before 3,200 years ago, there was no horse pastoralism whatsoever,” she says.

Why it matters — There are no clear-cut answers in this field of work, Robbeets says. For every answer researchers find, twice as many new questions appear in its place, but that’s all part of the fun of it.

One of the biggest takeaways from their work, says Robbeets, is the idea that Transeurasian languages can be represented as a genealogical group with a common ancestor. This classification will help future researchers search for this lost ancestor and discover how the lives of its speakers led to the explosion of language and diversity we know today.

What they did — To come to this conclusion, Robbeets and colleagues first had to collect and analyze a lot of data, including:

  • 254 basic vocabulary concepts for 98 Transeurasian languages
  • 172 archaeological features for 255 Neolithic and Bronze Age sites, including ancient grains
  • Genomic data from 23 authenticated ancient individuals

The researchers then compared their findings to discover overlapping similarities that pointed toward shared experiences of these ancient peoples. Through this work, the team found that Amur ancestry (i.e. ancient people who lived by the Amur River in northeastern China) had connections to a shared core language (including similar words for things like ‘tea’) and archaeological artifacts of cultivated millet.

Based on this overlap, Robbeets says the ancestral home of the Transeurasian languages can be traced back to millet farmers in the Liao valley of northeast China.

Transeurasian languages spread through the development of agriculture, such as rice farming, researchers find.

Kiatanan Sugsompian/Moment/Getty Images

“It turns out that exactly where our linguistic data suggested the root in the west Liao river region is exactly the place of millet domestication, exactly 30,00 years ago,” Robbeets says. “So there the linguistics and the archaeology kind of map very nicely onto each other.”

After this, the researchers say the language spread through Eurasia and transformed through the spread of agriculture, such as rice farming.

What’s next — One of the next big challenges for this research will be to learn more about who these Amur people were and what their lives were like, Robbeets says.

But for her part, she’s also interested in taking a break from searching for similarities to instead learn more about how these more modern languages differ and where they might have broken apart from a single mother tongue.

Abstract: The origin and early dispersal of speakers of Transeurasian languages—that is, Japanese, Korean, Tungusic, Mongolic and Turkic—is among the most disputed issues of Eurasian population history. A key problem is the relationship between linguistic dispersals, agricultural expansions and population movements. Here we address this question by ‘triangulating’ genetics, archaeology and linguistics in a unified perspective. We report wide-ranging datasets from these disciplines, including a comprehensive Transeurasian agropastoral and basic vocabulary; an archaeological database of 255 Neolithic–Bronze Age sites from Northeast Asia; and a collection of ancient genomes from Korea, the Ryukyu islands and early cereal farmers in Japan, complementing previously published genomes from East Asia. Challenging the traditional ‘pastoralist hypothesis’, we show that the common ancestry and primary dispersals of Transeurasian languages can be traced back to the first farmers moving across Northeast Asia from the Early Neolithic onwards, but that this shared heritage has been masked by extensive cultural interaction since the Bronze Age. As well as marking considerable progress in the three individual disciplines, by combining their converging evidence we show that the early spread of Transeurasian speakers was driven by agriculture.