Crowdsourced Genetic Data Used to Create Largest Human Family Tree
Before 1850, marrying within the family was common. Today, hooking up with anyone closer than a seventh cousin is seen as too much incest, but getting it on with your fourth cousin used to be socially acceptable. Historians have long thought that societies were less squeamish about incest because people lived so close together, but a new study of a crowdsourced human family tree suggests that was not the case.
The researchers reached that conclusion using data collected from one of the hundreds of existing genealogy websites designed to help you extend the branches on your family tree. The new study, published Thursday in Science, describes how they used this unprecedented hoard of information to create the largest scientifically vetted family tree to date, which consists of 13 million people, more than the population of either Belgium or Cuba.
After the researchers used mathematical graph theory to clean and organize the data from the profiles, numerous small family trees emerged, along with a single massive tree spanning an average of 11 generations. This tree, they write, revealed sweeping changes in how people met and whom they married, along with other surprising genetic, cultural, and socio-demographic trends over the past 500 years.
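To make the idea of separate trees "emerging" from a graph concrete, here is a minimal sketch of how profiles joined by family links fall into connected components. It is an illustration only, not the study's actual pipeline: the names, the `links` pairs, and the `family_components` function are all hypothetical.

```python
from collections import defaultdict, deque


def family_components(profiles, links):
    """Group profile IDs into connected components, largest first.

    Each component is one separate family tree: everyone in it is
    reachable from everyone else through parent or spouse links.
    """
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in profiles:
        if start in seen:
            continue
        # Breadth-first search collects everyone linked to `start`.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            person = queue.popleft()
            members.append(person)
            for other in neighbors[person]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        components.append(sorted(members))
    return sorted(components, key=len, reverse=True)


# Hypothetical toy data: six profiles and four family links.
profiles = ["ann", "bob", "cat", "dan", "eve", "fay"]
links = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("dan", "eve")]

trees = family_components(profiles, links)
largest_tree = trees[0]
```

In this toy example the largest component plays the role of the single massive tree, while the smaller components correspond to the numerous small family trees.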
Changes in whom people married, and where they met their spouses, stood out as a cultural shift over time. Contrary to the idea that people only hooked up with their cousins because it was geographically convenient, the family tree showed that between 1800 and 1850 people started leaving their hometowns and settling farther away, yet they were still likely to marry a fourth cousin or closer.
This suggested to the researchers that changing social norms, not increased mobility, got people to stop marrying within the family. The distance people traveled to find a spouse, meanwhile, continued to rise. Before 1750, most Americans met their spouse within six miles of where they were born. By 1950, that distance had grown to 60 miles.
The tree revealed more than just our changing social norms, though. Technically a mathematical graph structure that connects mating and parenthood links, the family tree was created by analyzing 86 million public profiles from Geni.com, one of the largest collaborative genealogy websites. Within this enormous wealth of data was a subset of 3 million relatives, which allowed the researchers to explore another trend: the influence of “nature and nurture” on longevity.
For this part of the study, they restricted the study group to people born between 1600 and 1910 who lived past the age of 30, excluding twins and people who died because of natural disasters, the U.S. Civil War, or either of the World Wars. When they compared each individual’s lifespan to that of their relatives, they came across something surprising: Genes explained only about 16 percent of longevity variation.
That sits at the low end of previous estimates, which hold that genes account for 15 to 30 percent of longevity variation, and it indicates that good longevity genes can extend a person’s life by an average of five years.
“That’s not a lot,” senior author Yaniv Erlich, Ph.D., commented in a statement released Thursday. “Previous studies have shown that smoking takes 10 years off of your life. That means some life choices could matter a lot more than genetics.”
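The selection rules for the longevity analysis can be sketched as a simple filter. This is an assumption-laden illustration rather than the researchers' code: the `Profile` fields, the cause-of-death labels, the inclusive year bounds, and the use of whole years to approximate "lived past the age of 30" are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the causes of death the study excluded.
EXCLUDED_CAUSES = {"natural_disaster", "us_civil_war", "world_war"}


@dataclass
class Profile:
    birth_year: int
    death_year: int
    is_twin: bool = False
    cause_of_death: str = "unknown"


def in_longevity_cohort(profile):
    """Return True if the profile meets the criteria described in the article."""
    lifespan = profile.death_year - profile.birth_year
    return (
        1600 <= profile.birth_year <= 1910  # born in the study window (assumed inclusive)
        and lifespan > 30                   # lived past the age of 30 (whole-year approximation)
        and not profile.is_twin             # twins are excluded
        and profile.cause_of_death not in EXCLUDED_CAUSES
    )


# Hypothetical sample profiles.
sample = [
    Profile(1700, 1765),
    Profile(1850, 1864, cause_of_death="us_civil_war"),
    Profile(1890, 1960, is_twin=True),
    Profile(1920, 1999),
    Profile(1800, 1825),
]

cohort = [p for p in sample if in_longevity_cohort(p)]
```

Only the first sample profile passes every rule; the others are dropped for cause of death, twin status, birth year, and lifespan, respectively.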
The study authors have released their data set on the academic research site FamiLinx.org in the hope that other scientists can apply the information to other fields, like anthropology and genetics. There is room for further work, too. In this dataset, 85 percent of the profiles originated from just Europe and North America. To get a full picture of global interconnectedness, we’ll need a lot more information.