Sequencing the genomes of only 2% of a population is enough to match nearly every person to a relative
How large do genomic sequencing projects need to be? It turns out that after sequencing just 6.5m people in the US (or 1.3m people in the UK), nearly every person can be matched to a relative no more distant than a 3rd cousin.
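The arithmetic behind these figures can be checked with a minimal sketch. The population sizes below are rounded assumptions chosen to be consistent with the numbers quoted above (roughly 325 million for the US and 65 million for the UK); they are not values taken from the study, and the function name is purely illustrative.

```python
# Illustrative arithmetic only. The population sizes are rounded assumptions,
# not values reported in the study.
COVERAGE = 0.02  # the 2% threshold from the headline

# Hypothetical rounded population sizes (assumptions for this sketch).
populations = {"US": 325_000_000, "UK": 65_000_000}


def profiles_needed(population: int, coverage: float = COVERAGE) -> int:
    """Return how many sequenced people correspond to the given coverage."""
    return round(population * coverage)


for country, size in populations.items():
    print(f"{country}: {profiles_needed(size) / 1e6:.1f}m profiles")
```

Under these assumed population sizes, the sketch reproduces the 6.5m and 1.3m figures quoted above.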
The calculations were based on, and tested against, a real genomic database of 1.28 million individuals.
A search for shared DNA segments known as identity-by-descent (IBD) segments showed that a person could be traced to a relative ranging from a 2nd cousin to a 4th cousin. The disparity is a consequence of the different ancestry histories of people with European ancestry and people with African ancestry.
The analysis was inspired by the recent case of the Golden State Killer. After committing 13 murders in the 1970s and 1980s, he evaded capture for thirty years, until law enforcement found him through a genomic search. In a database of 1 million DNA profiles, a sample from a crime scene matched a 3rd cousin, which led to the identification and arrest of the serial killer.
As the researchers demonstrated, this method can also be used to identify the people behind anonymous genomic sequences, such as those published in the 1000 Genomes Project.
The US genomic sequencing market is growing rapidly: companies offering direct-to-consumer tests sold over 7 million kits in 2017 alone. We can expect the proposed 2% threshold to be reached within the next few years.
More: “Identity inference of genomic data using long-range familial searches”, Y. Erlich et al., 2018, doi:10.1126/science.aau4832.