The human genome is fully mapped, but whose genome is it?


There are differences between everyone’s genome, so when the human genome was mapped, who were they basing it off of?


They sequenced the DNA from several people of European, Asian and African ancestries. And in the past few decades, sequencing a genome has gone from billions of dollars down to about $1000, so many other people’s genomes have been sequenced.

Research shows that the DNA of human beings is 99.9 percent alike. That being said, for the genome project scientists used DNA from five individuals of varying ethnicities. The five people whose genomes were used in sequencing came from a pool of 21 donors. It was agreed that the donors’ names would never be made public, so we will never really know whose DNA was used.

By now a lot of people’s genomes have been mapped. The cost has come down to reasonable levels for research projects, so we are selecting interesting people to map the genomes of, as well as a lot of “standard” individuals. And we can compare the genomes of a number of people to get an average to compare against.

But if you are asking who was first, or at least who was first using a specific technique, that is a more interesting question. Firstly, we have not actually mapped the full human genome. We are able to map most of the interesting sections just fine. However, there are a number of repeating sequences in the genome which we cannot map using current techniques. We are getting better at this, but it means that no single genome mapping project can claim to be first. It is typical for these projects to start with a sperm cell donated by an anonymous donor, most likely from a sperm bank, although there are plenty of other ways to get volunteer donors anonymously.

The “human genome” you’re talking about is called a “reference genome” — everyone’s DNA is different, but early on we figured out that we need to have one genome that everyone can look at and make notes about: where genes are, what they do, how they differ from person to person, etc. So when we talk about “mapping the human genome”, we’re really talking about selecting a representative and then making notes about its features, locations of genes, etc.

Originally, we got DNA from blood from hundreds of donors, and randomly picked 20 from men and 20 from women to actually use. The vials of blood didn’t have the people’s names on them, so we don’t know who they were.

The idea was that we’d sequence the DNA from each person, line up the sequence from each, then decide what the most common DNA sequence was at each position, and keep track of the differences among the people. Person-to-person our DNA is *almost* identical, but not quite – everyone is just a bit different.
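If it helps to see that "most common letter at each position" idea spelled out, here is a toy sketch in Python. It is only an illustration, not how the actual project's software worked: the function name `consensus`, the made-up six-letter donor sequences, and the assumption that the sequences are already lined up to the same length are all mine.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Return the most common base at each position, plus the positions
    where the donors disagree (position -> count of each base seen)."""
    # Toy assumption: every sequence is already aligned to the same length.
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    consensus_bases = []
    variable_positions = {}
    # zip(*aligned_seqs) walks the sequences column by column.
    for pos, column in enumerate(zip(*aligned_seqs)):
        counts = Counter(column)
        # On a tie, most_common() keeps the base that was seen first.
        consensus_bases.append(counts.most_common(1)[0][0])
        if len(counts) > 1:
            variable_positions[pos] = dict(counts)
    return "".join(consensus_bases), variable_positions


# Four hypothetical donors, nearly identical to each other.
donors = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAT"]
seq, diffs = consensus(donors)
print(seq)    # the majority base at every position
print(diffs)  # where at least one donor differs, and how
```

With these four made-up donors, the consensus matches the majority at every position, and the two spots where one donor differs are recorded separately rather than thrown away.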

In fact, some of the samples were mishandled and we weren’t able to get much sequence information out of them, meaning that most of the DNA used came from one anonymous person. While some parts are a mix of DNA sequence from several people, lots of it is just based on DNA from a single person.

That was the first “Human Genome”. In the years that followed, DNA sequencing became much cheaper and faster as we improved how we do it and designed machines to do it. What took billions of dollars and years to do at the time can now be done in a day for a few hundred dollars.

Because it’s cheap and easy, we’ve sequenced millions of human genomes. There are teams that have done so, and made big databases that describe every tiny difference between each one they sequenced and the one that’s the reference. They’ve combined that information with the donors’ medical records so that they can use statistics to look for genes that cause diseases.

The answer you get from NCBI is that the reference genome is “pooled” from a number of men. What this means in ELI5 terms is that they mixed DNA from a lot of people together before they sequenced it. Once they mixed them together, it wouldn’t be possible to figure out which gene sequence came from which guy.

As for who these men are... back in the day it was a bit of a sport in the genetics field to accuse anyone and everyone of being such an egomaniac as to put himself in the pool. I wouldn’t believe any claim that anyone specific is in the pool, but it’s definitely not Albert Einstein or anyone famous – it’s probably a bunch of genetics guys.

But hey if you’re in the database now, you can pull up the 1000 Genomes browser and see what the most common sequence is for people from other ethnicities!