Research shows that the DNA of human beings is 99.9 percent alike. That said, for the genome project scientists used DNA from five individuals of varying ethnicity. The five people whose genomes were used in sequencing came from a pool of 21 donors. It was agreed that donors’ names would never be made public, so we will never really know whose DNA was used.
By now a lot of people’s genomes have been mapped. The cost has come down to reasonable levels for research projects, so we are selecting interesting people to map as well as a lot of “standard” individuals. We can then compare the genomes of a number of people to get an average to compare against.
But if you are asking who was first, or at least who was first using a specific technique, that is a more interesting question. Firstly, we have not actually mapped the full human genome. We are able to map most of the interesting sections just fine. However, there are a number of repeating sequences in the genome which we cannot map using current techniques. We are getting better at this, but it means that no single genome mapping project can claim to be first. However, it is typical for these projects to start with a sperm cell donated by an anonymous donor, most likely from a sperm bank, although there are plenty of ways to get volunteer donors anonymously.
The “human genome” you’re talking about is called a “reference genome” — everyone’s DNA is different, but early on we figured out that we need to have one genome that everyone can look at and make notes about: where genes are, what they do, how they differ from person to person, etc. So when we talk about “mapping the human genome”, we’re really talking about selecting a representative and then making notes about its features, locations of genes, etc.
Originally, we got DNA from blood from hundreds of donors, and randomly picked 20 from men and 20 from women to actually use. The vials of blood didn’t have the people’s names on them, so we don’t know who they were.
The idea was that we’d sequence the DNA from each person, line up the sequence from each, then decide what the most common DNA sequence was at each position, and keep track of the differences among the people. Person-to-person our DNA is *almost* identical, but not quite – everyone is just a bit different.
In fact, some of the samples were mishandled and we weren’t able to get much sequence information out of them, meaning that most of the DNA used came from one anonymous person. While some parts are a mix of DNA sequence from several people, lots of it is just based on DNA from a single person.
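If it helps to see that idea concretely, here is a toy sketch in Python. It assumes the sequences are already lined up and the same length, which is the hard part in real life and is skipped here. The donor labels, the tiny sequences, and the function name are all made up for illustration; this is not the pipeline the actual project used.

```python
from collections import Counter

def consensus_and_differences(aligned):
    """Take already-aligned sequences (label -> string of equal length) and
    return the most common base at each position, plus where each donor
    differs from that consensus."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    differences = {name: [] for name in aligned}
    for pos in range(lengths.pop()):
        # One "column": the base every donor has at this position.
        column = {name: seq[pos] for name, seq in aligned.items()}
        # On a tie, Counter keeps the base it saw first; a real tool
        # would need a better rule than that.
        most_common_base = Counter(column.values()).most_common(1)[0][0]
        consensus.append(most_common_base)
        for name, base in column.items():
            if base != most_common_base:
                # Record (position, consensus base, this donor's base).
                differences[name].append((pos, most_common_base, base))
    return "".join(consensus), differences

# Hypothetical, already-aligned snippets from three anonymous donors.
donors = {
    "donor_a": "ACGTACGT",
    "donor_b": "ACGTACGA",
    "donor_c": "ACCTACGT",
}
consensus, differences = consensus_and_differences(donors)
print(consensus)
print(differences)
```

Here two of the three donors agree at every position, so the consensus follows the majority and each odd one out is recorded as a difference at that position.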
That was the first “Human Genome”. In the years that followed, DNA sequencing became much cheaper and faster as we improved how we do it and designed machines to do it. What took billions of dollars and years to do at the time can now be done in a day for a few hundred dollars.
Because it’s cheap and easy, we’ve sequenced millions of human genomes. There are teams that have done so, and made big databases that describe every tiny difference between each one they sequenced and the one that’s the reference. They’ve combined that information with the donor’s medical records so that they can use statistics to look for genes that cause diseases.
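As a rough picture of what those databases do, the sketch below lists every position where a person differs from the reference, then counts how many people with and without a disease carry each difference. Everything in it is invented for illustration (the sequences, the disease flags, the function names), and real studies use proper alignment and statistical tests rather than raw counts.

```python
def differences_from_reference(reference, genome):
    """Return the set of (position, reference base, this person's base)
    wherever the genome differs from the reference. Assumes equal length."""
    return {
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, genome))
        if ref != alt
    }

def carrier_counts(reference, people):
    """people: list of (genome, has_disease) pairs.
    Returns difference -> [carriers with disease, carriers without]."""
    counts = {}
    for genome, has_disease in people:
        for variant in differences_from_reference(reference, genome):
            tally = counts.setdefault(variant, [0, 0])
            tally[0 if has_disease else 1] += 1
    return counts

# Hypothetical reference and four hypothetical donors with a yes/no
# disease flag taken from their (equally hypothetical) medical records.
reference = "ACGTACGT"
people = [
    ("ACGTACGA", True),
    ("ACGTACGA", True),
    ("ACGTACGT", False),
    ("ACCTACGT", False),
]
counts = carrier_counts(reference, people)
for variant, (with_disease, without_disease) in sorted(counts.items()):
    print(variant, with_disease, without_disease)
```

A difference that shows up mostly in people with the disease is a candidate worth a closer look, though with only four made-up people it proves nothing; that is why the real databases need so many genomes.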
The answer you get from NCBI is that the reference genome is “pooled” from a number of men. What this means in ELI5 terms is that they mixed DNA from a lot of people together before they sequenced it. Once they mixed them together, it wouldn’t be possible to figure out which gene sequence came from which guy.
As for who these men are: back in the day it was a bit of a sport in the genetics field to accuse anyone and everyone of being such an egomaniac as to put himself in the pool. I wouldn’t believe any claim that anyone specific is in the pool, but it’s definitely not Albert Einstein or anyone famous – it’s probably a bunch of genetics guys.
But hey if you’re in the database now, you can pull up the 1000 Genomes browser and see what the most common sequence is for people from other ethnicities!
I think you are misunderstanding what it means to map a genome.
To keep it in the realm of eli5, let’s pretend that there are 1 million pairs of chromosomes.
Pair number 15 identifies whether you have blue eyes or brown. It doesn’t matter which chromosome the individual has; it still maps the same.
My DNA may have a blue chromosome and your DNA may have a brown chromosome in the 15th pair, however we are both members of the same map.
So you seem to be hung up on which exact chromosome each individual person has. Which exact one you have isn’t important. What’s important is that it’s mapped correctly and we know which ones it’s possible for you to have and what they mean. That’s what it means to map our genome.
Imagine a puzzle: the old “500 pieces in a box” type.
The mapped genome is the shape of all the cuts of the pieces. It took a long time to trace all those cuts, but we mapped it.
Think of each person as a different picture that could be stamped onto that puzzle. It could be any picture; there are an infinite number to choose from.
We’ve always seen the “pictures” of people, but now that we’ve mapped the “shape” of the puzzle pieces, we can map out each part of each person specifically!
Genome = Puzzle Shapes
Every Person on the planet = A different picture we can put on the puzzle