Eli5: If the amino acid sequence of two organisms is similar, will their DNA be also the same?




In: Biology

Could you specify to which amino acid sequence you’re referring? Otherwise it seems like you’re asking “If all the letters and spacing in two books are the same, will their words also be the same?”

Yes. Actually, the information flow is the other way around. The amino acid sequence is similar *because* the DNA is similar.

There’s a direct translation table between DNA and amino acid sequences (the genetic code, often drawn as a codon wheel or “codon sun”). While different DNA sequences can still code for the same protein (like there are multiple binary files, just in different formats, that will produce the same image), if a whole sequence is similar it’s very likely that it evolved from the same ancestral sequence, and therefore shares similar DNA.
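To make the translation table idea concrete, here is a minimal Python sketch. It assumes the standard genetic code written on the DNA coding strand (T instead of U), and the two short sequences are made up for illustration. It shows two DNA sequences that differ at several positions but still translate to the same amino acids.

```python
from itertools import product

# Standard genetic code on the DNA coding strand (T instead of U).
# Codons are enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Two made-up sequences (not real genes) that use different synonymous codons.
seq_a = "ATGCTGCATTAA"
seq_b = "ATGTTACACTAG"
mismatches = sum(a != b for a, b in zip(seq_a, seq_b))

print(translate(seq_a), translate(seq_b), mismatches)  # prints: MLH MLH 4
```

So 4 of the 12 DNA letters differ, yet both sequences give methionine, leucine, histidine.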

Short Answer

They will probably be close but do not have to be identical. It will kind of depend on the context of how you are comparing them.

Long answer

If you are comparing the DNA sequence between 2 humans for the same AA sequence, they will probably be almost if not completely identical. This is because the vast majority of DNA is very similar between humans; the parts that make up the differences in looks are less than 1% of our DNA. The rest of the DNA is not identical, but it is all very close. If you were to pick an AA sequence that is the same in humans and in chimps, then the DNA would probably still be very similar but probably not identical. Even less complex animals such as fruit flies would probably have pretty similar DNA to humans for the same AA sequence. But once you start getting to things like yeast, the DNA can, but does not have to, be pretty different for the same AA sequence. This is because as life evolves it picks up mutations: some of these are bad and kill off the organism, some are good and make it stronger, and some change the DNA but do not change the amino acid sequence that the DNA makes.

Why does this happen? Because of codons. There are 4 nucleotides (bases) that make up DNA, A, C, G and T, and there are 20 amino acids that make up all of our proteins (there are more, but that is complicated and rarely talked about). If we want 4 bases to be able to code for 20 amino acids, we have to use groups of 3 bases, since groups of 2 would only give 16 combinations. A group of 3 is called a codon. So 3 bases, let’s say CAT, give you one amino acid, in this case histidine. If you do the math, 4 bases in 3 positions gives a total of 64 possible combinations. Since there are only 20 different amino acids, plus 3 stop codons (TAA, TAG and TGA) so the cell knows when a sequence is over, most amino acids can be coded for by more than one DNA sequence. The exceptions are methionine (ATG), which also acts as the start codon, and tryptophan (TGG). For example, the amino acid leucine can be coded for by 6 different codons: TTA, TTG, CTA, CTC, CTG, or CTT. Therefore every time an L shows up in your amino acid sequence, any of these 6 options could be used. But like I said before, the closer together things are on the evolutionary scale, the more similar their DNA sequence will probably be for the same amino acid sequence. You and your mom probably have the same DNA sequence, you and your dog maybe not, and you and the bacteria in your stomach are even less likely to be the same.
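If you want to check the codon math yourself, here is a rough Python sketch, again assuming the standard genetic code. The helper name `count_encodings` is just something I made up for this example. It counts how many codons each amino acid has and how many different DNA sequences could code for the same short protein.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Standard genetic code on the DNA coding strand; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group the codons by the amino acid (or stop signal) they encode.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

total_codons = len(CODON_TABLE)              # 4 ** 3 = 64
leucine_codons = sorted(codons_for["L"])     # the six leucine codons
stop_codons = sorted(codons_for["*"])        # TAA, TAG, TGA
single_codon = sorted(aa for aa, cs in codons_for.items() if len(cs) == 1)


def count_encodings(protein):
    """Number of distinct DNA sequences that translate to `protein`."""
    return prod(len(codons_for[aa]) for aa in protein)


print(total_codons, leucine_codons, stop_codons, single_codon)
print(count_encodings("MLH"))  # 1 * 6 * 2 = 12
```

Even a tiny 3 amino acid stretch like M-L-H has 12 possible DNA spellings, and the number multiplies quickly as the protein gets longer.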

Even longer answer because biology is complicated

So how does your body know that CAT in your DNA is supposed to be histidine in your amino acid sequence? Because of tRNA. This is a special molecule that is used in converting DNA to AA. On one side it has nucleic acids that will recognize the DNA code, and on the other side it has whatever amino acid lines up with that nucleic acid sequence. (If you have learned about mRNA, I am intentionally skipping over that because it just adds in Us to replace Ts, and that makes everything more confusing without adding anything to the question.) Now, making these tRNAs takes energy, and like I said before there can be up to 6 different nucleic acid sequences that give the same amino acid, and you need special enzymes to make each of those 6 different tRNAs, so that takes even more energy. Cells don’t like to waste energy, so they tend to end up preferring one or two of the options for the nucleic acid sequence. This is called codon bias, and the codon bias is generally conserved within a species. For example, in humans, if you want to put an L in your amino acid sequence, then 41% of the time (not exactly, but close enough without having to get into more complicated ideas) it will be coded with CTG. But if you want to code an L in yeast, then CTG is suddenly only used 11% of the time, and instead it will probably be TTA (29% of the time) or TTG (28% of the time).
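Codon bias is just a frequency count, so here is a small Python sketch of how you might measure it. The function name `codon_usage` and the toy sequence are made up for illustration; real percentages like the ones above come from counting across many genes in a genome, not one short string.

```python
from collections import Counter
from itertools import product

# Standard genetic code on the DNA coding strand; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_usage(dna, amino_acid):
    """Fraction of each codon used for `amino_acid` in an in-frame sequence."""
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
    counts = Counter(c for c in codons if CODON_TABLE[c] == amino_acid)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


# Toy sequence (not a real gene): ATG CTG CTG TTA CTG CTC TAA
toy_gene = "ATGCTGCTGTTACTGCTCTAA"
usage = codon_usage(toy_gene, "L")
print(usage)  # CTG is used for 3 of the 5 leucines, TTA and CTC for 1 each
```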

This can be important for different reasons. One is used in a lot of these 23andMe-type ancestry things. Because historically people did not travel much, mutations tended to stay within the same general geographic areas. This is why people in different parts of the world tend to share looks with locals but look different from people a long way away. For some random sequence of amino acids that has an L in it, that L might be coded with TTA in people with European ancestry but with CTG in people with Asian ancestry. So when the company tests your DNA, they can guess where your ancestors came from based on this mutation that happened way back when. If you take lots and lots of these mutations, you can get a somewhat accurate guess as to where someone’s ancestors are from based on the specific mix of mutations they have in their DNA.

A second reason that codon bias is important is in the biotech setting. Often scientists want to study some gene that exists in humans, but we can’t just start messing with humans to see what bad things happen when we mutate things (unfortunately we have done fucked up things in the past, but we shouldn’t). So we will often take the DNA sequence for the thing we want to study from a human and put it into bacteria or yeast. But, like I said before, the codons that bacteria and yeast prefer to use are different from the codons that humans prefer to use, and this can cause problems. So when you move the DNA from humans to yeast, you change all of the L codons from the ones humans prefer to the ones yeast prefers, so that your yeast do not run out of the tRNAs they rarely use and usually would not waste energy making.
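The codon swapping step can be sketched in a few lines of Python. This is only a toy version under loud assumptions: the function name `recode` is made up, the sequence is not a real gene, and the preference map only covers leucine, using TTA because that is one of the yeast-favored codons mentioned above. Real codon optimization tools use full usage tables and worry about a lot more than this.

```python
from itertools import product

# Standard genetic code on the DNA coding strand; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def amino_acids(dna):
    """Amino acid letters for every codon in frame, stop codons shown as '*'."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def recode(dna, preferred):
    """Swap each codon for the host's preferred synonymous codon, if one is listed.

    Assumes len(dna) is a multiple of 3; amino acids missing from `preferred`
    keep their original codon.
    """
    out = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


# Hypothetical, incomplete preference map: leucine only.
YEAST_PREFERRED = {"L": "TTA"}

human_like = "ATGCTGCATCTCTAA"  # toy sequence: M L H L stop
yeast_like = recode(human_like, YEAST_PREFERRED)

print(yeast_like)                                          # ATGTTACATTTATAA
print(amino_acids(human_like) == amino_acids(yeast_like))  # True
```

The DNA changes but the protein does not, which is the whole point: the yeast gets the same instructions spelled with the codons it has plenty of tRNA for.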

This was a ton of information and I do not know how much of it you knew before, so feel free to ask any follow-up questions you have. I do this stuff for a living, so I can probably answer or at least point you in the right direction.