So we can sequence DNA, and we can compare the sequences (or the amino acids they code for) against other strands, but how do we know a match says anything about how similar the things they represent actually are?
I’m a programmer, so the best analogy I can give is binary, I guess. Just because a bunch of 0s and 1s in one part of a machine matches a bunch in another part doesn’t mean the two behave the same way within the “thing” each one is a part of.
How do we know enough about what DNA does to say, with some level of certainty, “this organism is similar to that one”?
Let’s stick with programming concepts.
DNA isn’t just a sequence of A/T/C/G with no organization, just like binary isn’t just a sequence of 0/1. First, nucleotides in DNA are read in groups of 3 called codons, comparable to how bits are read in groups of 8 to form a byte. The mapping from codons to amino acids is almost entirely the same across organisms, like different computers decoding bytes to text through the same standard (e.g. UTF-8). On top of that, protein-coding genes are somewhat like computer files or file formats: they have sequences that mark their start, end, and intermediate breaks, and even something like metadata. Those markers are consistent within major branches of life, and are handled in different but conceptually similar ways between branches.
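To make the byte analogy concrete, here is a rough Python sketch of "decoding" DNA with the standard codon table. The `translate` helper is made up for illustration, and it assumes an uppercase A/C/G/T string read from the first ATG to the first stop codon. Real genes are messier than this (the intermediate breaks mentioned above are ignored entirely).

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry lookup table: codon (3 letters) -> amino acid (1 letter).
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Hypothetical helper: read codons from the first ATG until a stop codon."""
    start = dna.find("ATG")  # ATG is the usual start codon
    if start == -1:
        return ""
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon: end of the "file"
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two different DNA strings that decode to the same short peptide.
print(translate("ATGGCCTTTTAA"))  # MAF
print(translate("ATGGCATTCTGA"))  # MAF
```

Both calls print `MAF`, because several codons map to the same amino acid. That is one reason comparisons are often done on the protein sequence rather than on the raw DNA.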
Lastly, when you start researching individual genes, you’ll generally find similar genes in other organisms. The proteins they encode are the tools of the cell and ultimately represent how *things get done*. The more similar these are between organisms, the more similar those organisms tend to be. If it quacks like a duck and all that.
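As a toy illustration of "more similar proteins", here is a minimal Python sketch that scores percent identity between two protein sequences. It assumes the sequences are already aligned to the same length, and the three sequences are invented for the example, not real genes. Real comparison tools also work out the alignment itself, which this skips.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Share of positions where two pre-aligned sequences have the same letter."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        return 0.0
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Made-up amino acid sequences, one letter per amino acid.
reference = "MAFKLGTW"
close_relative = "MAFRLGTW"    # differs at one position
distant_relative = "MPWKQHTS"  # differs at five positions

print(percent_identity(reference, close_relative))    # 87.5
print(percent_identity(reference, distant_relative))  # 37.5
```

Repeat that across many genes and you get a picture of which organisms share more of their "tools" with each other.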