How do we know DNA sequences make life similar to other life?

So we can sequence DNA, and we can compare the sequences to find matches with other strands, but how do we know a match means anything about the similarity of what the DNA actually represents?

I’m a programmer, so the best analogy I can give is binary, I guess. Just because a bunch of 0s and 1s match between one part of a machine and another doesn’t mean they behave the same way within the “thing” each is a part of.

How do we know what our DNA does with enough certainty to say “this organism is similar to that one”?

6 Answers

Anonymous

ELI5: There are ways in which the same DNA sequence can do different things in different animals, but they all depend on other bits of DNA.

Knowing exactly what a stretch of DNA does from its sequence alone is basically not possible at this time. Right now, the only way to take a genome and say “this is what kind of organism this genetic sequence would grow up into” is to compare it to the genomes of known organisms and make an educated guess based on the similarity.
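Since you're a programmer, here is a rough sketch of that "compare and guess" step. It assumes the sequences are already aligned and the same length, which real tools have to work hard to achieve; the species names and sequences are made up for illustration.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def closest_reference(query: str, references: dict[str, str]) -> str:
    """Name of the reference sequence most similar to the query."""
    return max(references, key=lambda name: percent_identity(query, references[name]))


# Hypothetical toy data, not real genomic sequences.
references = {
    "species_a": "ATGGCCATTGTA",
    "species_b": "ATGGCGCTTGAA",
}
query = "ATGGCCATTGAA"

if __name__ == "__main__":
    for name, seq in references.items():
        print(name, round(percent_identity(query, seq), 1))
    print("closest:", closest_reference(query, references))
```

The "educated guess" is just the best-scoring reference; how much to trust it depends on how high the score is and how much of the genome it covers.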

That said, we do know a lot about how different biological systems work, so we have a lot of structured ways of figuring out what we’re looking at in genomic data. We can take an entire chromosome’s sequence and pick out which regions of it are actually genes, and then usually identify the ones that look like genes known, in other species, to be involved in major, important control functions.
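One very simplified way to "pick out which regions are genes" is to scan for open reading frames: a start codon (ATG) followed, in the same reading frame, by a stop codon. The sketch below only looks at the forward strand and ignores introns and everything else real gene finders handle, so treat it as an illustration of the idea rather than how annotation pipelines actually work. The function name and `min_codons` parameter are my own.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) slices of open reading frames on the forward strand.

    An ORF here runs from an ATG to the first in-frame stop codon (inclusive).
    min_codons counts the codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    dna = "CCATGGCCATTGTATAAGG"  # made-up sequence
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])
```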

Everything that follows ranges from the ELI15 to the ELI50:

Every three bases in a coding sequence code for a single amino acid. How does that happen? When the ribosome reads the mRNA transcript, it can only keep going if a transfer RNA (tRNA) “matches” the next codon and adds its specific amino acid onto the growing protein chain. A tRNA works because it folds into a shape that can interact with the ribosome while doing two other things: base-pairing with a specific codon on the mRNA (not the DNA itself), and carrying a specific amino acid.
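In programmer terms, the tRNA pool is a lookup table keyed by whatever base-pairs with each codon. Here is a toy sketch of that matching step. It assumes strict base pairing (real tRNAs allow some "wobble" at the third position) and uses a tiny pool of four tRNAs chosen just for this example.

```python
# RNA base pairing: A pairs with U, G pairs with C.
COMPLEMENT = str.maketrans("AUGC", "UACG")

# Stop codons have no matching tRNA; translation ends there.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Toy tRNA pool: anticodon (written 5'->3') -> amino acid it carries.
TRNA_POOL = {"CAU": "Met", "GGC": "Ala", "AAU": "Ile", "UAC": "Val"}


def anticodon_for(codon: str) -> str:
    """Anticodon (5'->3') that pairs with an mRNA codon under strict pairing."""
    return codon.translate(COMPLEMENT)[::-1]


def translate(mrna: str) -> list[str]:
    """Read an mRNA three bases at a time, asking the tRNA pool for each codon."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        amino_acid = TRNA_POOL.get(anticodon_for(codon))
        if amino_acid is None:
            raise LookupError(f"no tRNA in the toy pool matches codon {codon}")
        chain.append(amino_acid)
    return chain


if __name__ == "__main__":
    print(translate("AUGGCCAUUGUAUAA"))
```

The point of the design is that the "meaning" of a codon lives in the pool, not in the codon itself: swap the pool and the same mRNA reads differently.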

So the DNA that codes for an organism’s set of tRNAs (together with the enzymes that load each tRNA with its amino acid) is what determines how that organism reads the coding regions of its own DNA. One code is standard and nearly universal, but there [are alternatives](https://en.wikipedia.org/wiki/List_of_genetic_codes).
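This is the closest thing to your binary analogy: the same bytes decoded with a different table. The sketch below builds the standard code and the vertebrate mitochondrial code (which differs at four codons) and translates one made-up sequence with both. The `translate` helper is my own minimal version and stops at the first stop codon.

```python
BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order; "*" marks a stop.
STANDARD_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD_CODE = dict(zip(CODONS, STANDARD_AMINO_ACIDS))

# Vertebrate mitochondrial code: same table with four codons reassigned.
VERTEBRATE_MITO_CODE = {
    **STANDARD_CODE,
    "AGA": "*",  # stop instead of arginine
    "AGG": "*",  # stop instead of arginine
    "ATA": "M",  # methionine instead of isoleucine
    "TGA": "W",  # tryptophan instead of stop
}


def translate(dna: str, code: dict[str, str]) -> str:
    """Translate a coding-strand DNA sequence until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = code[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGATATGAAGA"  # made-up sequence
    print("standard:     ", translate(dna, STANDARD_CODE))
    print("mitochondrial:", translate(dna, VERTEBRATE_MITO_CODE))
```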

Okay, so once you have a coding structure, you have to specify which regions of the DNA are actually meant for coding proteins. To do that, there are certain upstream (and sometimes downstream) structures and patterns that recruit proteins like RNA polymerase, which transcribes DNA into an mRNA transcript. Various DNA patterns do this, and a lot of them are called [promoters](https://en.wikipedia.org/wiki/Promoter_(genetics)) because they promote the activity of a gene, usually a nearby one.

What makes a promoter DNA sequence able to serve as a promoter? It’s able to serve as a promoter because of the existence of proteins called transcription factors (TFs) that are able to bind to the promoter sequence. TFs also bind to other TFs, or to RNA polymerase. By being able to bind together, they form active little clumps of protein that encourage RNA polymerase to attach to the gene and start transcribing.

Once a TF gene is expressed, the TF protein attaches to the promoters of other genes and ultimately recruits RNA polymerase to transcribe them. Some of those other genes may encode TFs themselves, which means that you can get complex cascades of TFs that set each other off, one by one, while also switching on any genes that share the promoters involved. These TF cascades can activate whole biological programs involving many genes; they can be useful precisely *because* they act as centralized control points that evolution can tweak to produce useful new functions.
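A TF cascade is basically a directed graph, where an edge means "this TF switches on that gene". Here is a minimal sketch that walks such a graph breadth-first from one trigger. The gene names and the network are invented, and real regulation also involves repression, thresholds, and timing that this ignores.

```python
from collections import deque


def run_cascade(network: dict[str, list[str]], trigger: str) -> list[str]:
    """Return genes in the order they get switched on, starting from one TF."""
    activated = [trigger]
    seen = {trigger}
    queue = deque([trigger])
    while queue:
        gene = queue.popleft()
        # Genes that are not TFs have no entry in the network and activate nothing.
        for target in network.get(gene, []):
            if target not in seen:
                seen.add(target)
                activated.append(target)
                queue.append(target)
    return activated


# Hypothetical network: each TF maps to the genes whose promoters it binds.
network = {
    "tf_master": ["tf_relay", "enzyme_1"],
    "tf_relay": ["enzyme_2", "structural_1"],
}

if __name__ == "__main__":
    print(run_cascade(network, "tf_master"))
```

Switching on `tf_master` reaches every gene in the program, which is why a single control point like that is such a convenient handle.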

So the DNA that codes for a protein doesn’t (usually) determine for itself when it gets expressed. Two species can share the exact same DNA sequence for a protein and still express that protein in wildly different contexts. It’s still DNA controlling all of that, though: the promoters, and the TF cascades encoded in the pairing between promoter sequences and TF sequences. It’s these *other control sequences* that control gene expression. Still DNA, just a different type.
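To tie this back to the original question, here is a toy model of two species with an identical coding sequence but different promoters. Everything in it is hypothetical: the TF names, the motifs they recognize, and the rule that a gene is expressed whenever any TF present in the cell finds its motif in the promoter.

```python
# Hypothetical TFs and the promoter motifs they recognize.
TF_MOTIFS = {"tf_liver": "GATTACA", "tf_neuron": "CCGTTGG"}


def is_expressed(promoter: str, tfs_present: set[str]) -> bool:
    """True if any TF present in the cell can bind somewhere in the promoter."""
    return any(TF_MOTIFS[tf] in promoter for tf in tfs_present if tf in TF_MOTIFS)


CODING = "ATGGCCATTGTATAA"  # identical protein-coding sequence in both species
species_a = {"promoter": "TTGATTACATT", "coding": CODING}
species_b = {"promoter": "AACCGTTGGAA", "coding": CODING}

if __name__ == "__main__":
    for cell, tfs in [("liver cell", {"tf_liver"}), ("neuron", {"tf_neuron"})]:
        for name, gene in [("species_a", species_a), ("species_b", species_b)]:
            print(cell, name, is_expressed(gene["promoter"], tfs))
```

Comparing only the `coding` field would call these genes identical, while their behavior differs entirely because of the `promoter` field.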

Lastly, there’s a bunch of [post-transcription control](https://en.wikipedia.org/wiki/Post-transcriptional_regulation) that takes place at the RNA level. RNA transcripts can interact with and bind to one another in ways that can silence each other’s expression, modify the sequence of a transcript, and a lot more.
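As a rough picture of one RNA binding another, the sketch below looks for places where a small RNA could base-pair with a transcript, which is the basic idea behind RNA silencing. It assumes perfect complementarity over the whole small RNA; real small RNAs often pair only partially, and both sequences here are made up.

```python
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna: str) -> str:
    """Sequence (5'->3') that would base-pair with the given RNA."""
    return rna.translate(COMPLEMENT)[::-1]


def binding_sites(transcript: str, small_rna: str) -> list[int]:
    """Start positions in the transcript where the small RNA could pair perfectly."""
    target = reverse_complement(small_rna)
    sites = []
    position = transcript.find(target)
    while position != -1:
        sites.append(position)
        position = transcript.find(target, position + 1)
    return sites


if __name__ == "__main__":
    transcript = "AAGCUAGGGCUA"  # made-up transcript
    print(binding_sites(transcript, "UAGC"))
```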