To answer this, we first need to decide how we’re defining data.
DNA is built from four bases (adenine, thymine, guanine, and cytosine), and they always pair up in a particular way: adenine pairs with thymine, and cytosine pairs with guanine.
This means there are four cases to consider: A-T, T-A, C-G, and G-C. No other combinations are possible.
To distinguish four base pair combinations, we need (at minimum) two bits per base pair. `00` could represent A-T, `01` could represent T-A, and so on.
There are eight bits to a byte, so we can encode four DNA base pairs into a single byte.
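As a rough illustration of that packing, here is a minimal Python sketch. The `CODES` table and the `pack` function are hypothetical names made up for this example, and only the `00` and `01` assignments come from the text above; the codes for C-G and G-C are an arbitrary choice.

```python
# Hypothetical 2-bit codes. Only 00 = A-T and 01 = T-A come from the answer;
# the C-G and G-C assignments are arbitrary.
CODES = {"AT": 0b00, "TA": 0b01, "CG": 0b10, "GC": 0b11}


def pack(pairs):
    """Pack a list of base pairs into bytes, four pairs per byte."""
    out = bytearray()
    for i in range(0, len(pairs), 4):
        chunk = pairs[i:i + 4]
        byte = 0
        for pair in chunk:
            # Shift the bits gathered so far and append the next 2-bit code.
            byte = (byte << 2) | CODES[pair]
        # Left-align a final partial chunk so every byte has the same layout.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


packed = pack(["AT", "TA", "CG", "GC"])
print(len(packed), packed.hex())  # 1 1b
```

Four base pairs end up in one byte (`00 01 10 11` is `0x1b`), which is the 4-to-1 ratio used in the calculation.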
The diploid human genome (both copies of every chromosome) contains about 6×10^9 base pairs. Given that, we can calculate the total size of the human genome to be
>6×10^9 base pairs/diploid genome × 1 byte/4 base pairs = 1.5×10^9 bytes,
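or roughly 1.5 GB. Since sex cells each contain half the genome, a sperm cell holds about 750 MB. Multiplying by the number of sperm cells shows that an average ejaculation contains something on the order of 135,000 TB, a figure that corresponds to roughly 180 million sperm cells at 750 MB each.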
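The arithmetic can be checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the sperm count of 180 million is an assumption backed out of the 135,000 TB figure (135,000 TB divided by 750 MB), not a number stated in the answer. Decimal units are used throughout (1 GB = 10^9 bytes, 1 TB = 10^12 bytes).

```python
BASE_PAIRS_DIPLOID = 6_000_000_000  # about 6×10^9 base pairs, from the answer
BITS_PER_PAIR = 2                   # two bits distinguish four combinations
BITS_PER_BYTE = 8

# Assumption: implied by 135,000 TB / 750 MB, not stated in the answer.
SPERM_PER_EJACULATION = 180_000_000

diploid_bytes = BASE_PAIRS_DIPLOID * BITS_PER_PAIR // BITS_PER_BYTE
haploid_bytes = diploid_bytes // 2  # a sex cell carries half the genome
total_bytes = haploid_bytes * SPERM_PER_EJACULATION

print(diploid_bytes / 10**9, "GB per diploid genome")  # 1.5
print(haploid_bytes / 10**6, "MB per sperm cell")      # 750.0
print(total_bytes // 10**12, "TB per ejaculation")     # 135000
```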