## Sunday, November 25, 2012

### The 234-bit gene that turns an ape into a man

You must have wondered why some of us are human while others are just apes. As I learned from an article that was sent to me by Peter F., all the difference may boil down to 117 base pairs on the 20th chromosome.
Study: Single Gene, Plus Some "Junk" DNA Turned Ape Ancestors Into Modern Man (Daily Tech)
Recall that all the information needed to create and run an organism is digitally stored in the DNA molecule, a sequence of base pairs. Each base pair is either AT or TA or CG or GC (the first and second letter correspond to the 1st strand and the 2nd strand of the double helix and they're locally distinguishable). Because you have 4 possibilities, the base pair carries roughly 2 bits of information.
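
The two-bit counting can be made concrete with a minimal Python sketch. Since the base on the first strand fixes its partner on the second one (A pairs with T, C with G), one letter per base pair is enough. The particular bit assignment and the helper name `pack` are assumptions made purely for illustration; nothing in the DNA or in the article prescribes them.

```python
import math

# Hypothetical 2-bit code for the base on the first strand; the partner on
# the second strand is implied (A-T, T-A, C-G, G-C).
ENCODING = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack(strand: str) -> tuple[int, int]:
    """Pack a strand into an integer, 2 bits per base pair.

    Returns the packed value and the number of bits used.
    """
    value = 0
    for base in strand:
        value = (value << 2) | ENCODING[base]
    return value, 2 * len(strand)


strand = "GATTACA"
value, nbits = pack(strand)
print(format(value, f"0{nbits}b"), nbits)  # 7 base pairs -> 14 bits

# The number of possible sequences grows exponentially with the length,
# while the information is only its base-2 logarithm.
possibilities = 4 ** len(strand)
print(possibilities, math.log2(possibilities))
```

The last two lines show the distinction between the number of possible sequences (4 to the power of the length) and the information they carry (twice the length, in bits).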

The DNA sequence is divided into chromosomes. Humans have 23 pairs of chromosomes; the other great apes have 24 pairs. In total, they carry a few gigabytes of genetic information (3.08 billion base pairs or 6.16 billion bits), not far from an operating system. It's somewhat hard to believe that our not having one of the chromosomes is what makes us – apparently and only in some cases – superior relative to the apes. There has to be a "positive difference" that favors us, too.
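
As a quick sanity check of these figures, here is a short Python sketch that uses only the numbers quoted above (3.08 billion base pairs, 2 bits each). The decimal convention of 10^9 bytes per gigabyte is an assumption; with it, the total comes out at about 0.77 GB, i.e. a few gigabits rather than several gigabytes.

```python
BASE_PAIRS = 3_080_000_000   # figure quoted in the post
BITS_PER_BASE_PAIR = 2       # four possible base pairs -> 2 bits

total_bits = BASE_PAIRS * BITS_PER_BASE_PAIR
total_bytes = total_bits // 8
gigabytes = total_bytes / 1e9  # assumes decimal gigabytes (10^9 bytes)

print(f"{total_bits:,} bits = {total_bytes:,} bytes = {gigabytes:.2f} GB")
```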

The researchers have previously looked for active genes – shorter sequences of base pairs that play some role (not just junk DNA) – and they have found essentially one solution: the MIR 941-1 gene (it also produces identically named microRNA molecules in all our eukaryotic cells). The authors of the article in Nature Communications
Evolution of the human-specific microRNA miR-941 (Hai Yang Hu and 12 co-authors)
have used some now common methods to determine when and where this inherently human gene appeared for the first time. They found out it appeared between 6 million and 1 million years before Christ (or the present, for that matter: note that because of this parenthesis, this blog entry will become outdated in as little as one million years).

It's fun to repost the whole abstract. I hope that some other readers will, just like your humble correspondent, misunderstand some of this genetic jargon (and its content):
MicroRNA-mediated gene regulation is important in many physiological processes. Here we explore the roles of a microRNA, miR-941, in human evolution. We find that miR-941 emerged de novo in the human lineage, between six and one million years ago, from an evolutionarily volatile tandem repeat sequence. Its copy-number remains polymorphic in humans and shows a trend for decreasing copy-number with migration out of Africa. Emergence of miR-941 was accompanied by accelerated loss of miR-941-binding sites, presumably to escape regulation. We further show that miR-941 is highly expressed in pluripotent cells, repressed upon differentiation and preferentially targets genes in hedgehog- and insulin-signalling pathways, thus suggesting roles in cellular differentiation. Human-specific effects of miR-941 regulation are detectable in the brain and affect genes involved in neurotransmitter signalling. Taken together, these results implicate miR-941 in human evolution, and provide an example of rapid regulatory evolution in the human lineage.
To be brief, our monkey ancestors became human when they escaped regulation and Africa. The individuals who want to bring regulation back to human society want to reverse the progress and make us apes again.

Ms Tereza Fajksová won the global Miss Earth 2012 (Biomiss 2012) contest promoting environmental awareness. The young woman, shown with some biomakeup, has been the hero of the local Filipinos for months, despite the fact that their local Miss Earth Philippines said that she wanted energy to be renewable while the winner's "positive" attitude to Nature means mainly that she loves fishing and mushroom hunting. ;-)

Fine, so it's probably a crucial gene that made the first humans – skillful apes.

It's irresistible to look at the servers of the University of California, Santa Cruz, where the human genome and other genomes are mapped:
Gene turning apes to humans (UCSC)
These 117 base pairs (it's not an accident that the number is a multiple of three!) are those that are most responsible for our feeling more skillful or, in many cases such as mine, at least cleverer than apes. In our way of measuring the information, it's just 234 bits.

In many respects, the digital information stored in the DNA and the digital information defining software are comparable, even when it comes to the size and how it depends on the abilities of the program or the organism. But these 117 base pairs are 234 bits or 29.25 bytes of information only. And they can make a human out of an ape? It's remarkable, isn't it?

It's like taking your code for MS-DOS and adding the following 29 bytes somewhere to the MS-DOS executables:
Behave like a userfriendly OS
I really wanted to write "Behave like a user-friendly OS and add all those nice windows, you chimp-like primitive operating system!" but I didn't have enough memory for that.
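
For readers who want to check the memory budget, here is a small Python sketch. It assumes one byte per ASCII character; the variable names and the helper `fits` are made up for this illustration.

```python
BASE_PAIRS = 117
BITS_PER_BASE_PAIR = 2

bits = BASE_PAIRS * BITS_PER_BASE_PAIR   # size of the gene in bits
byte_budget = bits / 8                   # the same size in bytes

short_command = "Behave like a userfriendly OS"
long_command = ("Behave like a user-friendly OS and add all those nice windows, "
                "you chimp-like primitive operating system!")


def fits(command: str) -> bool:
    """Check whether the ASCII-encoded command fits into the gene-sized budget."""
    return len(command.encode("ascii")) <= byte_budget


print(bits, byte_budget)                          # 234 29.25
print(len(short_command), fits(short_command))    # 29 True
print(fits(long_command))                         # False
```

Dropping the hyphen from "user-friendly" is what squeezes the command into 29 bytes.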

And this extra command would turn the MS-DOS into Windows. Isn't it cool? This represents anecdotal evidence that the abilities of an organism aren't simply monotonically increasing functions of the DNA size – even though we tend to think that it's pretty much the case for software. Instead, the "need for longer DNA codes" is just an approximate, overall condition, and sometimes very modest changes (MIR-941) and/or subtractions (missing 24th chromosome) can make a huge positive difference, too.

Nature stores the information in a much less predictable, noisier way. This "holographic" feature of the DNA code is what makes it both impenetrable for a human reader and natural. On the other hand, the impact of pieces of software is much easier to reconstruct, which is why software is less subtle and man-made. Nature doesn't care about transparency and comprehensibility, which is why it can afford obscure DNA codes whose importance isn't immediately clear to the DNA code's readers (She always understands Herself, She knows what to do, and She doesn't have to care for others).

To be more specific and geneticist-like, the birth of a new MicroRNA modifies the expression of hundreds of other genes – and the small code change therefore has a big, nonlocal effect.

Needless to say, genetics isn't the ultimate reason why I write these comments. I am secretly talking about physical theories. Deep, natural, physical theories often have very important consequences but one must learn and think hard to figure out what these consequences are. If someone isn't patient or smart enough and/or complains that the consequences aren't immediately obvious, he is implicitly demanding that Nature be overly simplistic, man-made, and user-friendly. But Nature isn't obliged to be like that and the person who complains is de facto an ape.

And that's the memo.

1. Since this is a molecule that controls how much other genes are expressed, it may have achieved significance by producing a widespread effect in the brain which was then fine-tuned by microevolution. For example, suppose its effect was to generally increase neural growth or neural connectivity. Such a change wouldn't uniformly be beneficial, but it might be more often beneficial than harmful. So the first mutants would have generally "denser" brains, obtaining both the benefits and the harms, and then the harms can be fine-tuned away in subsequent generations.

2. ...."subtractions (missing 24th chromosome) can make a huge positive"....

Not sure if it is missing. I understand the theory is that it became fused with another chromosome and became human CHR 2.

3. People have to stop calling introns "junk DNA". More and more, the junk is found to contain regulatory genes that have important control effects and can be activated or inactivated by epigenetic factors.

4. Good point, Gordon. Alternatively, you could have said that people should start to recycle junk because it may be good for many things. ;-)

5. Lubos, you have a typo there. "AT or TA or CG or GG" should be "AT or TA or CG or GC"

6. Less hair smaller teeth smart

My twenty nine bits worth, because I like to be specific. But maybe it's like politics, where central command is the problem.

7. What a great (and cheeky) article you turned it into! :-)

8. The role of epigenetics to change how we function/behave might be hugely underestimated.

9. "In total, they carry a few gigabytes of genetic information (3.08 billion base pairs or 6.16 billion bits) . . ." That is all? I was thinking it was more like 4 raised to the power of 3 billion, the number of possible sequences. What is the relationship between information and the total number of possibilities?

10. No, Luke! The information is just 0.75 gigabytes or so. The number of possible combinations is 4^(3 billion) but the number of possible combinations isn't information. It's the "exponential" of the information. The same is true for computers. When you have a CD-ROM disk, it also holds about 750 megabytes and the number of possible things you may save on it is also about 4^(3 billion).

Yours is an elementary mistake, suggesting that you always want to add one more (wrong) exponential to look cooler, right? ;-)

11. Yes, Tim I seem to remember this as well.

12. Well, it is not being ignored, Peter. Epigenetics has been THE sexy, hot topic for several years now.

13. Yes, genes that control other genes are important. But "one gene between humans and apes" is an overstatement. There are lots of other important genetic differences. We can quantify overall genetic distance using one of a variety of schemes; it looks like this: http://i.imgur.com/IyMe8.jpg

14. ...But these 117 base pairs are 234 bits or 29.25 bytes of information only. And they can make a human out of an ape? It's remarkable, isn't it?...

Too remarkable. The same thing was claimed about the FOXP2 gene some years ago. This is one example of a change they found, that's all. All that so-called junk DNA is for the most part probably producing regulatory RNAs such as miRNAs and siRNAs, or perhaps even changing methylation and acetylation properties of DNA which are also very important in regulating gene expression.

There is a big difference between genes and software. Genes have multiple regulatory elements, up, down, and every which way, so a single gene can produce hundreds of possible proteins as a result of all that interaction. Ontologically it is a universe away from software code.

15. Ok, I think I get it. One of Shakespeare's plays contains, say, 100,000 letters in a particular sequence, which is roughly a measure of the amount of information in it. OTOH, the number of possible 100,000 letter sequences (typed by monkeys randomly pounding on a keyboard) is about 100,000 orders of magnitude greater. Or something like that.

16. Hi John H! Are you the "healthily curious" Australian person that I hope you are? :-)

17. Is it correct to determine the information potential of any entity independently of the context in which it functions? The information potential of the genome is dependent on the context of its function. Outside of the body, what information is useful? In a cell the potential interactions are huge, and that to me seems a more accurate way to determine the information potential of DNA.

18. Nice article. The 'key' has been attributed to increased "neoteny" in humans. Basically, we keep more infant-like traits into adulthood. Look at an infant chimp: a more human-like face, forward-facing genitals (female chimps get much larger and rear-facing genitals), a larger head / brain in proportion to the body. We even have much less 'power per pound' of muscle, as infants do (so as to protect them from hurting themselves early on). It's a long list. So all that gene needs to do is slow down the rate of 'change of traits' with respect to the rate of growth in size and age. Pretty much the rest is a consequence.

Oh, and we didn't lose a chromosome. Ape 2a and 2b fused into our "2". See the comparative gene maps in this article (and the articles to which it links, if needed):

http://chiefio.wordpress.com/2012/09/25/orangutan-gramps-and-chimpanzee-grammy/

Basically, the gross structure of the gene "layout" looks more like a mix of Chimp and Orangutan to me (or the Orangutan precursor that was around about 6 million years ago when humans started to split off the line). I speculate that a proto-human line began when a proto-orangutan and a proto-chimp got some 'on the wild side' and mushing the genes together 'had issues'. (This is fairly common in nature. Look up the "Triangle of U" and the various crosses of sheep / goats and various horse species. The 'species barrier' is really a 'species strong suggestion'... IIRC the sheep / goat crosses get a similar gene count change, though not a fusion. But fusions and transpositions of large chunks between chromosomes are relatively common.)

So figure on an inter-species hybrid, a chromosome fusion event (that may have changed how some genes are expressed), then 'back crossing' to the source species to 'stabilize the cross'. (Also a fairly standard technique in plant and animal breeding, especially inter-species crosses). Finally, the chromosome count stabilizes and you get a stable population that can be selected. From that point on, point mutation takes over in fine tuning things.

When did that particular gene get 'fine tuned'? In the original cross? In the back-crossing? In the gene fusion event? As it's not on 2a, 2b, or 2, probably not the fusion event. I'd guess we were a stable cross, but not too widely spread or divergent, when that point mutation raised the neoteny flag and started us on our way to big brains and world domination. (Mostly as we were smaller brained and not fully upright in earliest fossils).

Oh, and hope you are feeling well...

19. If the difference between humans and primates is 46 vs 48 chromosomes, where do Neanderthals fall? On the 46 side or on the 48 side? (They were bigger-brained than us; however, they were clearly less clever and certainly a lot hairier.)

20. Hi, they were not too much more hairy than myself, and I won't comment on cleverness. But those attributes aren't simple functions of the number of chromosomes, anyway.

I don't know the answer but I would bet it is 46 - after all, the Neanderthals are "homo".

21. You make many assumptions... the logo is part of a chromosome... relational to the creation of eternal life. Absolutely no religion is involved, however a belief in prophecies and the means the Creator communicates to everyone is (Numbers 12:6)

I have faith that the communication works, like Einstein, Edison, Tesla, Mendeleyev and Darwin and vast numbers of other creative and high achieving people did.

What have blacks got to do with it? We are all descended from them or have you never seen white skinned negroid people?