Junk DNA: Evidence of Alien Life Form?
Researchers at
Non-coding sequences are common to all living organisms on Earth, from molds to fish to humans. In human DNA they constitute almost half of the total genome, says Prof. Sam Chang, the group leader. Non-coding sequences, also known as junk DNA, were discovered years ago, and their function remained a mystery. Unlike normal genes, which carry the information that intracellular machinery uses to synthesize proteins, enzymes and other chemicals produced by our bodies, non-coding sequences are never used for any purpose. They are never expressed, meaning that the information they carry is never read, no substance is synthesized and they have no function at all. These junk genes merely enjoy the ride with hard-working active genes, passed from generation to generation. What are they? How come these idle genes are in our genome? Those were the questions many scientists posed and failed to answer, until the breakthrough discovery by Prof. Chang and his group.
Trying to understand the origins and meaning of junk DNA, Prof. Chang realized that he first needed a definition of "junk". Is junk DNA really junk, or does it contain meaningful information that is not claimed for whatever reason? He once mentioned the question to an acquaintance, Dr. Lipshutz, a young theoretical physicist turned Wall Street derivative securities specialist. "Easy," replied Lipshutz. "We'll run your sequences through the software I use to analyze market data, and it will show whether they are total garbage, 'white noise', or whether there is a message in there." This new breed of analysts with a strong background in math, physics and statistics is getting more and more popular with Wall Street firms. They sift through gigabytes of market statistics, trying to uncover useful correlations between the various market indexes and individual stocks.
Working evenings and weekends, Lipshutz managed to show that non-coding sequences are not all junk: they carry certain information. Combining the massive database of the Human Genome Project with thousands of data files developed by geneticists all over the world, Lipshutz calculated the Kolmogorov entropy of the non-coding sequences and compared it with the entropy of regular, active genes. Kolmogorov entropy, introduced by the famous Russian mathematician half a century ago, has been used successfully to quantify the level of randomness in various sequences, from time series of noise in vacuum tubes to sequences of letters in 19th century Russian poetry. By and large, the technique allows researchers to compare various sequences quantitatively and conclude which one carries more information than the other. "To my surprise, the entropy of coding and non-coding DNA sequences was not that different," continues Lipshutz. "There was noise in both, but it was no junk at all. If the market data were that orderly, I would have already retired."
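The article does not say how the entropy was computed, and Kolmogorov-style measures cannot be computed exactly in general. As a rough illustration of comparing the randomness of two sequences, here is a minimal sketch that substitutes plain Shannon entropy of the symbol frequencies. The function name and the toy sequences are assumptions made for this example, not anything from the study.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Return the Shannon entropy of `sequence` in bits per symbol.

    This is a stand-in for the entropy measure named in the article,
    not the method the researchers used.
    """
    if not sequence:
        return 0.0
    counts = Counter(sequence)
    total = len(sequence)
    # Sum p * log2(1/p) over the observed symbols.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical toy sequences, not real genomic data.
uniform = "ACGT" * 25    # all four bases equally frequent
repetitive = "A" * 100   # a single repeated base

print(shannon_entropy(uniform))     # 2.0 bits per symbol
print(shannon_entropy(repetitive))  # 0.0 bits per symbol
```

A four-letter alphabet tops out at 2 bits per symbol, so two sequences that both score close to that ceiling look equally "noisy" by this crude measure.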
After a year of cooperation with Lipshutz, Chang was convinced that there is hidden information in junk DNA. But how could one understand its meaning if the information is never used? With active sequences you try to watch the cell and see what proteins are being made using the information. This wouldn't work with dormant genes. There would be no experiment to test a hypothesis; one had to rely on the power of thought. The information had to be deciphered like Mayan or Egyptian writings. Prof. Sam Chang solicited help from three specialists in the field, but none of them managed to find a solution. There were no cultural clues and no references to other known languages; the field was too alien for the linguists.
"I asked myself: who else can decipher a hidden message?" Chang continues. "Of course, cryptographers! And I began talking with researchers at the National Security Agency. It took me a few months to make them return my calls. Were they running background checks on me? Or were they too busy lobbying senators on retaining and strengthening their authority to control exports of encryption technologies? Eventually, a junior fellow was assigned to answer my questions. He listened, requested my questions in writing and after another few months turned me down. His message was polite but meant 'Go to hell with your crazy ideas. We are a serious agency, it's National Security, dude. We are too busy.'
"Well, Sam, forget the Government, talk to the private sector. And I began approaching computer security consultants. They were genuinely interested, and a couple of them even began working on my project, but their enthusiasm always faded after a month. I kept calling them until one nice fellow told me: 'I'd love to work on your project if I had more time. I am overbooked. Emissaries of major banks and Fortune 500 companies are begging me to plug the holes in their networks. They pay me $500 an hour. I can give you an educational discount, can you afford $350?' Scraping together $15/hr for a postdoc is a big deal in academia; $350 sounded like something extraterrestrial." Eventually Prof. Chang was referred to Kharen Musaelian, a cryptographer in the former Soviet
Kharen promptly confirmed the findings of his Wall Street predecessor: the entropy indicated tons of information almost in the clear, so it was not a strong cryptosystem. It did not appear to be a tough problem. Kharen began applying differential cryptanalysis and similar standard cryptographic techniques.
He was two months into the project when he noticed that all non-coding sequences are almost always preceded by one short DNA sequence. A very similar sequence usually followed the junk. These segments, known to biologists as alu-sequences, were all over the human genome. Being non-coding junk sequences themselves, alu are among the most common genes of all.
Trained as a cryptographer and computer programmer, Kharen thought of the genetic code as computer code. Dealing with 0, 1, 2, 3 (the four bases of the genetic code) instead of the 0s and 1s of binary code was a bit of a nuisance, but computer code was what he had been analyzing and deciphering all his life. He was on familiar territory. The most common symbol in the code that causes no action, followed by a chunk of dormant code: what is that? Just playing with the analogy, Kharen grabbed the source code of one of his programs and fed it into a program that calculates the statistics of symbols and short sequences, a tool often used in decoding messages. What was the most common symbol? Of course, it was "/", the comment symbol! He took some Pascal code, and it was { and }! Of course, the code between two slashes in C is never executed, and is never meant to be executed; it is not the code, it is the comment to the code.
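A tool that tallies symbols and short sequences, as described above, might look roughly like the following. This is a minimal sketch: the function names and the tiny C-like sample are made up for illustration, and nothing here is Kharen's actual program.

```python
from collections import Counter


def symbol_counts(text: str) -> Counter:
    """Count every non-whitespace character in `text`."""
    return Counter(ch for ch in text if not ch.isspace())


def ngram_counts(text: str, size: int) -> Counter:
    """Count every run of `size` consecutive characters in `text`."""
    return Counter(text[i:i + size] for i in range(len(text) - size + 1))


# Hypothetical, heavily commented C-like fragment (illustration only).
sample = "// init\n// x = 1;\n/* old */\ny = 2; // keep\n"

print(symbol_counts(sample).most_common(1))  # [('/', 8)]
print(ngram_counts(sample, 2)["//"])         # 3
```

In a sample this heavily commented the slash wins easily; real source code would need a much larger corpus before the ranking meant anything.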
Unable to resist the temptation to play further with the analogy, Kharen began comparing the statistical distributions of the comments in computer code and in genetic code. There must be a striking difference: comments are in English, which has a different distribution of characters than C, Java or Cobol. This should show up in the statistics. Nevertheless, statistically, junk DNA was not much different from active, coding sequences. To be sure, Kharen fed a program into the analyzer: surprisingly, the statistics of code and comments were almost the same. He looked into the source code and realized why: there were very few comments in English between the slashes. It was mostly C code the author had decided to exclude from execution, a common practice among programmers.
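One simple way to put a number on "the statistics were almost the same" is to measure the distance between two character-frequency distributions. The sketch below uses total variation distance, chosen here purely for illustration since the article names no specific statistic. The function names and sample strings are hypothetical.

```python
from collections import Counter


def frequencies(text: str) -> dict[str, float]:
    """Return the relative frequency of each character in `text`."""
    if not text:
        raise ValueError("text must not be empty")
    counts = Counter(text)
    total = len(text)
    return {symbol: n / total for symbol, n in counts.items()}


def total_variation(a: str, b: str) -> float:
    """Distance between two symbol distributions: 0 = identical, 1 = disjoint."""
    fa, fb = frequencies(a), frequencies(b)
    symbols = set(fa) | set(fb)
    return 0.5 * sum(abs(fa.get(s, 0.0) - fb.get(s, 0.0)) for s in symbols)


# Hypothetical samples: live code, commented-out code, and an English comment.
live_code = "x = x + 1;"
disabled_code = "x = x + 2;"
english_comment = "increment the counter"

print(total_variation(live_code, disabled_code) <
      total_variation(live_code, english_comment))  # True
```

Commented-out code sits close to live code under this measure, while an English comment sits far away, which is the contrast the paragraph above describes.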
Whoever wrote the human genetic code was not very well organized; he was a rather sloppy programmer. At one point Kharen began thinking about the divine hand, but after analyzing the spaghetti code inside junk DNA sequences he convinced himself that whoever wrote the code was not God. It looked rather like somebody from Microsoft, but at the time the human genetic code was written there was no Microsoft on Earth.
On Earth? It was like lightning... Was the genetic code for all life on Earth written by an extraterrestrial civilization and then somehow deposited here, for execution? The idea was mad and frightening, and Kharen resisted it for a few days. Then he decided to proceed. If the non-coding sequences are parts of the program that were rejected or abandoned by the author, there is a way to make them work. The only thing one needs to do is to remove the comment symbols, and if the portion between the /* ... */ symbols is a meaningful routine, it may compile and execute. Following this line of thought, Kharen selected only those non-coding sequences that had exactly the same frequency distribution of symbols as the active genes. This procedure excluded the comments in Martian or Q, whatever it was. He selected some 200 non-coding sequences that most closely resembled real genes, stripped them of /*, //, and similar stuff, and after a few days of hesitation sent an e-mail to his American boss, asking him to find a way to put them in E. coli or whatever host and make them work.
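In programming terms, the stripping step amounts to deleting the comment markers and keeping whatever they enclosed. Here is a deliberately naive sketch for C-style text. The function name is an assumption, and the pattern ignores string literals and other corner cases that a real parser would have to handle.

```python
import re

# C-style comment markers: block open, block close, and line comment.
COMMENT_MARKERS = re.compile(r"/\*|\*/|//")


def uncomment(source: str) -> str:
    """Delete comment markers, leaving the previously disabled text in place."""
    return COMMENT_MARKERS.sub("", source)


# Hypothetical fragment with one block-commented statement.
print(repr(uncomment("x = 1; /* y = 2; */")))  # 'x = 1;  y = 2; '
```

Whether the recovered text still compiles is a separate question, which is why only fragments that already resemble working code are worth trying.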
Chang did not reply for two weeks. "I thought I was fired," confessed Dr. Musaelian. "With every day of his silence I realized more and more how crazy my idea was. Chang would conclude I was a schizophrenic and would terminate the contract. Chang finally responded and, to my surprise, he did not fire me. He had not bought my extraterrestrial theory but agreed to try to make my sequences work."
Biologists had attempted to make junk sequences express before, without much success. Sometimes nothing came out, sometimes it was junk. It was not surprising: grab an arbitrary portion of excluded computer code and try to compile it. Most likely, it will fail. At best, it will produce bizarre results. Analyze the code carefully, fish out a whole function from the comments, and you may make it work. Because of Musaelian's careful statistical analysis, 4 of the 200 sequences he selected began working, producing tiny amounts of a chemical.
"I was anxiously awaiting the response from Chang," says Dr. Musaelian. "Would it be a more or less normal protein or something out of this world?" The answer was shocking: it was a substance known to be produced by several types of leukemia in humans and animals. Surprisingly, the three other sequences also produced cancer-related chemicals. It no longer looked like a coincidence. When one awakens a viable dormant gene, it produces cancer-related proteins. Researchers began searching Human Genome Project databases for the four genes they isolated from junk DNA. Eventually, three of the four were found there, listed as active, non-junk genes. This was not a big surprise: since cancer tissues produce the protein, it's likely there is a gene somewhere that codes for it. The surprise came later: in the active, non-junk portion of the code the gene in question (the researchers called it jhlg1, for junk human leukemia gene) was not preceded by the alu sequence, i.e. the /* symbol was missing. However, the closing */ symbol at the end of jhlg1 was there. This explained why jhlg1 was not expressed in the depth of the junk DNA but worked fine in the normal, active part of the genome. Whoever wrote the human genetic code was a very sloppy programmer: he excluded portions of the code by enclosing them in /* ... */ but missed some of the opening /* symbols. His compiler seems to be garbage, too: a good compiler, even from terrestrial Microsoft, would most likely refuse to compile such a program at all.
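The bug described here, a closing */ with no matching /* before it, is easy to look for in ordinary source text. The following is a minimal sketch under the same computer-code analogy. The function name is hypothetical, and the scan ignores string literals and line comments.

```python
import re


def orphan_closers(source: str) -> list[int]:
    """Return the offsets of '*/' markers that are not preceded by an open '/*'."""
    orphans = []
    in_comment = False
    for match in re.finditer(r"/\*|\*/", source):
        if match.group() == "/*":
            in_comment = True    # C block comments do not nest
        elif in_comment:
            in_comment = False   # this '*/' closes the open comment
        else:
            orphans.append(match.start())
    return orphans


# Hypothetical fragment: one well-formed comment, then a stray closer.
print(orphan_closers("/* a */ b */"))  # [10]
```

A strict compiler would reject the stray closer outright, which is the point of the jab at the creators' tooling.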
Prof. Sam Chang and his students began searching for genes associated with various cancers, and in almost all instances they discovered that those genes are followed by the alu-sequence (i.e. the comment-closing symbol), but never preceded by the comment-opening /* sequence. "We are convinced that cancer tissues were never meant for terrestrial life. They belong to a different planet, to an alien life form. This explains why diseases result in cell damage and death, whereas cancers lead to reproduction and growth. It's an alien life that tries to grow inside the patients' bodies. Because only a few fragments of this alien genome are expressed, they never lead to coherent growth. What we get with cancer is the expression of only a few of the alien genes, which leads to a bizarre and apparently meaningless chunk of living cells, though with its own veins and arteries, and its own immune system that vigorously resists all our anti-cancer drugs.
"Our hypothesis, which we believe we will soon be able to prove, is that a higher alien life form was engaged in creating new life and planting it on various planets. Earth is just one of those planets. We do not know the motives of our creators: whether it was a scientific experiment, or a way of preparing new planets for colonization by the master race, or an ongoing business. Perhaps our creators grow us the way we grow bacteria in Petri dishes. Whatever the motive, the extraterrestrial programmers who worked on the genetic code were working on several projects. Most likely, they were writing a big program product that was supposed to produce various life forms for various planets. They were also trying various solutions. They wrote the code, executed it and did not like some functions. The big project was underway. A lot of ideas, a lot of classes, a lot of incomplete procedures. Of course, it was behind schedule. A few deadlines had already passed. Then the management began pressing for an immediate release. The programmers were ordered to cut all their plans for the future and to concentrate on the Earth project, to meet the pressing deadline. There was no time to finish the overall, multiple-planet project. So the programmers merely cut the pieces of the code that were not intended for Earth. However, at that time they were not quite certain which functions might be needed in future upgrades. Besides, they had no time to delete the lines of code. So, instead of deleting the lines they converted them into comments, and missed a few /* symbols, thus presenting mankind with the gift of cancer."
"If we were able to efficiently insert genes into the chromosomes of living humans, our breakthrough discovery would mean an instant cure for all future cancer cases. The only thing we need to do is to put the missing /* symbols before the rogue genes. Theoretically, we can do it in the laboratory, but we cannot implant the repaired DNA into living subjects. The mystery of cancer is solved, but no quick cure should be expected. The best thing we can do is to try to nurture a new, cancer-free line of humans with debugged genetic code. For us, and our children, there is no hope on the horizon."
"To be precise, we cannot show that the other life form was intended for another planet. Technically, what we see is the code from another project. This may be just a new race which is also intended for Earth. It's more comforting to think that the creators prepared it for deployment on another planet, and are not planning to replace the current version with a major upgrade. We do not want to be ver. 3.11, do we?
"We have to come to grips with the notion that every human on Earth carries the genetic code for his extraterrestrial cousin, or for a major future upgrade. It may shake our belief in our power over our own destiny, but it poses no immediate danger. There is no reason for panic. Our creators are apparently not interested in the fate of their program product, and we do not believe they would intervene after this news is announced. Judging by the fact that they have not fixed the cancer bug in the last 2 billion years, they delivered the program without technical support or dropped it long ago.
"We have already been swamped with calls and e-mails from politicians, journalists and concerned citizens. Dr. Lipshutz and I will address those concerns during our press conference, to be held at 5 p.m. on April 1 at Clark Hall of Cornell University."