Anyway, Science announced its 2007 Breakthrough of the Year in this week's issue (article written by Elizabeth Pennisi). The winner is sort of a culmination of the technological advances of the past few years coupled with a long-standing goal in the field of human genomics. It is best described as the assessment of human genetic variation.
First, let's take a step back. The sequencing of the human genome several years ago was a major achievement for human medicine, but not necessarily for the reasons that most people think. What the human genome sequence actually represents has always been a common source of confusion for the public. "What does it actually mean? We always hear that our genes make us who we are. I'm very different from any other human, so it follows that my genes must have some differences, right? So then, how can there be one single sequenced human genome?" Great questions. Let me try to clarify a little, since it will help show why this year's winner is such a major breakthrough.
There are indeed some major genetic differences between all humans. The fact remains, though, that when you look across a population (at every human being, for example), there is what we call a consensus sequence for the genes in our cells. There are approximately 3,000,000,000 base pairs in the human genome. The vast majority of those bases, however, are exactly the same from person to person. In fact, it is estimated that only about 15,000,000 base pairs differ consistently from person to person. If you think about it, that is only 0.5% of the 3,000,000,000 total bases. So we can sequence a mixture of DNA from several people (this is how it was originally done) and get a very good idea of the consensus sequence of the human genome. Additionally, it is really just the combination of those 15,000,000 differing base pairs [we call them SNPs (single nucleotide polymorphisms)] that makes us individuals. If you do the math, that leads to a possible 50,625,000,000,000,000,000,000,000,000 (i.e. 50,625 trillion-trillion) individuals with at least one differing base pair. That is just a number and not very meaningful, but you can see how the magnitude of it makes sense considering there are some 6,000,000,000 people in the world. That means that less than 0.000000000000000012% of the possible combinations of SNPs are being used today (man, I hope my numbers are right; I did a lot of rounding and simplified things by ignoring insertions/deletions and other complicating factors).
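For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the rounded figures quoted above. The variable names are mine and purely illustrative. The big number in the text works out to 15,000,000 raised to the fourth power. The last few lines use a different counting assumption that is not from the article (every variable site independently taking any one of the four bases), just to show how strongly the total depends on how you count.

```python
from fractions import Fraction
import math

# Rounded figures quoted in the text
GENOME_BP = 3_000_000_000      # base pairs in the human genome
VARIABLE_BP = 15_000_000       # base pairs that differ between people
WORLD_POPULATION = 6_000_000_000

# Share of the genome that varies, as a percentage (exact arithmetic)
variable_percent = Fraction(VARIABLE_BP, GENOME_BP) * 100
print(f"variable share: {float(variable_percent)}%")

# The quoted count of possible individuals equals 15,000,000 ** 4
quoted_combinations = VARIABLE_BP ** 4
print(f"quoted combinations: {quoted_combinations:,}")

# Percentage of those combinations "in use" by the current population
used_percent = Fraction(WORLD_POPULATION, quoted_combinations) * 100
print(f"share in use: {float(used_percent):.2e}%")

# Assumption (not from the article): each variable site can independently
# be any of 4 bases, giving 4 ** 15,000,000 combinations. That number is
# too large to print, so report only its order of magnitude.
alt_log10 = VARIABLE_BP * math.log10(4)
print(f"alternative count: roughly 10^{alt_log10:,.0f}")
```

Either way the conclusion is the same: the number of possible combinations dwarfs the number of people alive.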
To be clear, try not to put too much stock in these numbers. I am just using them to illustrate a point, which will hopefully become apparent by the end of this paragraph. The 50,625 trillion-trillion number does not actually represent the potential number of individuals, and there are two contrasting reasons for this:
- People probably need a minimum number of distinct SNPs in order to actually appear distinct. I couldn't even guess what this number is, but it would decrease that 50,625 trillion-trillion number by some significant factor.
- Our appearance also depends on how much of a given gene is expressed (i.e. the amount of the gene product that is present in our cells). That also has a genetic component, but I don't want to drag this out here. This complicating factor would increase the number of potential individuals, offsetting to some extent the reduction described in the first point.
Ok, moving on. Hopefully I have answered the first part of my own hypothetical question. The original human genome sequence was a consensus type of study that generalized most of the sequence features in our genomes. Now, in order to make the human genome useful, we need to be able to sequence the genomes of individuals. And that is exactly what Science (and I, too) dubbed the breakthrough of the year. We are finally starting to see some progress in this important area. We are finally able to look at our own DNA sequence, and because we can compare it to other people's sequences, we can learn how the differences affect our appearances (and more importantly our health!). The studies that are geared towards understanding these relationships are called genome-wide association studies, and there have been more than a dozen of them this year.
The first individual genome sequenced was that of J. Craig Venter, of the institute bearing his name (JCVI). This is no surprise, since he was the person who led the privately funded one of the two original sequencing projects in the late '90s. It took about 4 years, and was finally published this fall. In any case it was a huge accomplishment, and as sequencing technologies become cheaper, better and faster we are going to see this become commonplace. Already there are companies set up to sequence the genomes of individuals (although we should be wary of their claims until the companies have shown they are reliable). The more sequences we have from individuals, the more data we have for the genome-wide association studies, and the more we will know about how our genes affect disease and other traits. Exciting, no?
From a scientific perspective, this was a fascinating study because it actually showed that we are even more different than we ever expected. Until the Venter study, it was believed that we shared 99.9% of our DNA with all other humans. We now estimate it to be closer to 99.5%. You may think "hell, both are very high numbers so what's the diff?" Well, it means our estimate of how much we differ (0.1% versus 0.5%) was off by a factor of 5, so it just goes to show that we have a ton of work to do in this field. I think that we are finally starting to see the results from the human genome project that we were expecting all along.
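To see where the factor of 5 comes from, here is a small Python sketch. It assumes the rounded 3,000,000,000 base pair genome size used earlier in this post, and the variable names are just illustrative.

```python
from fractions import Fraction

GENOME_BP = 3_000_000_000  # rounded genome size used earlier in the post

# Fraction of DNA believed to be shared, before and after the Venter study
old_shared = Fraction("99.9") / 100
new_shared = Fraction("99.5") / 100

# Number of base pairs that differ under each estimate
old_differing = (1 - old_shared) * GENOME_BP
new_differing = (1 - new_shared) * GENOME_BP

print(f"old estimate: {int(old_differing):,} differing base pairs")
print(f"new estimate: {int(new_differing):,} differing base pairs")
print(f"ratio: {new_differing / old_differing}")
```

The shared percentages look almost identical, but the differing portion goes from roughly 3,000,000 to 15,000,000 base pairs.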