Posted: February 1, 2011
TCGA: A Future Arrived
Brad Ozenberger, Ph.D.
TCGA Program Director for the National Human Genome Research Institute (NHGRI)
The Cancer Genome Atlas (TCGA) program was launched in 2006, anticipating revolutionary advances in DNA sequencing technologies appearing on the horizon, which have since been realized. Goals that may have seemed outlandish—analyzing every major, and not so major, tumor type; building an initial atlas from characterization of hundreds of tumor cases and looking deeply into the genomes of each; investigating not hundreds of genes but entire exomes and genomes; as well as moving mRNA expression and DNA methylation analysis from microarray to digital approaches using unbiased sequencing methods—are now within reach.
To understand the scale of TCGA, it is an interesting exercise to compare its data output with first-generation sequencing from just a few years ago (see table), or with other large sequencing programs such as the 1000 Genomes effort to populate the database of human variation, or the nascent Human Microbiome Project (HMP), which aims to reveal unknown human microbial fauna and its association with health and disease.
The Human Genome Project generated tens of billions of bases (tens of gigabases) of DNA sequence over many years to produce the first human reference genome. In the last half of 2010, the data producers for 1000 Genomes generated human DNA sequence at a rate of 1,000 gigabases per month. HMP is currently generating 1,500 gigabases of DNA sequence per month. The large-scale centers, using the latest next-generation sequencing instruments, have bumped the scale up not by a factor of 10, but by a factor greater than 1,000! TCGA in late 2010 averaged 7,300 gigabases per month, severalfold greater than any other existing genome project. That number represents only the primary genomic sequencing. More sequencing data are flowing from the Genome Characterization Centers examining RNA expression. DNA methylation analysis has not yet transitioned to sequencing, but pilot projects have demonstrated its feasibility.
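For readers who want to check the comparison, the monthly rates quoted above can be put side by side in a few lines of Python. This is only an illustrative sketch: the three figures come from the text, while the names `RATES_GB_PER_MONTH`, `fold_over`, and `report` are invented here for the example.

```python
# Monthly sequencing output in gigabases, as quoted in the text for late 2010.
# The dictionary and function names are hypothetical, chosen for illustration.
RATES_GB_PER_MONTH = {
    "1000 Genomes": 1_000,
    "HMP": 1_500,
    "TCGA": 7_300,
}


def fold_over(project, baseline, rates=RATES_GB_PER_MONTH):
    """Return the ratio of one project's monthly output to another's."""
    return rates[project] / rates[baseline]


def report(rates=RATES_GB_PER_MONTH):
    """Build one line per non-TCGA project comparing TCGA's rate to it."""
    lines = []
    for name in rates:
        if name == "TCGA":
            continue
        ratio = fold_over("TCGA", name, rates)
        lines.append(f"TCGA vs {name}: {ratio:.1f}x")
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
    # prints:
    # TCGA vs 1000 Genomes: 7.3x
    # TCGA vs HMP: 4.9x
```

On these figures, TCGA's primary genomic sequencing alone ran at roughly five to seven times the monthly rate of the other two programs.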
DNA Sequence Generation
|All NHGRI Projects||12|
The immense scale of TCGA and the cutting-edge nature of the research are sometimes forgotten as TCGA outsiders, and insiders, consider the progress being achieved. It is natural to be impatient, and criticism of the pace of TCGA is not unexpected. Of course, the true value of TCGA will not be measured by how much data is produced but by how far the project has accelerated cancer research and helped cancer patients, directly and indirectly. Those metrics will be difficult to assess and will lag behind the release of TCGA datasets, but NHGRI and National Cancer Institute leadership are confident that these ultimate goals of TCGA will be realized.