There are at least 200 forms of cancer, and many more subtypes. Each of these is caused by errors in DNA that make cells grow uncontrollably. Identifying the changes in each cancer’s complete set of DNA – its genome – and understanding how such changes interact to drive the disease will lay the foundation for improving cancer prevention, early detection and treatment.
The Cancer Genome Atlas (TCGA), a collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), has generated comprehensive, multi-dimensional maps of the key genomic changes in 33 types of cancer. The TCGA dataset, 2.5 petabytes of data describing tumor tissue and matched normal tissues from more than 11,000 patients, is publicly available and has been widely used by the research community. The data have contributed to more than a thousand studies of cancer by independent researchers and to the TCGA research network publications.
TCGA created a genomic data analysis pipeline that can effectively collect, select, and analyze human tissues for genomic alterations on a very large scale. The success of this national network of research and technology teams serves as a model for future projects and exemplifies the tremendous power of teamwork in science.
Though TCGA is coming to a close in 2017, new NCI genomics initiatives, run through the NCI Center for Cancer Genomics (CCG), will continue to build upon the success of TCGA by using the same model of collaboration for large-scale genomic analysis and by making the genomics data publicly available.
Visit the Center for Cancer Genomics website for more information on the NCI’s current and future initiatives in cancer genomics.
The TCGA dataset was generated by the TCGA Research Network. Learn more about each component of the network:
Biospecimen Core Resource (BCR) – Tissue samples were carefully cataloged, processed, checked for quality and stored, complete with important medical information about the patient.
Genome Characterization Centers (GCCs) – The Genome Characterization Centers used several technologies to analyze genomic changes involved in cancer, including gene expression levels and structural rearrangements of the genome.
Genome Sequencing Centers (GSCs) – High-throughput Genome Sequencing Centers identified the changes in DNA sequences that are associated with specific types of cancer.
Data Coordinating Center (DCC) and Cancer Genomics Hub (CGHub) – The information generated by TCGA was centrally managed at the DCC and entered into the TCGA Data Portal and Cancer Genomics Hub as it became available. The data are now stored and distributed by the NCI Genomic Data Commons.
Genome Data Analysis Centers (GDACs) – GDACs integrated immense amounts of data from array and sequencing technologies across thousands of samples. These centers provided novel informatics tools to the entire research community to facilitate broader use of TCGA data.
Analysis Working Groups (AWGs) – AWGs are interdisciplinary, international groups of scientists that perform a global, integrative analysis on each TCGA tumor type. Every AWG studies a particular tumor type using all of the TCGA platforms and publishes an analysis of its findings in a peer-reviewed journal to benefit the cancer research and clinical communities.