Genome Sequencing Centers

The Cancer Genome Atlas (TCGA) Genome Sequencing Centers (GSCs) perform large-scale DNA sequencing using the latest sequencing technologies. Supported by the National Human Genome Research Institute (NHGRI) large-scale sequencing program, the GSCs generate the enormous volume of data required by TCGA, while continually improving existing technologies and methods to expand the frontier of what can be achieved in cancer genome sequencing. All sequencing data are available in the TCGA Data Portal or from the TCGA page at NIH’s database of Genotypes and Phenotypes (dbGaP).

Throughout the TCGA program, the GSCs have continued to evolve their approaches, as seen in this brief timeline:

  • October 2008: TCGA publication on the glioblastoma multiforme genome includes sequencing of 601 target genes by the polymerase chain reaction/Sanger dideoxy method. At the same time, GSCs are validating protocols using new second-generation sequencing instruments.
  • March 2009: GSCs introduce a hybrid-capture procedure and second-generation sequencing instruments (Illumina and ABI SOLiD) to enable production-scale analysis of more than 6,000 known cancer-associated target genes.
  • July 2009: GSCs submit first of 24 whole genome sequence (i.e., entire 6 billion nucleotides from both tumor and blood specimens from a cancer case) datasets from the glioblastoma multiforme and ovarian tumor projects.
  • January 2010: GSCs validate whole exome capture methods, thereby expanding analysis of each tumor sample from 6,000 genes to all protein-coding and RNA genes.

Whole Exome vs. Whole Genome

Two DNA samples from every TCGA cancer case (one from the tumor specimen and the second from either blood or non-malignant tissue) are sent from a TCGA Biospecimen Core Resource site to a GSC. The non-tumor DNA serves as a control to confirm that mutations discovered in the tumor DNA are unique to the tumor and are not normal genetic variations present throughout the individual. All samples are analyzed by whole exome sequencing using second-generation sequencing instruments. Such instruments can generate the exome data for 8 to 16 samples in a single run lasting 8 to 14 days.

Next, more than 10 percent of the samples from each TCGA tumor project undergo whole genome sequencing to reveal mutations that lie outside of the exome regions.
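
The tumor-versus-control comparison described above can be illustrated with a minimal sketch. It assumes each sample has already been reduced to a set of variant calls. The `Variant` type, the `tumor_specific` function, and the coordinates are hypothetical and for illustration only; they do not represent the GSCs' actual analysis pipelines, which involve far more than a set difference.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """A single variant call (hypothetical, simplified representation)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tumor_specific(tumor: Set[Variant], normal: Set[Variant]) -> Set[Variant]:
    """Keep only variants seen in the tumor DNA but not in the matched control."""
    return tumor - normal


# Made-up example calls; positions and alleles are not real data.
tumor_calls = {
    Variant("chr7", 100, "A", "T"),   # also present in the control
    Variant("chr17", 200, "G", "A"),  # present in the tumor only
}
normal_calls = {
    Variant("chr7", 100, "A", "T"),
}

somatic_candidates = tumor_specific(tumor_calls, normal_calls)
print(somatic_candidates)
```

In this toy example, the variant shared by both samples is treated as normal variation within the individual, and only the call found solely in the tumor remains as a candidate tumor-specific mutation.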

NHGRI awarded funding to three centers as part of its Large-Scale Sequencing Research Network. These three GSCs are:

  • Broad Institute Sequencing Platform, Broad Institute, Cambridge, Mass.
    Principal Investigator: Eric Lander, Ph.D.
  • Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas
    Principal Investigator: Richard Gibbs, Ph.D.
  • The Genome Institute at Washington University, Washington University School of Medicine, St. Louis, Mo.
    Principal Investigator: Richard Wilson, Ph.D.