PERSPECTIVES

Posted: April 2, 2014

TCGA and Its Vital Role in Understanding How Germline Variation Informs the Landscape of Somatic Alterations in Cancer


Dr. Stephen Chanock, M.D., Director of the Division of Cancer Epidemiology & Genetics at the NCI

The initial excitement surrounding the publication of a series of marker papers from the NCI/NHGRI Cancer Genome Atlas (TCGA) has underscored the complexity of cancer. These studies have reported on the breadth and extent of somatic alterations in distinct types of cancers, revealing a wide spectrum of genetic alterations as well as uncovering a diverse set of somatic events that could drive cancer development. These observations are of a scope greater than what we initially envisioned when this project began in 2006. As TCGA has transitioned to a larger and bolder program, now focused on at least 30 distinct cancer types, it has become a driving force for the advancement of sequencing technologies, informatics, and standards to solve the many challenges of large-scale cancer genomics, from tissue acquisition to genomic characterization to analysis. Consequently, it should have pleiotropic effects in cancer genomics and thus contribute to the discovery and characterization of the landscape of alterations that eventually will be the foundation for effective therapeutic approaches. The molecular epidemiology community, including NCI’s intramural Division of Cancer Epidemiology and Genetics (DCEG), has been energized by the findings of TCGA and now collectively faces the task of re-interpreting older studies as well as designing new ones that fully account for the new cancer taxonomy (e.g., molecular subtypes) emerging from TCGA.

The scope of alterations in genes, both drivers of cancer and ‘passengers’, arising due to the disruption of one or more key pathways, is orders of magnitude larger and more complex than what Boveri first proposed in 1914. It is somewhat surprising that the majority of events are observed infrequently; in other words, there are few recurrent mutations that appear to drive a particular cancer. Instead, it appears that there are different genetic ‘routes’ to cancer, perhaps through select common pathways and mechanisms. New scientific insights have already emerged, underscoring the role of epigenetic events, alternative splicing and chromosomal rearrangements. We can anticipate further insights from the next generation of sophisticated analyses, including pathway-based analyses and the integration of different types of events assayed on different platforms.

The findings of TCGA have generated a new set of questions, particularly in molecular epidemiology, that will require further studies. Since TCGA has stringent tumor and normal tissue requirements, sample ascertainment resulted in important biases, some of which have limited our capacity to investigate in TCGA how germline and environmental factors influence different patterns of somatic alterations. How the germline informs our understanding of specific somatic alterations could have important implications not only for developing new approaches to prevention and early intervention but also for precision medicine. For instance, tobacco use, including its type, intensity and duration, leads to distinct patterns of somatic changes, ones that could be targeted specifically. A major focus for the future will be to better understand the genetic and epidemiologic factors that contribute to the risk for distinct molecular subtypes of cancer, some of which were first observed in TCGA analyses. For instance, breast, ovarian and endometrial cancers share risk factors, and to a degree some of their somatic profiles overlap, but new studies are needed to interrogate the differences and similarities.

In TCGA, it is notable that tumors with evidence for strong environmental factors (e.g., smoking for lung cancer and ultraviolet light exposure for melanoma) exhibit more mutations than tumors with no clear extrinsic agent (e.g., chronic lymphocytic leukemia). This observation points to the need to collect more refined exposure data for the next rounds of studies that comprehensively assess genomic alterations to determine how the germline informs somatic profiles. Accordingly, major clinical trials like ALChEMIST (Adjuvant Lung Cancer Enrichment Marker Identification and Sequencing Trial, conducted in lung adenocarcinoma) have agreed to include a brief epidemiological questionnaire. In this way, etiologic and outcome heterogeneity due to genes can be more fully investigated in the near future. Similarly, a TCGA-like comprehensive assessment of genomic alterations could be useful for developing effective prognostic markers for outcome and possibly, well in the future, for establishing precision medicine.

Already, researchers in DCEG regularly use TCGA data for a wide range of investigations, including replication studies of genetic and epigenetic findings, methods development for integrative genomic analysis, assessment of the functional potential of signals from genome-wide association studies, and evaluation of large-scale chromosomal abnormalities. The ability to explore these questions with TCGA gene expression and methylation data is a critical asset, as the data are integrated with other somatic alterations and germline genetic variation. SNP genotyping intensity data from TCGA allow us to compare algorithms for detecting large-scale somatic abnormalities, such as mosaic gains, losses and copy-neutral events, and to compare those changes across tissue sample types. In data from almost 1,800 samples from glioblastoma, ovarian cancer and lung squamous cell carcinoma, we detected large-scale mosaic abnormalities in blood-derived normal samples. New methods are being developed to carefully analyze germline exome and whole-genome sequencing data to identify the scope and density of smaller mosaic events. In turn, these studies enable the investigation of how the genome may become unstable with age and contribute to carcinogenesis.

Now near the conclusion of the data generation phase, TCGA has provided a strong foundation for the discovery and characterization of key biological alterations in the cancer genome. Further discovery of the drivers of cancer is required to generate the catalogue of mutations, some fraction of which could be the target of current or future therapies. This tremendous resource—a catalogue of driver and passenger mutations—enables us to interrogate how germline variants contribute to cancer risk and outcomes, and how they may predict response and toxicity to cancer therapies.