Data Sharing and Data Management

The Cancer Genome Atlas Project (TCGA) is yielding an unprecedented amount of genomic information on participant samples. The informatics component of TCGA involves developing the best ways to collect, store, and distribute the clinical and genomic data generated by the project. All information, except lower-level sequencing data, is available at the TCGA Data Portal.

The TCGA Data Coordinating Center (DCC) is tasked with:

  • protecting patient privacy and confidentiality by providing secure access to datasets classified as controlled access
  • developing data standards and controlled vocabularies
  • establishing informatics pipelines that move data from production centers to a central repository for data access

The TCGA Data Portal stores much of the data generated by TCGA. Lower-level sequence data are stored only at the Cancer Genomics Hub (CGHub). Within the Data Portal, most data are publicly accessible without restriction; however, access to some lower-level data requires user certification.

View more information on TCGA's Data Access Tiers.