Responsible Use of Data Generated by the TCGA Program
Data Release Policy
The primary purpose of The Cancer Genome Atlas (TCGA), as described on the program’s website (cancergenome.nih.gov) and in published commentary on the program,¹ is to generate and publish a comprehensive catalog of the genomic changes found in all cancer types that affect the U.S. populace. The National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI) have identified TCGA as a “community resource project”: a research project specifically devised and implemented to create a set of data, reagents or other material whose primary utility will be as a resource for the broad scientific community. As such, TCGA follows a policy of releasing data as quickly as possible, prior to publication, anticipating that these data will be useful to many investigators. TCGA expects its data to be of high value in a number of research areas and to be used in many ways. These include, but are not limited to, the development of new analytical methods, the identification of the genomic etiology of individual tumor types and subtypes, and the development of new experimental diagnostic, therapeutic and preventive approaches and strategies for cancer. Thus, TCGA recognizes that the data should be available to all investigators for any bona fide biomedical research purpose. To access data sets that pose a potential risk to the privacy of research participants,² investigators are required to demonstrate their qualifications and describe their intended research uses, as described in The Cancer Genome Atlas Program Human Subjects Protection and Data Access Policies.
Responsible Use of TCGA Data
The recommendations from the Fort Lauderdale meeting on best practices and principles for sharing large-scale genomic data address the roles and responsibilities of data producers, data users and funders of community resource projects. The aim of the recommendations is to establish and maintain an appropriate balance between the interests that data users have in rapid access to data and the needs that data producers have to publish and receive recognition for their work. The conclusion of the attendees at the Fort Lauderdale meeting was that a “responsible use” approach for secondary data users would be sufficient to ensure that the efforts of data producers will be recognized. “Responsible use” was defined as allowing the data producers to have the opportunity to publish the initial global analyses of the data, as specifically articulated at the outset of the project, within a reasonable period of time. TCGA requests that data users abide by this principle, as further articulated below.
TCGA has considered the risks and the advantages of open release of genomic sequence data from the Cancer Cell Line Encyclopedia (CCLE) project. TCGA has decided that genomic sequence and variation data sets generated from the CCLE will be made available to researchers without requiring Data Use Certification. The data are available at CGHub. Read more about TCGA’s analysis and discussion here.
¹ Collins FS and Barker AD: Mapping the cancer genome. Pinpointing the genes involved in cancer will help chart a new course across the complex landscape of human malignancies. Sci Am. 2007 Mar;296(3):50-7. PMID: 17348159.
² Lowrance WW and Collins FS: Identifiability in genomic research. Science. 2007 Aug 3;317(5838):600-2. PMID: 17673640.