2012 Project B: Statistical Models to Assess the Clinical Importance of Rare Genetic Variants in Cancer
January 24, 2012
Dr. Paul C. Boutros, Principal Investigator at Informatics & Biocomputing, Ontario Institute for Cancer Research
Dr. Ben Neel, Professor at Department of Medical Biophysics, University of Toronto, and Senior Scientist at Ontario Cancer Institute
Dr. Bradly Wouters, Professor at Department of Medical Biophysics, University of Toronto, and Senior Scientist at Ontario Cancer Institute
Ontario Institute for Cancer Research, Toronto
Current cancer drug development typically targets oncogenes and their signalling pathways, based on the premise that tumours become “addicted” to these genes/pathways. Due to these mutations, tumours can also become unusually dependent on pathways that are not directly affected by mutation; we refer to this concept as emergent synthetic lethality. Recent evidence indicates that drugs exploiting synthetic lethality can be highly potent and selective; e.g., PARP inhibitors kill cancer cells with mutations in the BRCA1/2 genes. Yet there has been no systematic assessment of synthetic lethality relationships in relevant human tumour models.
The Ontario Research Fund has recently funded us to undertake a large, systematic study of this question. In particular, we are sequencing the entire genomes of 200 xenografts – these are primary human cancers (ovarian, colon, lung, pancreas) that have been implanted into immunodeficient mice. For each tumour a diverse array of molecular data is being collected, including exome sequencing, transcriptome sequencing, and genome‐wide copy‐number variation. Two major challenges in analyzing these data will be: 1) the presence of contaminating sequence (noise) from the host mouse genome and 2) the large number of rare‐frequency events. The intern will focus on these two topics.
The intern can be involved in both of these sub‐projects (contaminating noise and rare‐frequency events), but will focus on one of them. To overcome the presence of contaminating noise, we will employ a variety of deconvolution techniques. For example, one approach would be a competitive alignment process, attaching a likelihood to each data‐point and using these likelihoods to determine the weight of evidence for a specific mutation.
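As a rough illustration of the likelihood‐weighting idea above, the following Python sketch converts per‐read alignment log‐likelihoods against the human and mouse references into a posterior probability that the read is human, then sums those probabilities as weighted evidence for a candidate mutation. It is a minimal sketch under stated assumptions, not the project's method: the function names, the input tuple layout, and the 50/50 prior are all hypothetical, and a real pipeline would obtain the log‐likelihoods from an aligner.

```python
import math


def human_posterior(loglik_human, loglik_mouse, prior_human=0.5):
    """Posterior probability that a read is human rather than mouse.

    Inputs are log-likelihoods of the read under each reference
    (hypothetical values; in practice they would come from an aligner).
    """
    log_h = loglik_human + math.log(prior_human)
    log_m = loglik_mouse + math.log(1.0 - prior_human)
    top = max(log_h, log_m)  # subtract the maximum for numerical stability
    w_h = math.exp(log_h - top)
    w_m = math.exp(log_m - top)
    return w_h / (w_h + w_m)


def weighted_variant_evidence(reads, prior_human=0.5):
    """Return (weighted variant-supporting depth, weighted total depth).

    `reads` is an iterable of (loglik_human, loglik_mouse, supports_variant)
    tuples covering one candidate site (an assumed layout).
    """
    alt = 0.0
    total = 0.0
    for loglik_human, loglik_mouse, supports_variant in reads:
        weight = human_posterior(loglik_human, loglik_mouse, prior_human)
        total += weight
        if supports_variant:
            alt += weight
    return alt, total


# Toy example: one clearly human read, one ambiguous read, one clearly mouse read.
example_reads = [
    (-2.0, -20.0, True),
    (-2.0, -2.0, False),
    (-30.0, -1.0, True),
]
alt_weight, total_weight = weighted_variant_evidence(example_reads)
```

In this toy example the mouse‐like read contributes almost nothing to the variant evidence, even though it nominally supports the variant, which is the intended effect of weighting each data‐point by its likelihood.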
Tackling the rare‐frequency event problem may require Bayesian techniques, along with data reduction. In particular, if we hypothesize that there is allelic heterogeneity (e.g. as for BRCA1), then individual loci will have insufficient power, so nested modeling will be required to collapse across regions or functional groupings. Our proposed plan is to aggregate the rare variants in each protein‐coding gene, as well as in each non‐coding genomic region, to improve the statistical power of detection. For each type of variation, we will implement methods to assess its functional impact on the protein (e.g. as a score based on typical indicators such as evolutionary conservation of amino‐acid residues). We will use these scores, along with each individual's genomic arrangements, as inputs for genome‐wide association (GWA) analyses. For the GWA analyses we will use statistical signal processing methods to pinpoint and refine the impact of rare events amid the noise created by common variants. We propose to investigate methods such as recursive least squares filtering, principal component analysis and latent‐variable dimensionality reduction algorithms. Finally, we will improve our detection capabilities by incorporating information from pathways known to contribute to prostate cancer. This results in a network‐based approach to identify the collective contribution of multiple biomarkers in cancer. These are exciting challenges in modern biostatistics, and will provide the intern with exposure to cutting‐edge techniques.
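The aggregation step described above can be sketched as a simple gene‐level burden score: rare variants within one gene are collapsed into a single per‐sample value, each variant weighted by a functional‐impact score, and the resulting burden is compared between two groups with a permutation test. This is only an illustrative sketch; the function names, the allele‐frequency cutoff of 0.01, the binary group labels, and the choice of a permutation test are assumptions made for the example and are not specified by the project.

```python
import random


def gene_burden(genotypes, allele_freqs, impact_scores, freq_cutoff=0.01):
    """Collapse the rare variants of one gene into a per-sample burden score.

    genotypes: one list per sample of alternate-allele counts (0, 1 or 2),
               in the same variant order as allele_freqs and impact_scores.
    """
    rare = [j for j, freq in enumerate(allele_freqs) if freq < freq_cutoff]
    return [sum(impact_scores[j] * sample[j] for j in rare) for sample in genotypes]


def permutation_pvalue(burden, labels, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean burden.

    labels: 1/0 group membership per sample (hypothetical phenotype);
            both groups must be non-empty.
    """
    rng = random.Random(seed)

    def mean_difference(current_labels):
        group_a = [b for b, lab in zip(burden, current_labels) if lab]
        group_b = [b for b, lab in zip(burden, current_labels) if not lab]
        return abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

    observed = mean_difference(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_difference(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: three variants in one gene, two samples; the second variant is common.
toy_burden = gene_burden(
    genotypes=[[1, 0, 2], [0, 1, 0]],
    allele_freqs=[0.001, 0.2, 0.005],
    impact_scores=[0.5, 1.0, 0.25],
)

# Toy comparison of precomputed burden scores between two groups of four samples.
toy_pvalue = permutation_pvalue(
    burden=[1.0, 1.2, 0.9, 1.1, 0.0, 0.1, 0.0, 0.2],
    labels=[1, 1, 1, 1, 0, 0, 0, 0],
)
```

Collapsing in this way trades per‐locus resolution for power: no single rare variant needs to recur often enough to be tested on its own, which is the motivation given above for aggregating within genes or regions.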
The intern will be treated as a key member of the ORF/GL2 research team. They will participate in the weekly team meetings, and interact on a regular basis with clinicians, molecular researchers, bioinformaticians, and biostatisticians. Physically, they will be located within the Boutros Lab at OICR, which is a team of biologists, statisticians, engineers, and computer scientists working on techniques for biomarker discovery. They will be able to interact with Dr. Boutros on a regular basis, and will be expected to attend the weekly lab meetings, institutional seminars and journal clubs. OICR has a strong computing facility, including a computing cluster with more than 5000 cores and an svn repository for code management. Students will be treated as full members of the lab, and will have the opportunity to present their results in lab meetings and, if appropriate, to draft their own manuscripts for publication.