Welcome to Xie Lab

Short Biography

Yang Xie received a PhD in Biostatistics from the University of Minnesota in 2006 and an MD from Peking University Health Science Center in 2000. She is an Associate Professor in the Department of Clinical Sciences and the Department of Bioinformatics at the UT Southwestern Medical Center. Her research focuses on algorithm development, machine learning, and data integration for biomedical research. She is the founding director of the Quantitative Biomedical Research Center and the Pediatric Cancer Data Commons (PCDC) at UT Southwestern Medical Center. She is a member of the NIH Biodata Management and Analysis Study Section [BDMA].

Education
  • Ph.D. in Biostatistics, 2006

    University of Minnesota-Twin Cities

  • M.S. in Biostatistics, 2002

    University of Minnesota-Twin Cities

  • M.S. in Epidemiology, 2000

    Peking Union Medical College

  • M.D., 1997

    Peking University Health Science Center


Research Summary

Biomarker Discovery and Clinical Outcome Prediction

We have developed computational models to predict patient outcomes, which allow clinicians to tailor treatment plans for individual patients ...

Methods for High-dimension Data and Integrative Analysis

We have developed bioinformatics tools, computational algorithms, and statistical methodologies for the processing and analysis of high-dimensional ...

Statistical Learning and Prediction Model

My lab has extensive experience in developing predictive models for biomedical research. Our team won highly competitive international computational challenges ...

RNA Regulation

Over the past couple of decades, a surge of discoveries has revealed RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages. ...

Research Interests
  • Biomarker discovery and validation

  • Genomic data analysis and data integration

  • Medical informatics

  • Clinical trial design

  • Lung cancer

Selected Publications

The whole publication list is here.

Predicting the future for people with lung cancer

Nature Medicine. 2008 Aug;14(8):812-3. PubMed PMID: 18685594; PubMed Central PMCID: PMC2833359.
Xie Y, Minna JD. Predicting the future for people with lung cancer.

A lung cancer molecular prognostic test ready for prime time

Lancet, 379, 785-787.
Xie, Y., and Minna, J. D.

A 12-Gene Set Predicts Survival Benefits from Adjuvant Chemotherapy in Non-Small-Cell Lung Cancer Patients

Clin Cancer Res; January 28, 2013; doi:10.1158/1078-0432.CCR-12-2321. (*Corresponding Author)
Tang H, Xiao G, Behrens C, Schiller J, Allen J, Chow CW, Suraokar M, Corvalan A, Mao JH, White M, Wistuba II, Minna J, Xie, Y.

Try Out Our Software

We have developed online analysis tools that allow users to explore and analyze gene expression data related to lung cancer and germ cell tumors. PIPECLIP Galaxy is also provided for biologists to identify the most likely cross-linking sites.

Research

I am the director of QBRC (Quantitative Biomedical Research Center). Here, multiple research labs carry out interdisciplinary biological research and provide online tools and software packages for the research community. Our team is interested in developing computational models to predict patient outcomes, which will allow clinicians to tailor treatment plans for individual patients. We have developed gene expression signatures to predict patient prognosis and response to chemotherapy.

Research Projects


  • Cancer biomarkers

    We have developed gene expression signatures to predict patient prognosis and response to chemotherapy.

    We are interested in developing computational models to predict patient outcomes, which will allow clinicians to tailor treatment plans for individual patients. Using an innovative computational and systems biology approach, we previously identified a set of 12 genes that predicts which patients are most likely to benefit from post-surgery chemotherapy (Patent #UTSD2627). Working together with investigators at both UT Southwestern Medical Center and MD Anderson Cancer Center, we are developing a Clinical Laboratory Improvement Amendments (CLIA)-certifiable medical device and designing a prospective clinical study to translate our 12-gene predictive signature to clinical use. Related publications:

    • Xie Y, Minna JD. Non-small-cell lung cancer mRNA expression signature predicting response to adjuvant chemotherapy. J Clin Oncol. 2010 Oct 10;28(29):4404-7. PMID: 20823415
    • Xie, Y., Xiao, G., Coombes, K. R., Behrens, C., Solis, L. M., Raso, G., Girard, L., Erickson, H. S., Roth, J., Heymach, J. V., Moran, C., Danenberg, K., Minna, J. D., and Wistuba, II. (2011) Robust gene expression signature from formalin-fixed paraffin-embedded samples predicts prognosis of non-small-cell lung cancer patients, Clinical Cancer Research 17, 5705-5714.
    • Tang, H., Xiao, G., Behrens, C., Schiller, J., Allen, J., Chow, C. W., Suraokar, M., Corvalan, A., Mao, J., White, M. A., Wistuba, II, Minna, J. D., and Xie, Y*. (2013) A 12-gene set predicts survival benefits from adjuvant chemotherapy in non-small cell lung cancer patients, Clin Cancer Res 19, 1577-1586.
    • Tang, H., Sebti, S., Titone, R., Zhou, Y., Isidoro, C., Ross, T., Hibshoosh, H., Xiao, G., Packer, M., Xie, Y.*, and Levine, B. Decreased BECN1 mRNA Expression in Human Breast Cancer is Associated with Estrogen Receptor-Negative Subtypes and Poor Prognosis, EBioMedicine, 2015, 2(3), 255–263
    • Tang H, Wang S, Xiao G, Schiller J, Papadimitrakopoulou V, Minna J, Wistuba II, Xie Y. Comprehensive evaluation of published gene expression prognostic signatures for biomarker-based lung cancer clinical studies. Annals of Oncology. 2017 Apr 1;28(4):733-740.
  • Statistical methods for high-dimensional data and integrative analysis

    Statistical methodologies for spatial modeling and integrative analysis of different molecular profiling datasets

    We are actively developing new bioinformatics tools and computational algorithms for big data, such as genome-wide RNAi screening data and next-generation sequencing data. We are also developing statistical methodologies for spatial modeling and integrative analysis of different molecular profiling datasets.

    • Xie Y*, Wang X, Story M. Statistical Methods of Background Correction for Illumina BeadArray. Bioinformatics, 2009, Feb 4. PMID: 19193732 PMCID: PMC2654805
    • Xie Y*, JK, Pan W, Xiao G, Khodursky A. A Bayesian Approach to Joint Modeling of Protein-DNA Binding, Gene Expression and Sequence Data. Stat Med. 2010 Feb 20;29(4):489-503. PMID: 20049751.
    • Zhong R, Kim M, White M, Xie Y, Xiao G*, Spatial Background Noise Correction for High-Throughput RNAi Screening, Bioinformatics, 2013 Sep 1;29(17):2218-20
    • Zhong R, Kim J, Kim H, Kim M, Lum L, Levine L, Xiao G, White M, Xie Y*. Computational Detection and Suppression of Sequence-specific Off-target Phenotypes from Whole Genome RNAi Screens. 2014, Nucleic Acids Research. Jul;42(13):8214-22.
  • Statistical learning and prediction model

    My lab has extensive experience in developing predictive models for biomedical research.

    Our team won the highly competitive 2012 NCI-DREAM Drug Sensitivity Prediction Challenge (Bansal et al., Nature Biotechnology, 2014) and the 2013 NIEHS-NCATS-UNC DREAM Toxicogenetics Challenge (Eduati et al., Nature Biotechnology, 2015), and co-won the 2014 DREAM-Broad Institute Gene Essentiality Challenge.

    • Xiao G, Ma S, Minna J, Xie Y*, Adaptive prediction model in prospective molecular-signature-based clinical studies, Clinical Cancer Research, 2014 Feb 1;20(3):531-9. PMID:24323903.
    • Bansal M, Yang J, Karan C, Menden MP, Costello JC, Tang H, Xiao G, Li Y, Allen J, Zhong R, Chen B, Kim M, Wang T, Heiser L, Realubit R, Mattioli M, Alvarez M, Shen Y, NCI-DREAM community, Gallahan D, Singer D, Saez-Rodriguez J, Xie Y*, Stolovitzky G*, Califano A*, Predicting activity of drug combinations through crowdsourcing, Nature Biotechnology, 2014 Dec;32(12):1213-22.
    • Eduati, F.#, Mangravite, L.#, Wang, T.#, Tang, H.#, Bare, C., Huang, R., Norman, T., Kellen, M., Menden, M., Yang, J., Zhan, X., Zhong, R., Xiao, G., Xia, M., the NIEHS-NCATS-UNC DREAM Toxicogenetics Collaboration, Friend, S., Dearry, A., Simeonov, A., Tice, R., Rusyn, I., Wright, F., Stolovitzky, G., Xie, Y.*, and Saez-Rodriguez, J.* Opportunities and limitations in the prediction of population responses to toxic compounds assessed through a collaborative competition, Nature Biotechnology 2015, 33, 933–940
  • RNA regulation

    We developed several statistical models to analyze such datasets.

    My research interest in gene regulation started 10 years ago when ChIP-chip technology was invented to study genome-wide transcription factor binding sites, and I developed several statistical models to analyze such datasets. Motivated by several joint projects in post-transcriptional RNA regulation, I developed a strong interest in understanding the role of RNA-binding proteins (RBPs) in RNA regulation. To better analyze genome-wide CLIP data, my lab has developed several bioinformatics tools and analysis algorithms for CLIP-seq data.

    • Han T, Kato M, Xie S, Wu L, Mirzaei H, Pei J, Chen M, Xie Y, Allen J, Xiao G, McKnight S. Cell-free Formation of RNA Granules: Bound RNAs Identify Features and Components of Cellular Assemblies. Cell 11 May 2012 (Vol. 149, Issue 4, pp. 768-779)
    • Kwon I, Xiang S, Kato M, Wu L, Theodoropoulos P, Wang T, Kim J, Yun J, Xie Y, McKnight SL, Poly-dipeptides encoded by the C9ORF72 repeats bind nucleoli, impede RNA biogenesis, and kill cells, Science 2014, Published online 31 July 2014 [DOI:10.1126/science.1254917]
    • Chu Y, Wang T, Dodd D, Xie Y, Janowski BA, Corey DR. Intramolecular circularization increases efficiency of RNA sequencing and enables CLIP-Seq of nuclear RNA from human cells. Nucleic Acids Res. 2015 Mar 26. pii: gkv213. [Epub ahead of print] PMID: 25813040
    • Sei E, Wang T, Hunter OV, Xie Y*, Conrad NK*. HITS-CLIP analysis uncovers a link between the Kaposi's sarcoma-associated herpesvirus ORF57 protein and host pre-mRNA metabolism. PLoS Pathog. 2015 Feb 24;11(2):e1004652. doi: 10.1371/journal.ppat.1004652. eCollection 2015 Feb
    • Chen B, Yun J, Kim MS, Mendell JT, Xie Y*. PIPE-CLIP: a comprehensive online tool for CLIP-seq data analysis. Genome Biology. 2014 Jan 22;15(1):R18. PMID: 24451213.
    • Wang T, Xie Y and Xiao G*, dCLIP: a computational approach for comparative CLIP-seq analyses, Genome Biology, 2014, 15:R11 doi:10.1186/gb-2014-15-1-r11
    • Wang T, Xiao G, Chu Y, Zhang MQ, Corey DR, Xie Y*. Design and bioinformatics analysis of genome-wide CLIP experiments. Nucleic Acids Res. 2015 May 9. pii: gkv439. [Epub ahead of print]
    • Wang T, Chen B, Kim M, Xie Y, Xiao G. A model-based approach to identify binding sites in CLIP-Seq data. PLoS ONE. 2014 Apr 8;9(4):e93248. doi: 10.1371/journal.pone.0093248. eCollection 2014. PMID: 24714572

Publications

Comprehensive Computational Pathological Image Analysis Predicts Lung Cancer Prognosis

Luo, X., Zang, X., Yang, L., Huang, J., Liang, F., Rodriguez, C.J., Wistuba, II, Gazdar, A., Xie, Y, Xiao, G.
March 2017 Journal of Thoracic Oncology, 12:3, 501-509

Abstract

Introduction

Pathological examination of histopathological slides is a routine clinical procedure for lung cancer diagnosis and prognosis. Although the classification of lung cancer has been updated to become more specific, only a small subset of the total morphological features are taken into consideration. The vast majority of the detailed morphological features of tumor tissues, particularly tumor cells’ surrounding microenvironment, are not fully analyzed. The heterogeneity of tumor cells and close interactions between tumor cells and their microenvironments are closely related to tumor development and progression. The goal of this study is to develop morphological feature–based prediction models for the prognosis of patients with lung cancer.

Method

We developed objective and quantitative computational approaches to analyze the morphological features of pathological images for patients with NSCLC. Tissue pathological images were analyzed for 523 patients with adenocarcinoma (ADC) and 511 patients with squamous cell carcinoma (SCC) from The Cancer Genome Atlas lung cancer cohorts. The features extracted from the pathological images were used to develop statistical models that predict patients’ survival outcomes in ADC and SCC, respectively.

Results

We extracted 943 morphological features from pathological images of hematoxylin and eosin–stained tissue and identified morphological features that are significantly associated with prognosis in ADC and SCC, respectively. Statistical models based on these extracted features stratified NSCLC patients into high-risk and low-risk groups. The models were developed from training sets and validated in independent testing sets: a predicted high-risk group versus a predicted low-risk group (for patients with ADC: hazard ratio = 2.34, 95% confidence interval: 1.12–4.91, p = 0.024; for patients with SCC: hazard ratio = 2.22, 95% confidence interval: 1.15–4.27, p = 0.017) after adjustment for age, sex, smoking status, and pathologic tumor stage.

Conclusions

The results suggest that the quantitative morphological features of tumor pathological images predict prognosis in patients with lung cancer.
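
As a reading aid, the hazard ratios and 95% confidence intervals quoted in the Results above can be checked against the reported p-values. The sketch below is not taken from the paper: it assumes the intervals are symmetric Wald intervals on the log hazard ratio scale and that the p-values come from a two-sided normal test, which is a common convention for Cox models but is not stated in the abstract. The function name wald_p_from_hr is hypothetical.

```python
import math

def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover se(log HR) from a 95% CI, then return (se, z, two-sided p).

    Assumes a Wald interval: log(HR) +/- z_crit * se.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of a standard normal variable
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values as quoted in the abstract (ADC and SCC testing sets)
for label, hr, lo, hi in [("ADC", 2.34, 1.12, 4.91), ("SCC", 2.22, 1.15, 4.27)]:
    se, z, p = wald_p_from_hr(hr, lo, hi)
    print(f"{label}: se(log HR)={se:.3f}  z={z:.2f}  p={p:.3f}")
```

Under these assumptions the recovered p-values agree with the reported 0.024 and 0.017 to three decimal places, which suggests the quoted intervals and p-values are internally consistent.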

Finding RNA-Protein Interaction Sites Using HMMs

Wang, T., Yun, J., Xie, Y, Xiao, G.
February 2017 Methods Mol Biol 1552:177-184. PMID: 28224499

Abstract

RNA-binding proteins play important roles in the various stages of RNA maturation through binding to its target RNAs. Cross-linking immunoprecipitation coupled with high-throughput sequencing (CLIP-Seq) has made it possible to identify the targeting sites of RNA-binding proteins in various cell culture systems and tissue types on a genome-wide scale. Several Hidden Markov model-based (HMM) approaches have been suggested to identify protein–RNA binding sites from CLIP-Seq datasets. In this chapter, we describe how HMM can be applied to analyze CLIP-Seq datasets, including the bioinformatics preprocessing steps to extract count information from the sequencing data before HMM and the downstream analysis steps following peak-calling.
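
To make the idea of HMM-based peak calling concrete, here is a minimal sketch of Viterbi decoding for a two-state HMM over binned read counts. It is an illustration only and not the model described in the chapter: the Poisson emission rates, the self-transition probability, the uniform starting distribution, and the function names are all assumptions chosen for the example.

```python
import math

def poisson_logpmf(k, lam):
    """Log probability of observing count k under a Poisson(lam) model."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def viterbi_two_state(counts, lam_bg=1.0, lam_peak=8.0, p_stay=0.9):
    """Most likely state path: 0 = background, 1 = enriched (binding site).

    counts must be a non-empty list of non-negative integers (reads per bin).
    """
    lams = (lam_bg, lam_peak)
    log_stay, log_switch = math.log(p_stay), math.log(1.0 - p_stay)
    # Uniform initial state distribution (an assumption of this sketch)
    score = [math.log(0.5) + poisson_logpmf(counts[0], lam) for lam in lams]
    back = []  # back[t][s] = best previous state for state s at step t + 1
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[prev] + (log_stay if prev == s else log_switch)
                     for prev in (0, 1)]
            best_prev = 0 if cands[0] >= cands[1] else 1
            pointers.append(best_prev)
            new_score.append(cands[best_prev] + poisson_logpmf(k, lams[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

counts = [0, 1, 0, 9, 12, 7, 1, 0]  # toy per-bin read counts
print(viterbi_two_state(counts))
```

On this toy input the three high-count bins are labelled as enriched and the rest as background. A real analysis would estimate the emission and transition parameters from the data rather than fix them by hand.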

Automatic extraction of cell nuclei from H&E-stained histopathological images

Yi, F., Huang, J., Yang, L., Xie, Y, Xiao, G.
Apr 2017 J Med Imaging 4(2):027502. PMID: 28653017

Abstract

Extraction of cell nuclei from hematoxylin and eosin (H&E)-stained histopathological images is an essential preprocessing step in computerized image analysis for disease detection, diagnosis, and prognosis. We present an automated cell nuclei segmentation approach that works with H&E-stained images. A color deconvolution algorithm was first applied to the image to get the hematoxylin channel. Using a morphological operation and thresholding technique on the hematoxylin channel image, candidate target nuclei and background regions were detected, which were then used as markers for a marker-controlled watershed transform segmentation algorithm. Moreover, postprocessing was conducted to split the touching nuclei. For each segmented region from the previous steps, the regional maximum value positions were identified as potential nuclei centers. These maximum values were further grouped into K-clusters, and the locations within each cluster were connected with the minimum spanning tree technique. Then, these connected positions were utilized as new markers for a watershed segmentation approach. The final number of nuclei at each region was determined by minimizing an objective function that iterated all of the possible K-values. The proposed method was applied to the pathological images of the tumor tissues from The Cancer Genome Atlas study. Experimental results show that the proposed method can lead to promising results in terms of segmentation accuracy and separation of touching nuclei.
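
The candidate-detection step described above (thresholding a hematoxylin channel to find candidate nuclei regions) can be illustrated with a much simpler stand-in: a fixed threshold followed by 4-connected component labeling. This sketch deliberately omits color deconvolution, morphological operations, and the watershed transform, so it is not the published method. The toy intensity grid, the cutoff of 0.5, and the function name label_components are assumptions made for the example.

```python
from collections import deque

def label_components(mask):
    """Label 4-connected regions of True cells; returns (labels, count)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                current += 1
                labels[r][c] = current
                queue = deque([(r, c)])
                # Breadth-first flood fill from the seed pixel
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current

# Toy "hematoxylin channel": higher values mean stronger nuclear staining
hematoxylin = [
    [0.1, 0.1, 0.0, 0.0, 0.1],
    [0.1, 0.8, 0.9, 0.0, 0.1],
    [0.0, 0.7, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.9],
    [0.1, 0.0, 0.0, 0.8, 0.9],
]
threshold = 0.5  # hypothetical fixed cutoff; real pipelines choose this from the data
mask = [[value > threshold for value in row] for row in hematoxylin]
labels, n_candidates = label_components(mask)
print(n_candidates)
```

Each labelled region would then serve as a candidate marker. Separating touching nuclei inside a single region is the harder problem that the abstract's postprocessing step addresses.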

An Argonaute phosphorylation cycle promotes microRNA-mediated silencing

Golden, R.J., Chen, B., Li, T., Braun, J., Manjunath, H., Chen, X., Wu, J., Schmid, V., Chang, T.C., Kopp, F., Ramirez-Martinez, A., Tagliabracci, VS., Chen, Z.J., Xie, Y, Mendell, J.T.
February 2017 Nature 542, 197–202

Abstract

MicroRNAs (miRNAs) perform critical functions in normal physiology and disease by associating with Argonaute proteins and downregulating partially complementary messenger RNAs (mRNAs). Here we use clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein 9 (Cas9) genome-wide loss-of-function screening coupled with a fluorescent reporter of miRNA activity in human cells to identify new regulators of the miRNA pathway. By using iterative rounds of screening, we reveal a novel mechanism whereby target engagement by Argonaute 2 (AGO2) triggers its hierarchical, multi-site phosphorylation by CSNK1A1 on a set of highly conserved residues (S824–S834), followed by rapid dephosphorylation by the ANKRD52–PPP6C phosphatase complex. Although genetic and biochemical studies demonstrate that AGO2 phosphorylation on these residues inhibits target mRNA binding, inactivation of this phosphorylation cycle globally impairs miRNA-mediated silencing. Analysis of the transcriptome-wide binding profile of non-phosphorylatable AGO2 reveals a pronounced expansion of the target repertoire bound at steady-state, effectively reducing the active pool of AGO2 on a per-target basis. These findings support a model in which an AGO2 phosphorylation cycle stimulated by target engagement regulates miRNA:target interactions to maintain the global efficiency of miRNA-mediated silencing.

Prediction of overall survival for patients with metastatic castration-resistant prostate cancer: development of a prognostic model through a crowdsourced challenge with open clinical trial data

Guinney, J., Wang, T., Laajala, T.D., Winner, K.K., Bare, J.C., Neto, E.C., et al, Xie, Y, Aittokallio, T., Zhou, F.L., Costello, J.C.
January 2017 The Lancet Oncology 18(1):132-42.

Abstract

Background

Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative allowing data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced, DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge to not only identify a better prognostic model for prediction of survival in patients with metastatic castration-resistant prostate cancer but also engage a community of international data scientists to study this disease.

Methods

Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere, comprising 476 patients treated with docetaxel and prednisone from the ASCENT2 trial, 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial, 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial, and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets consisting of more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly to be used as training data to predict the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but data for outcome variables (overall survival and event status) were hidden from the challenge participants so that ENTHUSE 33 could be used for independent validation. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used to compare method performance. Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone.

Findings

50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0·791; Bayes factor >5) and surpassed the reference model (iAUC 0·743; Bayes factor >20). Both the ePCR model and reference models stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3·32, 95% CI 2·39-4·62, p<0·0001; reference model: 2·56, 1·85-3·53, p<0·0001). The new model was validated further on the ENTHUSE M1 cohort with similarly high performance (iAUC 0·768). Meta-analysis across all methods confirmed previously identified predictive clinical variables and revealed aspartate aminotransferase as an important, albeit previously under-reported, prognostic biomarker.

Interpretation

Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for development of methods in the future. The results of this effort show that data-sharing, when combined with a crowdsourced challenge, is a robust and powerful framework to develop new prognostic models in advanced prostate cancer.

Funding

Sanofi US Services, Project Data Sphere.

Comprehensive evaluation of published gene expression prognostic signatures for biomarker-based lung cancer clinical studies

Tang, H., Wang, S., Xiao, G., Schiller, J., Papadimitrakopoulou, V., Minna, J., Wistuba, I.I., Xie, Y.
April 2017 Annals of Oncology, Volume 28, Issue 4

Abstract

Background

A more accurate prognosis for non-small-cell lung cancer (NSCLC) patients could aid in the identification of patients at high risk for recurrence. Many NSCLC mRNA expression signatures claiming to be prognostic have been reported in the literature. The goal of this study was to identify the most promising mRNA prognostic signatures in NSCLC for further prospective clinical validation.

Experimental design

We carried out a systematic review and meta-analysis of published mRNA prognostic signatures for resected NSCLC. The prognostic performance of each signature was evaluated via a meta-analysis of 1927 early stage NSCLC patients collected from 15 studies using three evaluation metrics (hazard ratios, concordance scores, and time-dependent receiver-operating characteristic curves). The performance of each signature was then evaluated against 100 random signatures. The prognostic power independent of clinical risk factors was assessed by multivariate Cox models.

Results

Through a literature search, we identified 42 lung cancer prognostic signatures derived from genome-wide expression profiling analysis. Based on meta-analysis, 25 signatures were prognostic for survival after adjusting for clinical risk factors and 18 signatures performed significantly better than random signatures. When analyzing histology types separately, 17 signatures and 8 signatures were prognostic for adenocarcinoma and squamous cell lung cancer, respectively. Despite little overlap among published gene signatures, the top-performing signatures are highly concordant in predicted patient outcomes.

Conclusions

Based on this large-scale meta-analysis, we identified a set of mRNA expression prognostic signatures appropriate for further validation in prospective clinical studies.
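
One of the evaluation metrics named in the experimental design, the concordance score, can be sketched in a few lines. The version below is a plain implementation of Harrell's concordance index on right-censored data. It is offered as an illustration under stated assumptions and is not the code used in the study: pairs with tied survival times are skipped, tied risk scores count as half-concordant, and the toy data are invented.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] is truthy) strictly before subject j's follow-up time.
    A higher risk score is expected to mean shorter survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy cohort: follow-up time, event indicator (1 = death observed), risk score
times = [5, 8, 12, 20, 30]
events = [1, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.1]
print(concordance_index(times, events, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. In this toy cohort seven of the eight comparable pairs are ordered correctly.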

Targeting renal cell carcinoma with a HIF-2 antagonist.

Chen W, Hill H, Christie A, Kim MS, Holloman E, Pavia-Jimenez A, Homayoun F, Ma Y, Patel N, Yell P, Hao G, Yousuf Q, Joyce A, Pedrosa I, Geiger H, Zhang H, Chang J, Gardner KH, Bruick RK, Reeves C, Hwang TH, Courtney K, Frenkel E, Sun X, Zojwalla N, Wong T, Rizzi JP, Wallace EM, Josey JA, Xie, Y, Xie XJ, Kapur P, McKay RM, Brugarolas J.
November 2016 Nature, 539, 112–117

Abstract

Clear cell renal cell carcinoma (ccRCC) is characterized by inactivation of the von Hippel-Lindau tumour suppressor gene (VHL). Because no other gene is mutated as frequently in ccRCC and VHL mutations are truncal, VHL inactivation is regarded as the governing event. VHL loss activates the HIF-2 transcription factor, and constitutive HIF-2 activity restores tumorigenesis in VHL-reconstituted ccRCC cells. HIF-2 has been implicated in angiogenesis and multiple other processes, but angiogenesis is the main target of drugs such as the tyrosine kinase inhibitor sunitinib. HIF-2 has been regarded as undruggable. Here we use a tumourgraft/patient-derived xenograft platform to evaluate PT2399, a selective HIF-2 antagonist that was identified using a structure-based design approach. PT2399 dissociated HIF-2 (an obligatory heterodimer of HIF-2α–HIF-1β) in human ccRCC cells and suppressed tumorigenesis in 56% (10 out of 18) of such lines. PT2399 had greater activity than sunitinib, was active in sunitinib-progressing tumours, and was better tolerated. Unexpectedly, some VHL-mutant ccRCCs were resistant to PT2399. Resistance occurred despite HIF-2 dissociation in tumours and evidence of Hif-2 inhibition in the mouse, as determined by suppression of circulating erythropoietin, a HIF-2 target and possible pharmacodynamic marker. We identified a HIF-2-dependent gene signature in sensitive tumours. Gene expression was largely unaffected by PT2399 in resistant tumours, illustrating the specificity of the drug. Sensitive tumours exhibited a distinguishing gene expression signature and generally higher levels of HIF-2α. Prolonged PT2399 treatment led to resistance. We identified binding site and second site suppressor mutations in HIF-2α and HIF-1β, respectively. Both mutations preserved HIF-2 dimers despite treatment with PT2399. Finally, an extensively pretreated patient whose tumour had given rise to a sensitive tumourgraft showed disease control for more than 11 months when treated with a close analogue of PT2399, PT2385. We validate HIF-2 as a target in ccRCC, show that some ccRCCs are HIF-2 independent, and set the stage for biomarker-driven clinical trials.

FMAP: Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies.

Kim J, Kim MS, Koh AY, Xie, Y, Zhan X.
October 2016 BMC Bioinformatics, 10;17(1):420.

Abstract

Background

Given the lack of a complete and comprehensive library of microbial reference genomes, determining the functional profile of diverse microbial communities is challenging. The available functional analysis pipelines lack several key features: (i) an integrated alignment tool, (ii) operon-level analysis, and (iii) the ability to process large datasets.

Results

Here we introduce our open-sourced, stand-alone functional analysis pipeline for analyzing whole metagenomic and metatranscriptomic sequencing data, FMAP (Functional Mapping and Analysis Pipeline). FMAP performs alignment, gene family abundance calculations, and statistical analysis (three levels of analyses are provided: differentially-abundant genes, operons and pathways). The resulting output can be easily visualized with heatmaps and functional pathway diagrams. FMAP functional predictions are consistent with currently available functional analysis pipelines.

Conclusions

FMAP is a comprehensive tool for providing functional analysis of metagenomic/metatranscriptomic sequencing data. With the added features of integrated alignment, operon-level analysis, and the ability to process large datasets, FMAP will be a valuable addition to the currently available functional analysis toolbox. We believe that this software will be of great value to the wider biology and bioinformatics communities.

Increase in Cancer Center Staff Effort Related to Electronic Patient Portal Use.

Laccetti AL, Chen B, Cai J, Gates S, Xie, Y, Lee SJ, Gerber DE.
December 2016 Journal of Oncology Practice 12(12):e981-e990.

Abstract

Purpose

Electronic portals provide patients with real-time access to personal health records. Use of this technology by individuals with cancer is particularly intensive. We therefore examined patterns of use of electronic portals by clinic staff at a National Cancer Institute-designated comprehensive cancer center.

Method

We identified and characterized cancer center providers and clinic staff who performed electronic activities related to MyChart, the institution's personal health records portal, from 2009 to 2014. Total MyChart actions and messages received were quantified and characterized according to type, timing, and staff category.

Results

Two hundred eighty-nine employees were included in our analysis: 85 nurses (29%), 79 ancillary staff (27%), 49 clerical/managerial staff (17%), 47 physicians (16%), and 29 advanced practice providers (10%). These individuals performed 740,613 MyChart actions and received 117,799 messages. Seventy-seven percent of actions were performed by nurses, 11% by ancillary staff, 6% by advanced practice providers, 5% by physicians, and 1% by clerical/managerial staff. From 2011 to 2014, staff MyChart activity increased approximately 10-fold. On average, 6.3 staff MyChart actions were performed per patient-initiated message. In 2014, nurses performed an average of 3,838 MyChart actions and received an average of 589 messages, compared with 591 actions and 87 messages in 2011 ( P < .001). Sixteen percent of all actions occurred outside clinic hours.

Conclusions

Cancer center employee effort related to an electronic patient portal has increased markedly over time, particularly among nursing staff. Because further uptake of this technology is expected, it is critical to consider potential effects on clinical resources, employee and patient satisfaction, and patient safety.

Crowdsourced assessment of common genetic contribution to predicting anti-TNF treatment response in rheumatoid arthritis.

Sieberts SK, Zhu F, García-García J, Stahl E, Pratap A, Pandey G, Pappas D, Aguilar D, Anton B, Bonet J, Eksi R, Fornés O, Guney E, Li H, Marín MA, Panwar B, Planas-Iglesias J, Poglayen D, Cui J, Falcao AO, Suver C, Hoff B, Balagurusamy VS, Dillenberger D, Neto EC, Norman T, Aittokallio T, Ammad-Ud-Din M, Azencott CA, Bellón V, Boeva V, Bunte K, Chheda H, Cheng L, Corander J, Dumontier M, Goldenberg A, Gopalacharyulu P, Hajiloo M, Hidru D, Jaiswal A, Kaski S, Khalfaoui B, Khan SA, Kramer ER, Marttinen P, Mezlini AM, Molparia B, Pirinen M, Saarela J, Samwald M, Stoven V, Tang H, Tang J, Torkamani A, Vert JP, Wang B, Wang T, Wennerberg K, Wineinger NE, Xiao G, Xie, Y, Yeung R, Zhan X, Zhao C; Members of the Rheumatoid Arthritis Challenge Consortium, Greenberg J, Kremer J, Michaud K, Barton A, Coenen M, Mariette X, Miceli C, Shadick N, Weinblatt M, de Vries N, Tak PP, Gerlag D, Huizinga TW, Kurreeman F, Allaart CF, Louis Bridges S Jr, Criswell L, Moreland L, Klareskog L, Saevarsdottir S, Padyukov L, Gregersen PK, Friend S, Plenge R, Stolovitzky G, Oliva B, Guan Y, Mangravite LM.
August 2016 Nature Communications 7, Article number: 12460

Abstract

Rheumatoid arthritis (RA) affects millions world-wide. While anti-TNF treatment is widely used to reduce disease progression, treatment fails in ∼one-third of patients. No biomarker currently exists that identifies non-responders before treatment. A rigorous community-based assessment of the utility of SNP data for predicting anti-TNF treatment efficacy in RA patients was performed in the context of a DREAM Challenge (http://www.synapse.org/RA_Challenge). An open challenge framework enabled the comparative evaluation of predictions developed by 73 research groups using the most comprehensive available data and covering a wide range of state-of-the-art modelling methodologies. Despite a significant genetic heritability estimate of treatment non-response trait (h2=0.18, P value=0.02), no significant genetic contribution to prediction accuracy is observed. Results formally confirm the expectations of the rheumatology community that SNP information does not significantly improve predictive performance relative to standard clinical traits, thereby justifying a refocusing of future efforts on collection of other data.

Severe Gut Microbiota Dysbiosis Is Associated With Poor Growth in Patients With Short Bowel Syndrome.

Piper HG, Fan D, Coughlin LA, Ho EX, McDaniel MM, Channabasappa N, Kim J, Kim M, Zhan X, Xie Y, Koh AY.
September 2016 JPEN J Parenter Enteral Nutr. 41(7):1202-1212.

Abstract

Background

Children with short bowel syndrome (SBS) can vary significantly in their growth trajectory. Recent data have shown that children with SBS possess a unique gut microbiota signature compared with healthy controls. We hypothesized that children with SBS and poor growth would exhibit more severe gut microbiota dysbiosis compared with those with SBS who are growing adequately, despite similar intestinal anatomy.

Materials and Methods

Stool samples were collected from children with SBS (n = 8) and healthy controls (n = 3) over 3 months. Gut microbiota populations (16S ribosomal RNA sequencing and metagenomic shotgun sequencing) were compared, including a more in-depth analysis of SBS children exhibiting poor and good growth. Statistical analysis was performed using Mann-Whitney, Kruskal-Wallis, and χ2 tests as appropriate.

Results

Children with SBS had a significant deficiency of the commensal Firmicutes order Clostridiales (P = .025, Kruskal-Wallis) compared with healthy children. Furthermore, children with SBS and poor growth were deficient in beneficial bacteria known to produce short-chain fatty acids and had expansion of proinflammatory Enterobacteriaceae (P = .038, Kruskal-Wallis) compared with children with SBS who were growing adequately. Using metabolic function analyses, SBS/poor growth microbiomes were deficient in genes needed for gluconeogenesis but enriched in branched and aromatic amino acid synthesis and citrate cycle pathway genes.

Conclusion

Patients with SBS, particularly those with suboptimal growth, have a marked gut dysbiosis characterized by a paucity of beneficial commensal anaerobes, resulting in a deficiency of key metabolic enzymes found in the gut microbiomes of healthy children.

An Expression Signature as an Aid to the Histologic Classification of Non-Small Cell Lung Cancer.

Girard L, Rodriguez-Canales J, Behrens C, Thompson DM, Botros IW, Tang H, Xie Y, Rekhtman N, Travis WD, Wistuba II, Minna JD, Gazdar AF.
October 2016 Clinical Cancer Research 10.1158/1078-0432.CCR-15-2900

Abstract

Purpose

Most non-small cell lung cancers (NSCLC) are now diagnosed from small specimens, and classification using standard pathology methods can be difficult. This is of clinical relevance as many therapy regimens and clinical trials are histology dependent. The purpose of this study was to develop an mRNA expression signature as an adjunct test for routine histopathologic classification of NSCLCs.

Experimental Design

A microarray dataset of resected adenocarcinomas (ADC) and squamous cell carcinomas (SCC) was used as the learning set for an ADC-SCC signature. The Cancer Genome Atlas (TCGA) lung RNAseq dataset was used for validation. Another microarray dataset of ADCs and matched nonmalignant lung was used as the learning set for a tumor versus nonmalignant signature. The classifiers were selected as the most differentially expressed genes and sample classification was determined by a nearest distance approach.

Results

We developed a 62-gene expression signature that contained many genes used in immunostains for NSCLC typing. It includes 42 genes that distinguish ADC from SCC and 20 genes differentiating nonmalignant lung from lung cancer. Testing of the TCGA and other public datasets resulted in high prediction accuracies (93%–95%). In addition, a prediction score was derived that correlates both with histologic grading and prognosis. We developed a practical version of the Classifier using the HTG EdgeSeq nuclease protection–based technology in combination with next-generation sequencing that can be applied to formalin-fixed paraffin-embedded (FFPE) tissues and small biopsies.

Conclusion

Our RNA classifier provides an objective, quantitative method to aid in the pathologic diagnosis of lung cancer. Clin Cancer Res; 22(19); 4880–9. ©2016 AACR.
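As an illustration of the "nearest distance approach" named in the Experimental Design above, the following is a minimal sketch only, not the authors' implementation. It assumes class centroids (mean expression per gene) and Euclidean distance, neither of which is specified in the abstract; the gene values, function names, and three-gene signature are hypothetical.

```python
from math import sqrt

def centroid(rows):
    """Mean expression vector of a list of equal-length samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(sample, centroids):
    """Return the label whose centroid lies nearest to the sample."""
    return min(centroids, key=lambda label: euclidean(sample, centroids[label]))

# Toy, made-up expression values for three signature genes (not real data).
training = {
    "ADC": [[8.0, 2.0, 5.0], [7.5, 2.5, 5.5]],
    "SCC": [[2.0, 9.0, 5.0], [2.5, 8.5, 4.5]],
}
centroids = {label: centroid(rows) for label, rows in training.items()}
predicted = classify([7.0, 3.0, 5.0], centroids)
```

Here the new sample is assigned to whichever class centroid it sits closest to; a real signature would use many more genes and a distance measure chosen for the platform.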

The antitumor toxin CD437 is a direct inhibitor of DNA polymerase α.

Han T, Goralski M, Capota E, Padrick SB, Kim J, Xie Y, Nijhawan D.
July 2016 Nature Chemical Biology, 12(7):511-5

Abstract

CD437 is a retinoid-like small molecule that selectively induces apoptosis in cancer cells, but not in normal cells, through an unknown mechanism. We used a forward-genetic strategy to discover mutations in POLA1 that coincide with CD437 resistance (POLA1R). Introduction of one of these mutations into cancer cells by CRISPR-Cas9 genome editing conferred CD437 resistance, demonstrating causality. POLA1 encodes DNA polymerase α, the enzyme responsible for initiating DNA synthesis during the S phase of the cell cycle. CD437 inhibits DNA replication in cells and recombinant POLA1 activity in vitro. Both effects are abrogated by the identified POLA1 mutations, supporting POLA1 as the direct antitumor target of CD437. In addition, we detected an increase in the total fluorescence intensity and anisotropy of CD437 in the presence of increasing concentrations of POLA1 that is consistent with a direct binding interaction. The discovery of POLA1 as the direct anticancer target for CD437 has the potential to catalyze the development of CD437 into an anticancer therapeutic.

Crowdsourced estimation of cognitive decline and resilience in Alzheimer's disease.

Allen GI, Amoroso N, Anghel C, Balagurusamy V, Bare CJ, Beaton D, Bellotti R, Bennett DA, Boehme KL, Boutros PC, Caberlotto L, Caloian C, Campbell F, Chaibub Neto E, Chang YC, Chen B, Chen CY, Chien TY, Clark T, Das S, Davatzikos C, Deng J, Dillenberger D, Dobson RJ, Dong Q, Doshi J, Duma D, Errico R, Erus G, Everett E, Fardo DW, Friend SH, Fröhlich H, Gan J, St George-Hyslop P, Ghosh SS, Glaab E, Green RC, Guan Y, Hong MY, Huang C, Hwang J, Ibrahim J, Inglese P, Iyappan A, Jiang Q, Katsumata Y, Kauwe JS, Klein A, Kong D, Krause R, Lalonde E, Lauria M, Lee E, Lin X, Liu Z, Livingstone J, Logsdon BA, Lovestone S, Ma TW, Malhotra A, Mangravite LM, Maxwell TJ, Merrill E, Nagorski J, Namasivayam A, Narayan M, Naz M, Newhouse SJ, Norman TC, Nurtdinov RN, Oyang YJ, Pawitan Y, Peng S, Peters MA, Piccolo SR, Praveen P, Priami C, Sabelnykova VY, Senger P, Shen X, Simmons A, Sotiras A, Stolovitzky G, Tangaro S, Tateo A, Tung YA, Tustison NJ, Varol E, Vradenburg G, Weiner MW, Xiao G, Xie L, Xie Y, Xu J, Yang H, Zhan X, Zhou Y, Zhu F, Zhu H, Zhu S; Alzheimer's Disease Neuroimaging Initiative.
June 2016 Alzheimer's & Dementia, Volume 12, Issue 6.

Abstract

Identifying accurate biomarkers of cognitive decline is essential for advancing early diagnosis and prevention therapies in Alzheimer's disease. The Alzheimer's disease DREAM Challenge was designed as a computational crowdsourced project to benchmark the current state-of-the-art in predicting cognitive outcomes in Alzheimer's disease based on high dimensional, publicly available genetic and structural imaging data. This meta-analysis failed to identify a meaningful predictor developed from either data modality, suggesting that alternate approaches should be considered for prediction of cognitive performance.

High-dimensional genomic data bias correction and data integration using MANCIE.

Zang C, Wang T, Deng K, Li B, Hu S, Qin Q, Xiao T, Zhang S, Meyer CA, He HH, Brown M, Liu JS, Xie Y, Liu XS.
April 2016 Nature Communications 7, Article number: 11305

Abstract

High-dimensional genomic data analysis is challenging due to noises and biases in high-throughput experiments. We present a computational method, matrix analysis and normalization by concordant information enhancement (MANCIE), for bias correction and data integration of distinct genomic profiles on the same samples. MANCIE uses a Bayesian-supported principal component analysis-based approach to adjust the data so as to achieve better consistency between sample-wise distances in the different profiles. MANCIE can improve tissue-specific clustering in ENCODE data, prognostic prediction in Molecular Taxonomy of Breast Cancer International Consortium and The Cancer Genome Atlas data, copy number and expression agreement in Cancer Cell Line Encyclopedia data, and has broad applications in cross-platform, high-dimensional data integration.

The Kub5-Hera/RPRD1B interactome: a novel role in preserving genetic stability by regulating DNA mismatch repair.

Patidar PL, Motea EA, Fattah FJ, Zhou Y, Morales JC, Xie Y, Garner HR, Boothman DA.
February 2016 Nucleic Acids Research, Volume 44, Issue 4

Abstract

Ku70-binding protein 5 (Kub5)-Hera (K-H)/RPRD1B maintains genetic integrity by concomitantly minimizing persistent R-loops and promoting repair of DNA double strand breaks (DSBs). We used tandem affinity purification-mass spectrometry, co-immunoprecipitation and gel-filtration chromatography to define higher-order protein complexes containing K-H scaffolding protein to gain insight into its cellular functions. We confirmed known protein partners (Ku70, RNA Pol II, p15RS) and discovered several novel associated proteins that function in RNA metabolism (Topoisomerase 1 and RNA helicases), DNA repair/replication processes (PARP1, MSH2, Ku, DNA-PKcs, MCM proteins, PCNA and DNA Pol δ) and in protein metabolic processes, including translation. Notably, this approach directed us to investigate an unpredicted involvement of K-H in DNA mismatch repair (MMR) where K-H depletion led to concomitant MMR deficiency and compromised global microsatellite stability. Mechanistically, MMR deficiency in K-H-depleted cells was a consequence of reduced stability of the core MMR proteins (MLH1 and PMS2) caused by elevated basal caspase-dependent proteolysis. Pan-caspase inhibitor treatment restored MMR protein loss. These findings represent a novel mechanism to acquire MMR deficiency/microsatellite alterations. A significant proportion of colon, endometrial and ovarian cancers exhibit k-h expression/copy number loss and may have severe mutator phenotypes with enhanced malignancies that are currently overlooked based on sporadic MSI+ screening.

Noncoding RNA NORAD Regulates Genomic Stability by Sequestering PUMILIO Proteins.

Lee S, Kopp F, Chang TC, Sataluri A, Chen B, Sivakumar S, Yu H, Xie Y, Mendell JT.
January 2016 Cell, Volume 164, Issues 1–2, Pages 69-80

Abstract

Long noncoding RNAs (lncRNAs) have emerged as regulators of diverse biological processes. Here, we describe the initial functional analysis of a poorly characterized human lncRNA (LINC00657) that is induced after DNA damage, which we termed “noncoding RNA activated by DNA damage”, or NORAD. NORAD is highly conserved and abundant, with expression levels of approximately 500–1,000 copies per cell. Remarkably, inactivation of NORAD triggers dramatic aneuploidy in previously karyotypically stable cell lines. NORAD maintains genomic stability by sequestering PUMILIO proteins, which repress the stability and translation of mRNAs to which they bind. In the absence of NORAD, PUMILIO proteins drive chromosomal instability by hyperactively repressing mitotic, DNA repair, and DNA replication factors. These findings introduce a mechanism that regulates the activity of a deeply conserved and highly dosage-sensitive family of RNA binding proteins and reveal unanticipated roles for a lncRNA and PUMILIO proteins in the maintenance of genomic stability.

Transcription Factor Hepatocyte Nuclear Factor-1β Regulates Renal Cholesterol Metabolism.

Aboudehen K, Kim MS, Mitsche M, Garland K, Anderson N, Noureddine L, Pontoglio M, Patel V, Xie Y, DeBose-Boyd R, Igarashi P.
January 2016 Journal of the American Society of Nephrology

Abstract

HNF-1β is a tissue-specific transcription factor that is expressed in the kidney and other epithelial organs. Humans with mutations in HNF-1β develop kidney cysts, and HNF-1β regulates the transcription of several cystic disease genes. However, the complete spectrum of HNF-1β-regulated genes and pathways is not known. Here, using chromatin immunoprecipitation/next generation sequencing and gene expression profiling, we identified 1545 protein-coding genes that are directly regulated by HNF-1β in murine kidney epithelial cells. Pathway analysis predicted that HNF-1β regulates cholesterol metabolism. Expression of dominant negative mutant HNF-1β or kidney-specific inactivation of HNF-1β decreased the expression of genes that are essential for cholesterol synthesis, including sterol regulatory element binding factor 2 (Srebf2) and 3-hydroxy-3-methylglutaryl-CoA reductase (Hmgcr). HNF-1β mutant cells also expressed lower levels of cholesterol biosynthetic intermediates and had a lower rate of cholesterol synthesis than control cells. Additionally, depletion of cholesterol in the culture medium mitigated the inhibitory effects of mutant HNF-1β on the proteins encoded by Srebf2 and Hmgcr, and HNF-1β directly controlled the renal epithelial expression of proprotein convertase subtilisin-like kexin type 9, a key regulator of cholesterol uptake. These findings reveal a novel role of HNF-1β in a transcriptional network that regulates intrarenal cholesterol metabolism.

A Phase I Dose-Escalation Trial of Single-Fraction Stereotactic Radiation Therapy for Liver Metastases.

Meyer JJ, Foster RD, Lev-Cohain N, Yokoo T, Dong Y, Schwarz RE, Rule W, Tian J, Xie Y, Hannan R, Nedzi L, Solberg T, Timmerman R.
January 2016 Annals of Surgical Oncology, Volume 23, Issue 1, pp 218–224

Abstract

Background

There is significant interest in the use of stereotactic ablative radiotherapy (SABR) as a treatment modality for liver metastases. A variety of SABR fractionation schemes are in clinical use. We conducted a phase I dose-escalation study to determine the maximum tolerated dose of single-fraction liver SABR.

Methods

Patients with liver metastases from solid tumors, for whom a critical volume dose constraint could be met, were treated with single-fraction SABR. Seven patients were enrolled to the first group, with a prescription dose of 35 Gy. Dose was then escalated to 40 Gy in a single fraction, and seven more patients were treated at this dose level. Patients were followed for toxicity and underwent serial imaging to assess lesion response and local control.

Results

Fourteen patients with 17 liver metastases were treated. There were no dose-limiting toxicities observed at either dose level. Nine of the 13 lesions assessable for treatment response showed a complete radiographic response to treatment; the remainder showed partial response. Local control of irradiated lesions was 100% at a median imaging follow-up of 2.5 years. Two-year overall survival for all patients was 78%.

Conclusions

For selected patients with liver metastases, single-fraction SABR at doses of 35 and 40 Gy is tolerable and shows promising signs of efficacy at intermediate follow-up.

Comprehensive functional characterization of cancer-testis antigens defines obligate participation in multiple hallmarks of cancer.

Maxfield KE, Taus PJ, Corcoran K, Wooten J, Macion J, Zhou Y, Borromeo M, Kollipara RK, Yan J, Xie Y, Xie XJ, Whitehurst AW.
November 2015 Nature Communications 6, Article number: 8840 (2015)

Abstract

Tumours frequently activate genes whose expression is otherwise biased to the testis, collectively known as cancer-testis antigens (CTAs). The extent to which CTA expression represents epiphenomena or confers tumorigenic traits is unknown. In this study, to address this, we implemented a multidimensional functional genomics approach that incorporates 7 different phenotypic assays in 11 distinct disease settings. We identify 26 CTAs that are essential for tumor cell viability and/or are pathological drivers of HIF, WNT or TGFβ signalling. In particular, we discover that Foetal and Adult Testis Expressed 1 (FATE1) is a key survival factor in multiple oncogenic backgrounds. FATE1 prevents the accumulation of the stress-sensing BH3-only protein, BCL-2-Interacting Killer (BIK), thereby permitting viability in the presence of toxic stimuli. Furthermore, ZNF165 promotes TGFβ signalling by directly suppressing the expression of negative feedback regulatory pathways. This action is essential for the survival of triple negative breast cancer cells in vitro and in vivo. Thus, CTAs make significant direct contributions to tumour biology.

NRF2 regulates serine biosynthesis in non-small cell lung cancer.

DeNicola GM, Chen PH, Mullarky E, Sudderth JA, Hu Z, Wu D, Tang H, Xie Y, Asara JM, Huffman KE, Wistuba II, Minna JD, DeBerardinis RJ, Cantley LC.
December 2015 Nature Genetics 47, 1475–1481

Abstract

Tumors have high energetic and anabolic needs for rapid cell growth and proliferation, and the serine biosynthetic pathway was recently identified as an important source of metabolic intermediates for these processes. We integrated metabolic tracing and transcriptional profiling of a large panel of non-small cell lung cancer (NSCLC) cell lines to characterize the activity and regulation of the serine/glycine biosynthetic pathway in NSCLC. Here we show that the activity of this pathway is highly heterogeneous and is regulated by NRF2, a transcription factor frequently deregulated in NSCLC. We found that NRF2 controls the expression of the key serine/glycine biosynthesis enzyme genes PHGDH, PSAT1 and SHMT2 via ATF4 to support glutathione and nucleotide production. Moreover, we show that expression of these genes confers poor prognosis in human NSCLC. Thus, a substantial fraction of human NSCLCs activates an NRF2-dependent transcriptional program that regulates serine and glycine metabolism and is linked to clinical aggressiveness.

Phase 1 study of romidepsin plus erlotinib in advanced non-small cell lung cancer.

Gerber DE, Boothman DA, Fattah FJ, Dong Y, Zhu H, Skelton RA, Priddy LL, Vo P, Dowell JE, Sarode V, Leff R, Meek C, Xie Y, Schiller JH.
December 2015 Lung Cancer Volume 90, Issue 3, Pages 534-541

Abstract

Purpose

Preclinical studies demonstrated anti-tumor efficacy of the combination of the histone deacetylase (HDAC) inhibitor romidepsin plus erlotinib in non-small cell lung cancer (NSCLC) models that were insensitive to erlotinib monotherapy. We therefore studied this combination in a phase 1 clinical trial in previously treated advanced NSCLC.

Methods

Romidepsin (8 or 10 mg/m(2)) was administered intravenously on days 1, 8, and 15 every 28 days in combination with erlotinib (150 mg orally daily), with romidepsin monotherapy lead-in during Cycle 1. Correlative studies included peripheral blood mononuclear cell HDAC activity and histone acetylation status, and EGFR pathway activation status in skin biopsies.

Results

A total of 17 patients were enrolled. Median number of prior lines of therapy was 3 (range 1-5). No cases had a sensitizing EGFR mutation. The most common related adverse events were nausea, vomiting, and fatigue (each 82%), diarrhea (65%), anorexia (53%), and rash (41%). Dose-limiting nausea and vomiting occurred at the romidepsin 10 mg/m(2) level despite aggressive antiemetic prophylaxis and treatment. Among 10 evaluable patients, the best response was stable disease (n=7) and progressive disease (n=3). Median progression-free survival (PFS) was 3.3 months (range 1.4-16.5 months). Prolonged PFS (>6 months) was noted in a KRAS mutant adenocarcinoma and a squamous cell cancer previously progressed on erlotinib monotherapy. Romidepsin monotherapy inhibited HDAC activity, increased histone acetylation status, and inhibited EGFR phosphorylation.

Conclusions

Romidepsin 8 mg/m(2) plus erlotinib appears well tolerated, has evidence of disease control, and exhibits effects on relevant molecular targets in an unselected advanced NSCLC population.

Targeting glutamine metabolism sensitizes pancreatic cancer to PARP-driven metabolic catastrophe induced by β-lapachone.

Chakrabarti G, Moore ZR, Luo X, Ilcheva M, Ali A, Padanad M, Zhou Y, Xie Y, Burma S, Scaglioni PP, Cantley LC, DeBerardinis RJ, Kimmelman AC, Lyssiotis CA, Boothman DA.
October 2015 Cancer & Metabolism

Abstract

Background

Pancreatic ductal adenocarcinomas (PDA) activate a glutamine-dependent pathway of cytosolic nicotinamide adenine dinucleotide phosphate (NADPH) production to maintain redox homeostasis and support proliferation. Enzymes involved in this pathway (GLS1 (mitochondrial glutaminase 1), GOT1 (cytoplasmic glutamate oxaloacetate transaminase 1), and GOT2 (mitochondrial glutamate oxaloacetate transaminase 2)) are highly upregulated in PDA, and among these, inhibitors of GLS1 were recently deployed in clinical trials to target anabolic glutamine metabolism. However, single-agent inhibition of this pathway is cytostatic and unlikely to provide durable benefit in controlling advanced disease.

Results

Here, we report that reducing NADPH pools by genetically or pharmacologically (bis-2-(5-phenylacetamido-1,2,4-thiadiazol-2-yl)ethyl sulfide (BPTES) or CB-839) inhibiting glutamine metabolism in mutant Kirsten rat sarcoma viral oncogene homolog (KRAS) PDA sensitizes cell lines and tumors to β-lapachone (β-lap, clinical form ARQ761). β-Lap is an NADPH:quinone oxidoreductase (NQO1)-bioactivatable drug that leads to NADPH depletion through high levels of reactive oxygen species (ROS) from the futile redox cycling of the drug and subsequently nicotinamide adenine dinucleotide (NAD)+ depletion through poly(ADP ribose) polymerase (PARP) hyperactivation. NQO1 expression is highly activated by mutant KRAS signaling. As such, β-lap treatment concurrent with inhibition of glutamine metabolism in mutant KRAS, NQO1 overexpressing PDA leads to massive redox imbalance, extensive DNA damage, rapid PARP-mediated NAD+ consumption, and PDA cell death, features not observed in NQO1-low, wild-type KRAS expressing cells.

Conclusions

This treatment strategy illustrates proof of principle that simultaneously decreasing glutamine metabolism-dependent tumor anti-oxidant defenses and inducing supra-physiological ROS formation are tumoricidal and that this rationally designed combination strategy lowers the required doses of both agents in vitro and in vivo. The non-overlapping specificities of GLS1 inhibitors and β-lap for PDA tumors afford high tumor selectivity, while sparing normal tissue.

A systematic analysis reveals heterogeneous changes in the endocytic activities of cancer cells.

Elkin SR, Bendris N, Reis CR, Zhou Y, Xie Y, Huffman KE, Minna JD, Schmid SL.
November 2015 Cancer Research, Volume 75, Issue 21

Abstract

Metastasis is a multistep process requiring cancer cell signaling, invasion, migration, survival, and proliferation. These processes require dynamic modulation of cell surface proteins by endocytosis. Given this functional connection, it has been suggested that endocytosis is dysregulated in cancer. To test this, we developed In-Cell ELISA assays to measure three different endocytic pathways: clathrin-mediated endocytosis, caveolae-mediated endocytosis, and clathrin-independent endocytosis and compared these activities using two different syngeneic models for normal and oncogene-transformed human lung epithelial cells. We found that all endocytic activities were reduced in the transformed versus normal counterparts. However, when we screened 29 independently isolated non-small cell lung cancer (NSCLC) cell lines to determine whether these changes were systematic, we observed significant heterogeneity. Nonetheless, using hierarchical clustering based on their combined endocytic properties, we identified two phenotypically distinct clusters of NSCLCs. One co-clustered with mutations in KRAS, a mesenchymal phenotype, increased invasion through collagen and decreased growth in soft agar, whereas the second was enriched in cells with an epithelial phenotype. Interestingly, the two clusters also differed significantly in clathrin-independent internalization and surface expression of CD44 and CD59. Taken together, our results suggest that endocytotic alterations in cancer cells that affect cell surface expression of critical molecules have a significant influence on cancer-relevant phenotypes, with potential implications for interventions to control cancer by modulating endocytic dynamics.

Deciphering the associations between gene expression and copy number alteration using a sparse double Laplacian shrinkage approach.

Shi X, Zhao Q, Huang J, Xie Y, Ma S.
December 2015 Bioinformatics, Volume 31, Issue 24, Pages 3977–3983

Abstract

Motivation

Both gene expression levels (GEs) and copy number alterations (CNAs) have important biological implications. GEs are partly regulated by CNAs, and much effort has been devoted to understanding their relations. The regulation analysis is challenging with one gene expression possibly regulated by multiple CNAs and one CNA potentially regulating the expressions of multiple genes. The correlations among GEs and among CNAs make the analysis even more complicated. The existing methods have limitations and cannot comprehensively describe the regulation.

Results

A sparse double Laplacian shrinkage method is developed. It jointly models the effects of multiple CNAs on multiple GEs. Penalization is adopted to achieve sparsity and identify the regulation relationships. Network adjacency is computed to describe the interconnections among GEs and among CNAs. Two Laplacian shrinkage penalties are imposed to accommodate the network adjacency measures. Simulation shows that the proposed method outperforms the competing alternatives with more accurate marker identification. The Cancer Genome Atlas data are analysed to further demonstrate advantages of the proposed method.

Availability and Implementation

R code is available at http://works.bepress.com/shuangge/49/.

Prediction of human population responses to toxic compounds by a collaborative competition.

Eduati F, Mangravite LM, Wang T, Tang H, Bare JC, Huang R, Norman T, Kellen M, Menden MP, Yang J, Zhan X, Zhong R, Xiao G, Xia M, Abdo N, Kosyk O; NIEHS-NCATS-UNC DREAM Toxicogenetics Collaboration, Friend S, Dearry A, Simeonov A, Tice RR, Rusyn I, Wright FA, Stolovitzky G, Xie Y, Saez-Rodriguez J.
September 2015 Nature Biotechnology 33, 933–940 (2015)

Abstract

The ability to computationally predict the effects of toxic compounds on humans could help address the deficiencies of current chemical safety testing. Here, we report the results from a community-based DREAM challenge to predict toxicities of environmental compounds with potential adverse health effects for human populations. We measured the cytotoxicity of 156 compounds in 884 lymphoblastoid cell lines for which genotype and transcriptional data are available as part of the Tox21 1000 Genomes Project. The challenge participants developed algorithms to predict interindividual variability of toxic response from genomic profiles and population-level cytotoxicity data from structural attributes of the compounds. 179 submitted predictions were evaluated against an experimental data set to which participants were blinded. Individual cytotoxicity predictions were better than random, with modest correlations (Pearson's r < 0.28), consistent with complex trait genomic prediction. In contrast, predictions of population-level response to different compounds were higher (r < 0.66). The results highlight the possibility of predicting health risks associated with unknown compounds, although risk estimation accuracy remains suboptimal.

Activation of HIF-1α and LL-37 by commensal bacteria inhibits Candida albicans colonization.

Fan D, Coughlin LA, Neubauer MM, Kim J, Kim MS, Zhan X, Simms-Waldrip TR, Xie Y, Hooper LV, Koh AY.
July 2015 Nature Medicine 21, 808–814 (2015)

Abstract

Candida albicans colonization is required for invasive disease. Unlike humans, adult mice with mature intact gut microbiota are resistant to C. albicans gastrointestinal (GI) colonization, but the factors that promote C. albicans colonization resistance are unknown. Here we demonstrate that commensal anaerobic bacteria, specifically clostridial Firmicutes (clusters IV and XIVa) and Bacteroidetes, are critical for maintaining C. albicans colonization resistance in mice. Using Bacteroides thetaiotamicron as a model organism, we find that hypoxia-inducible factor-1α (HIF-1α), a transcription factor important for activating innate immune effectors, and the antimicrobial peptide LL-37 (CRAMP in mice) are key determinants of C. albicans colonization resistance. Although antibiotic treatment enables C. albicans colonization, pharmacologic activation of colonic Hif1a induces CRAMP expression and results in a significant reduction of C. albicans GI colonization and a 50% decrease in mortality from invasive disease. In the setting of antibiotics, Hif1a and Camp (which encodes CRAMP) are required for B. thetaiotamicron-induced protection against C. albicans colonization of the gut. Thus, modulating C. albicans GI colonization by activation of gut mucosal immune effectors may represent a novel therapeutic approach for preventing invasive fungal disease in humans.

Identifying CDKN3 Gene Expression as a Prognostic Biomarker in Lung Adenocarcinoma via Meta-analysis.

Zang X, Chen M, Zhou Y, Xiao G, Xie Y, Wang X.
May 2015 Cancer Informatics, 24;14(Suppl 2):183-91

Abstract

Lung cancer is among the major causes of cancer deaths, and the survival rate of lung cancer patients is extremely low. Recent studies have demonstrated that the gene CDKN3 is related to neoplasia, but in the literature severe controversy exists over whether it is involved in cancer progression or, conversely, tumor inhibition. In this study, we investigated the expression of CDKN3 and its association with prognosis in lung adenocarcinoma (ADC) and squamous cell carcinoma (SCC) using datasets in Lung Cancer Explorer (LCE; http://qbrc.swmed.edu/lce/). We found that CDKN3 was up-regulated in ADC and SCC compared to normal tissues. We also found that CDKN3 was expressed at a higher level in SCC than in ADC, which was further validated through meta-analysis (coefficient = 2.09, 95% CI = 1.50-2.67, P < 0.0001). In addition, based on meta-analysis for the prognostic value of CDKN3, we found that higher CDKN3 expression was associated with poorer survival outcomes in ADC (HR = 1.65, 95% CI = 1.39-1.96, P < 0.0001), but not in SCC (HR = 1.10, 95% CI = 0.84-1.44, P = 0.494). Our findings indicate that CDKN3 may be a prognostic marker in ADC, though the detailed mechanism is yet to be revealed.
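The pooled hazard ratios quoted above come from a meta-analysis across datasets. As a rough sketch of how such pooling can work, the code below applies fixed-effect inverse-variance weighting to log hazard ratios, with each standard error recovered from its 95% confidence interval. The abstract does not state which meta-analysis model was used, so the fixed-effect choice, the function name, and the per-study numbers are all assumptions for illustration, not the study's data.

```python
from math import exp, log, sqrt

Z95 = 1.96  # normal quantile for a 95% confidence interval

def pooled_hazard_ratio(studies):
    """Pool (HR, CI lower, CI upper) tuples by inverse-variance weighting.

    Returns the pooled HR with its 95% CI as a (hr, lower, upper) tuple.
    """
    weights = []
    weighted_logs = []
    for hr, lower, upper in studies:
        # Standard error of log(HR), recovered from the CI width.
        se = (log(upper) - log(lower)) / (2 * Z95)
        w = 1.0 / se ** 2
        weights.append(w)
        weighted_logs.append(w * log(hr))
    total = sum(weights)
    mean_log = sum(weighted_logs) / total
    se_pooled = sqrt(1.0 / total)
    return (
        exp(mean_log),
        exp(mean_log - Z95 * se_pooled),
        exp(mean_log + Z95 * se_pooled),
    )

# Hypothetical per-dataset estimates, invented for this example only.
example_studies = [(1.8, 1.2, 2.7), (1.5, 1.0, 2.25), (1.6, 0.9, 2.84)]
pooled = pooled_hazard_ratio(example_studies)
```

Pooling on the log scale keeps the confidence interval symmetric around the log hazard ratio, and studies with narrower intervals contribute more weight; a random-effects model would widen the interval when datasets disagree.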

Elucidation of changes in molecular signalling leading to increased cellular transformation in oncogenically progressed human bronchial epithelial cells exposed to radiations of increasing LET.

Ding LH, Park S, Xie Y, Girard L, Minna JD, Story MD.
September 2015 Mutagenesis, Volume 30, Issue 5, Pages 685–694

Abstract

The early transcriptional response and subsequent induction of anchorage-independent growth after exposure to particles of high Z and energy (HZE) as well as γ-rays were examined in human bronchial epithelial cells (HBEC3KT) immortalised without viral oncogenes and an isogenic variant cell line whose p53 expression was suppressed but that expressed an active mutant K-RAS(V12) (HBEC3KT-P53KRAS). Cell survival following irradiation showed that HBEC3KT-P53KRAS cells were more radioresistant than HBEC3KT cells irrespective of the radiation species. In addition, radiation enhanced the ability of the surviving HBEC3KT-P53KRAS cells but not the surviving HBEC3KT cells to grow in anchorage-independent fashion (soft agar colony formation). HZE particle irradiation was far more efficient than γ-rays at rendering HBEC3KT-P53KRAS cells permissive for soft agar growth. Gene expression profiles after radiation showed that the molecular response to radiation for HBEC3KT-P53KRAS, similar to that for HBEC3KT cells, varies with radiation quality. Several pathways associated with anchorage independent growth, including the HIF-1α, mTOR, IGF-1, RhoA and ERK/MAPK pathways, were over-represented in the irradiated HBEC3KT-P53KRAS cells compared to parental HBEC3KT cells. These results suggest that oncogenically progressed human lung epithelial cells are at greater risk for cellular transformation and carcinogenic risk after ionising radiation, but particularly so after HZE radiations.
These results have implications for: (i) terrestrial radiation, suggesting the possibility of enhanced carcinogenic risk from diagnostic CT screens used for early lung cancer detection; (ii) enhanced carcinogenic risk from heavy particles used in radiotherapy; and (iii) space radiation, raising the possibility that astronauts harbouring epithelial regions of dysplasia or hyperplasia within the lung that contain oncogenic changes may have a greater risk for lung cancers based upon their exposure to heavy particles present in the deep space environment.

Design and bioinformatics analysis of genome-wide CLIP experiments.

Wang T, Xiao G, Chu Y, Zhang MQ, Corey DR, Xie Y.
June 2015 Nucleic Acids Research, Volume 43, Issue 11, Pages 5263–5274

Abstract

The past decades have witnessed a surge of discoveries revealing RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages, including splicing, transportation, stabilization and translation. Defects in the functions of these RBPs underlie a broad spectrum of human pathologies. Systematic identification of RBP functional targets is among the key biomedical research questions and provides a new direction for drug discovery. The advent of cross-linking immunoprecipitation coupled with high-throughput sequencing (genome-wide CLIP) technology has recently enabled the investigation of genome-wide RBP-RNA binding at single base-pair resolution. This technology has evolved through the development of three distinct versions: HITS-CLIP, PAR-CLIP and iCLIP. Meanwhile, numerous bioinformatics pipelines for handling the genome-wide CLIP data have also been developed. In this review, we discuss the genome-wide CLIP technology and focus on bioinformatics analysis. Specifically, we compare the strengths and weaknesses, as well as the scopes, of various bioinformatics tools. To assist readers in choosing optimal procedures for their analysis, we also review experimental design and procedures that affect bioinformatics analyses.

Decreased BECN1 mRNA Expression in Human Breast Cancer is Associated with Estrogen Receptor-Negative Subtypes and Poor Prognosis.

Tang H, Sebti S, Titone R, Zhou Y, Isidoro C, Ross TS, Hibshoosh H, Xiao G, Packer M, Xie Y, Levine B.
March 2015 EBioMedicine, 2(3):255-263.

Abstract

Both BRCA1 and Beclin 1 (BECN1) are tumor suppressor genes, which are in close proximity on the human chromosome 17q21 breast cancer tumor susceptibility locus and are often concurrently deleted. However, their importance in sporadic human breast cancer is not known. To interrogate the effects of BECN1 and BRCA1 in breast cancer, we studied their mRNA expression patterns in breast cancer patients from two large datasets: The Cancer Genome Atlas (TCGA) (n=1067) and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) (n=1992). In both datasets, low expression of BECN1 was more common in HER2-enriched and basal-like (mostly triple-negative) breast cancers compared to luminal A/B intrinsic tumor subtypes, and was also strongly associated with TP53 mutations and advanced tumor grade. In contrast, there was no significant association between low BRCA1 expression and HER2-enriched or basal-like subtypes, TP53 mutations or tumor grade. In addition, low expression of BECN1 (but not low BRCA1) was associated with poor prognosis, and BECN1 (but not BRCA1) expression was an independent predictor of survival. These findings suggest that decreased mRNA expression of the autophagy gene BECN1 may contribute to the pathogenesis and progression of HER2-enriched, basal-like, and TP53 mutant breast cancers.

Intramolecular circularization increases efficiency of RNA sequencing and enables CLIP-Seq of nuclear RNA from human cells.

Chu Y, Wang T, Dodd D, Xie Y, Janowski BA, Corey DR.
March 2015 Nucleic Acids Research, Volume 43, Issue 11, Page e75

Abstract

RNA sequencing (RNA-Seq) is a powerful tool for analyzing the identity of cellular RNAs but is often limited by the amount of material available for analysis. In spite of extensive efforts employing existing protocols, we observed that it was not possible to obtain useful sequencing libraries from nuclear RNA derived from cultured human cells after crosslinking and immunoprecipitation (CLIP). Here, we report a method for obtaining strand-specific small RNA libraries for RNA sequencing that requires picograms of RNA. We employ an intramolecular circularization step that increases the efficiency of library preparation and avoids the need for intermolecular ligations of adaptor sequences. Other key features include random priming for full-length cDNA synthesis and gel-free library purification. Using our method, we generated CLIP-Seq libraries from nuclear RNA that had been UV-crosslinked and immunoprecipitated with anti-Argonaute 2 (Ago2) antibody. Computational protocols were developed to enable analysis of raw sequencing data and we observe substantial differences between recognition by Ago2 of RNA species in the nucleus relative to the cytoplasm. This RNA self-circularization approach to RNA sequencing (RC-Seq) allows data to be obtained using small amounts of input RNA that cannot be sequenced by standard methods.

The nuclear receptor DAF-12 regulates nutrient metabolism and reproductive growth in nematodes.

Wang Z, Stoltzfus J, You YJ, Ranjit N, Tang H, Xie Y, Lok JB, Mangelsdorf DJ, Kliewer SA.
March 2015 PLoS Genet. 11(3):e1005027.

Abstract

Appropriate nutrient response is essential for growth and reproduction. Under favorable nutrient conditions, the C. elegans nuclear receptor DAF-12 is activated by dafachronic acids, hormones that commit larvae to reproductive growth. Here, we report that in addition to its well-studied role in controlling developmental gene expression, the DAF-12 endocrine system governs expression of a gene network that stimulates the aerobic catabolism of fatty acids. Thus, activation of the DAF-12 transcriptome coordinately mobilizes energy stores to permit reproductive growth. DAF-12 regulation of this metabolic gene network is conserved in the human parasite, Strongyloides stercoralis, and inhibition of specific steps in this network blocks reproductive growth in both of the nematodes. Our study provides a molecular understanding for metabolic adaptation of nematodes to their environment, and suggests a new therapeutic strategy for treating parasitic diseases.

HITS-CLIP analysis uncovers a link between the Kaposi's sarcoma-associated herpesvirus ORF57 protein and host pre-mRNA metabolism.

Sei E, Wang T, Hunter OV, Xie Y, Conrad NK.
February 2015 PLoS Pathog. 11(2):e1004652. doi: 10.1371/journal.ppat.1004652.

Abstract

The Kaposi's sarcoma associated herpesvirus (KSHV) is an oncogenic virus that causes Kaposi's sarcoma, primary effusion lymphoma (PEL), and some forms of multicentric Castleman's disease. The KSHV ORF57 protein is a conserved posttranscriptional regulator of gene expression that is essential for virus replication. ORF57 is multifunctional, but most of its activities are directly linked to its ability to bind RNA. We globally identified virus and host RNAs bound by ORF57 during lytic reactivation in PEL cells using high-throughput sequencing of RNA isolated by cross-linking immunoprecipitation (HITS-CLIP). As expected, ORF57-bound RNA fragments mapped throughout the KSHV genome, including the known ORF57 ligand PAN RNA. In agreement with previously published ChIP results, we observed that ORF57 bound RNAs near the oriLyt regions of the genome. Examination of the host RNA fragments revealed that a subset of the ORF57-bound RNAs was derived from transcript 5' ends. The position of these 5'-bound fragments correlated closely with the 5'-most exon-intron junction of the pre-mRNA. We selected four candidates (BTG1, EGR1, ZFP36, and TNFSF9) and analyzed their pre-mRNA and mRNA levels during lytic phase. Analysis of both steady-state and newly made RNAs revealed that these candidate ORF57-bound pre-mRNAs persisted for longer periods of time throughout infection than control RNAs, consistent with a role for ORF57 in pre-mRNA metabolism. In addition, exogenous expression of ORF57 was sufficient to increase the pre-mRNA levels and, in one case, the mRNA levels of the putative ORF57 targets. These results demonstrate that ORF57 interacts with specific host pre-mRNAs during lytic reactivation and alters their processing, likely by stabilizing pre-mRNAs. These data suggest that ORF57 is involved in modulating host gene expression in addition to KSHV gene expression during lytic reactivation.

Real-time resolution of point mutations that cause phenovariance in mice.

Wang T, Zhan X, Bu CH, Lyon S, Pratt D, Hildebrand S, Choi JH, Zhang Z, Zeng M, Wang KW, Turer E, Chen Z, Zhang D, Yue T, Wang Y, Shi H, Wang J, Sun L, SoRelle J, McAlpine W, Hutchins N, Zhan X, Fina M, Gobert R, Quan J, Kreutzer M, Arnett S, Hawkins K, Leach A, Tate C, Daniel C, Reyna C, Prince L, Davis S, Purrington J, Bearden R, Weatherly J, White D, Russell J, Sun Q, Tang M, Li X, Scott L, Moresco EM, McInerney GM, Karlsson Hedestam GB, Xie Y, Beutler B.
February 2015 Proc Natl Acad Sci U S A. 112(5):E440-9. doi: 10.1073/pnas.1423216112.

Abstract

With the wide availability of massively parallel sequencing technologies, genetic mapping has become the rate limiting step in mammalian forward genetics. Here we introduce a method for real-time identification of N-ethyl-N-nitrosourea-induced mutations that cause phenotypes in mice. All mutations are identified by whole exome G1 progenitor sequencing and their zygosity is established in G2/G3 mice before phenotypic assessment. Quantitative and qualitative traits, including lethal effects, in single or multiple combined pedigrees are then analyzed with Linkage Analyzer, a software program that detects significant linkage between individual mutations and aberrant phenotypic scores and presents processed data as Manhattan plots. As multiple alleles of genes are acquired through mutagenesis, pooled "superpedigrees" are created to analyze the effects. Our method is distinguished from conventional forward genetic methods because it permits (1) unbiased declaration of mappable phenotypes, including those that are incompletely penetrant, (2) automated identification of causative mutations concurrent with phenotypic screening, without the need to outcross mutant mice to another strain and backcross them, and (3) exclusion of genes not involved in phenotypes of interest. We validated our approach and Linkage Analyzer for the identification of 47 mutations in 45 previously known genes causative for adaptive immune phenotypes; our analysis also implicated 474 genes not previously associated with immune function. The method described here permits forward genetic analysis in mice, limited only by the rates of mutant production and screening.

iScreen: Image-Based High-Content RNAi Screening Analysis Tools.

Zhong R, Dong X, Levine B, Xie Y, Xiao G.
September 2015 J Biomol Screen. 20(8):998-1002. doi: 10.1177/1087057114564348.

Abstract

High-throughput RNA interference (RNAi) screening has opened up a path to investigating functional genomics in a genome-wide pattern. However, such studies are often restricted to assays that have a single readout format. Recently, advanced image technologies have been coupled with high-throughput RNAi screening to develop high-content screening, in which one or more cell image(s), instead of a single readout, were generated from each well. This image-based high-content screening technology has led to genome-wide functional annotation in a wider spectrum of biological research studies, as well as in drug and target discovery, so that complex cellular phenotypes can be measured in a multiparametric format. Despite these advances, data analysis and visualization tools are still largely lacking for these types of experiments. Therefore, we developed iScreen (image-Based High-content RNAi Screening Analysis Tool), an R package for the statistical modeling and visualization of image-based high-content RNAi screening. Two case studies were used to demonstrate the capability and efficiency of the iScreen package. iScreen is available for download on CRAN (http://cran.cnr.berkeley.edu/web/packages/iScreen/index.html). The user manual is also available as a supplementary document.

A community computational challenge to predict the activity of pairs of compounds.

Bansal M, Yang J, Karan C, Menden MP, Costello JC, Tang H, Xiao G, Li Y, Allen J, Zhong R, Chen B, Kim M, Wang T, Heiser LM, Realubit R, Mattioli M, Alvarez MJ, Shen Y; NCI-DREAM Community, Gallahan D, Singer D, Saez-Rodriguez J, Xie Y, Stolovitzky G, Califano A; NCI-DREAM Community.
December 2014 Nature Biotechnology 32, 1213–1222

Abstract

Recent therapeutic successes have renewed interest in drug combinations, but experimental screening approaches are costly and often identify only small numbers of synergistic combinations. The DREAM consortium launched an open challenge to foster the development of in silico methods to computationally rank 91 compound pairs, from the most synergistic to the most antagonistic, based on gene-expression profiles of human B cells treated with individual compounds at multiple time points and concentrations. Using scoring metrics based on experimental dose-response curves, we assessed 32 methods (31 community-generated approaches and SynGen), four of which performed significantly better than random guessing. We highlight similarities between the methods. Although the accuracy of predictions was not optimal, we find that computational prediction of compound-pair activity is possible, and that community challenges can be useful to advance the field of in silico compound-synergy prediction.

Ensemble-based network aggregation improves the accuracy of gene network reconstruction.

Zhong R, Allen JD, Xiao G, Xie Y.
November 2014 PLoS One. 9(11):e106319. doi: 10.1371/journal.pone.0106319.

Abstract

Reverse engineering approaches to constructing gene regulatory networks (GRNs) based on genome-wide mRNA expression data have led to significant biological findings, such as the discovery of novel drug targets. However, the reliability of the reconstructed GRNs needs to be improved. Here, we propose an ensemble-based network aggregation approach to improving the accuracy of network topologies constructed from mRNA expression data. To evaluate the performances of different approaches, we created dozens of simulated networks from combinations of gene-set sizes and sample sizes and also tested our methods on three Escherichia coli datasets. We demonstrate that the ensemble-based network aggregation approach can be used to effectively integrate GRNs constructed from different studies, producing more accurate networks. We also apply this approach to building a network from epithelial mesenchymal transition (EMT) signature microarray data and identify hub genes that might be potential drug targets. The R code used to perform all of the analyses is available in an R package entitled "ENA", accessible on CRAN (http://cran.r-project.org/web/packages/ENA/).
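The abstract above describes combining several reconstructed networks into one but does not give the aggregation rule. The sketch below is a minimal illustration of one plausible rule, ranking edges within each network and combining the ranks by their geometric mean. It is an assumption for illustration, not the interface of the ENA package; the function names and the toy edge scores are hypothetical.

```python
import math


def rank_edges(scores):
    """Rank edges within one network: rank 1 is the highest-confidence edge."""
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: position + 1 for position, edge in enumerate(ordered)}


def aggregate_networks(networks):
    """Order edges by the geometric mean of their per-network ranks.

    Assumption: every network scores the same set of candidate edges.
    Returns the edges sorted from most to least supported.
    """
    edges = set(networks[0])
    if any(set(network) != edges for network in networks):
        raise ValueError("all networks must score the same edges")
    per_network_ranks = [rank_edges(network) for network in networks]
    count = len(networks)
    combined = {
        edge: math.exp(sum(math.log(ranks[edge]) for ranks in per_network_ranks) / count)
        for edge in edges
    }
    return sorted(edges, key=lambda edge: (combined[edge], edge))


if __name__ == "__main__":
    # Hypothetical edge confidences from three reconstruction methods.
    toy_networks = [
        {("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "C"): 0.5},
        {("A", "B"): 0.8, ("A", "C"): 0.6, ("B", "C"): 0.1},
        {("A", "B"): 0.7, ("A", "C"): 0.3, ("B", "C"): 0.4},
    ]
    for edge in aggregate_networks(toy_networks):
        print(edge)
```

Working with ranks rather than raw scores keeps methods with different score scales comparable, which is the main reason to prefer a rank-based rule in a sketch like this.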

ASCL1 is a lineage oncogene providing therapeutic targets for high-grade neuroendocrine lung cancers.

Augustyn A, Borromeo M, Wang T, Fujimoto J, Shao C, Dospoy PD, Lee V, Tan C, Sullivan JP, Larsen JE, Girard L, Behrens C, Wistuba II, Xie Y, Cobb MH, Gazdar AF, Johnson JE, Minna JD.
October 2014 PNAS 111 (41) 14788-14793;

Abstract

Aggressive neuroendocrine lung cancers, including small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), represent an understudied tumor subset that accounts for approximately 40,000 new lung cancer cases per year in the United States. No targeted therapy exists for these tumors. We determined that achaete-scute homolog 1 (ASCL1), a transcription factor required for proper development of pulmonary neuroendocrine cells, is essential for the survival of a majority of lung cancers (both SCLC and NSCLC) with neuroendocrine features. By combining whole-genome microarray expression analysis performed on lung cancer cell lines with ChIP-Seq data designed to identify conserved transcriptional targets of ASCL1, we discovered an ASCL1 target 72-gene expression signature that (i) identifies neuroendocrine differentiation in NSCLC cell lines, (ii) is predictive of poor prognosis in resected NSCLC specimens from three datasets, and (iii) represents novel "druggable" targets. Among these druggable targets is B-cell CLL/lymphoma 2, which when pharmacologically inhibited stops ASCL1-dependent tumor growth in vitro and in vivo and represents a proof-of-principle ASCL1 downstream target gene. Analysis of downstream targets of ASCL1 represents an important advance in the development of targeted therapy for the neuroendocrine class of lung cancers, providing a significant step forward in the understanding and therapeutic targeting of the molecular vulnerabilities of neuroendocrine lung cancer.

Poly-dipeptides encoded by the C9orf72 repeats bind nucleoli, impede RNA biogenesis, and kill cells.

Kwon I, Xiang S, Kato M, Wu L, Theodoropoulos P, Wang T, Kim J, Yun J, Xie Y, McKnight SL.
September 2014 Science Vol. 345, Issue 6201, pp. 1139-1145

Abstract

Many RNA regulatory proteins controlling pre-messenger RNA splicing contain serine:arginine (SR) repeats. Here, we found that these SR domains bound hydrogel droplets composed of fibrous polymers of the low-complexity domain of heterogeneous ribonucleoprotein A2 (hnRNPA2). Hydrogel binding was reversed upon phosphorylation of the SR domain by CDC2-like kinases 1 and 2 (CLK1/2). Mutated variants of the SR domains changing serine to glycine (SR-to-GR variants) also bound to hnRNPA2 hydrogels but were not affected by CLK1/2. When expressed in mammalian cells, these variants bound nucleoli. The translation products of the sense and antisense transcripts of the expansion repeats associated with the C9orf72 gene altered in neurodegenerative disease encode GRn and PRn repeat polypeptides. Both peptides bound to hnRNPA2 hydrogels independent of CLK1/2 activity. When applied to cultured cells, both peptides entered cells, migrated to the nucleus, bound nucleoli, and poisoned RNA biogenesis, which caused cell death.

Predictors and intensity of online access to electronic medical records among patients with cancer.

Gerber DE, Laccetti AL, Chen B, Yan J, Cai J, Gates S, Xie Y, Lee SJ.
September 2014 Journal of Oncology Practice 10, no. 5

Abstract

Introduction

Electronic portals are secure Web-based servers that provide patients with real-time access to their personal health record (PHR). These applications are now widely used at cancer centers nationwide, but their impact has not been well studied. This study set out to determine predictors and patterns of use of a Web-based portal for accessing PHRs and communicating with health providers among patients with cancer.

Methods

Retrospective analysis of enrollment in and use of MyChart, a PHR portal for the Epic electronic medical record system, among patients seen at a National Cancer Institute-designated cancer center. Predictors of MyChart use were analyzed through univariable and multivariable regression models.

Results

A total of 6,495 patients enrolled in MyChart from 2007 to 2012. The median number of log-ins over this period was 57 (interquartile range 17-137). The most common portal actions were viewing test results (37%), viewing and responding to clinic messages (29%), and sending medical advice requests (6.4%). Increased portal use was significantly associated with younger age, white race, and an upper aerodigestive malignancy diagnosis. Thirty-seven percent of all log-ins and 31% of all medical advice requests occurred outside clinic hours. Over the study period, the average number of patient log-ins per year more than doubled.

Conclusions

Among patients with cancer, PHR portal use is frequent and increasing. Younger patients, white patients, and patients with upper aerodigestive malignancies exhibit the heaviest portal use. Understanding the implications of this new technology will be central to the delivery of safe and effective care.

Computational detection and suppression of sequence-specific off-target phenotypes from whole genome RNAi screens.

Zhong R, Kim J, Kim HS, Kim M, Lum L, Levine B, Xiao G, White MA, Xie Y.
July 2014 Nucleic Acids Research, Volume 42, Issue 13, Pages 8214–8222

Abstract

A challenge for large-scale siRNA loss-of-function studies is the biological pleiotropy resulting from multiple modes of action of siRNA reagents. A major confounding feature of these reagents is the microRNA-like translational quelling resulting from short regions of oligonucleotide complementarity to many different messenger RNAs. We developed a computational approach, deconvolution analysis of RNAi screening data, for automated quantitation of off-target effects in RNAi screening data sets. Substantial reduction of off-target rates was experimentally validated in five distinct biological screens across different genome-wide siRNA libraries. A public-access graphical-user-interface has been constructed to facilitate application of this algorithm.

Hereditary lung cancer syndrome targets never smokers with germline EGFR gene T790M mutations.

Gazdar A, Robinson L, Oliver D, Xing C, Travis WD, Soh J, Toyooka S, Watumull L, Xie Y, Kernstine K, Schiller JH.
April 2014 Journal of Thoracic Oncology Volume 9, Issue 4, Pages 456-463

Abstract

Introduction

Hereditary lung cancer syndromes are rare, and T790M germline mutations of the epidermal growth factor receptor (EGFR) gene predispose to the development of lung cancer. The goal of this study was to determine the clinical features and smoking status of lung cancer cases and unaffected family members with this germline mutation and to estimate its incidence and penetrance.

Methods

We studied a family with germline T790M mutations over five generations (14 individuals) and combined our observations with data obtained from a literature search (15 individuals).

Results

T790M germline mutations occurred in approximately 1% of non-small-cell lung cancer cases and in less than one in 7500 subjects without lung cancer. Both sporadic and germline T790M mutations were predominantly adenocarcinomas, favored female gender, and were occasionally multifocal. Of lung cancer tumors arising in T790M germline mutation carriers, 73% contained a second activating EGFR gene mutation. Inheritance was dominant. The odds ratio that T790M germline carriers who are smokers will develop lung cancer compared with never smoker carriers was 0.31 (p = 6.0E-05). There was an overrepresentation of never smokers with lung cancer with this mutation compared with the general lung cancer population (p = 7.4E-06).

Conclusion

Germline T790M mutations result in a unique hereditary lung cancer syndrome that targets never smokers, with a preliminary estimate of 31% risk for lung cancer in never smoker carriers, and this risk may be lower for heavy smokers. The resultant cancers share several features and differences with lung cancers containing sporadic EGFR mutations.

A model-based approach to identify binding sites in CLIP-Seq data.

Wang T, Chen B, Kim M, Xie Y, Xiao G.
April 2014 PLoS One. 9(4):e93248. doi: 10.1371/journal.pone.0093248.

Abstract

Cross-linking immunoprecipitation coupled with high-throughput sequencing (CLIP-Seq) has made it possible to identify the targeting sites of RNA-binding proteins in various cell culture systems and tissue types on a genome-wide scale. Here we present a novel model-based approach (MiClip) to identify high-confidence protein-RNA binding sites from CLIP-seq datasets. This approach assigns a probability score for each potential binding site to help prioritize subsequent validation experiments. The MiClip algorithm has been tested in both HITS-CLIP and PAR-CLIP datasets. In the HITS-CLIP dataset, the signal/noise ratios of miRNA seed motif enrichment produced by the MiClip approach are between 17% and 301% higher than those by the ad hoc method for the top 10 most enriched miRNAs. In the PAR-CLIP dataset, the MiClip approach can identify ∼50% more validated binding targets than the original ad hoc method and two recently published methods. To facilitate the application of the algorithm, we have released an R package, MiClip (http://cran.r-project.org/web/packages/MiClip/index.html), and a public web-based graphical user interface software (http://galaxy.qbrc.org/tool_runner?tool_id=mi_clip) for customized analysis.

Detection of candidate tumor driver genes using a fully integrated Bayesian approach.

Yang J, Wang X, Kim M, Xie Y, Xiao G.
May 2014 Stat Med. 33(10):1784-800. doi: 10.1002/sim.6066.

Abstract

DNA copy number alterations (CNAs), including amplifications and deletions, can result in significant changes in gene expression and are closely related to the development and progression of many diseases, especially cancer. For example, CNA-associated expression changes in certain genes (called candidate tumor driver genes) can alter the expression levels of many downstream genes through transcription regulation and cause cancer. Identification of such candidate tumor driver genes leads to discovery of novel therapeutic targets for personalized treatment of cancers. Several approaches have been developed for this purpose by using both copy number and gene expression data. In this study, we propose a Bayesian approach to identify candidate tumor driver genes, in which the copy number and gene expression data are modeled together, and the dependency between the two data types is modeled through conditional probabilities. The proposed joint modeling approach can identify CNA and differentially expressed genes simultaneously, leading to improved detection of candidate tumor driver genes and comprehensive understanding of underlying biological processes. We evaluated the proposed method in simulation studies, and then applied it to a head and neck squamous cell carcinoma data set. Both simulation studies and data application show that the joint modeling approach can significantly improve the performance in identifying candidate tumor driver genes, when compared with other existing approaches.

Adaptive prediction model in prospective molecular signature-based clinical studies.

Xiao G, Ma S, Minna J, Xie Y.
February 2014 Clin Cancer Res. 20(3):531-9. doi: 10.1158/1078-0432.CCR-13-2127.

Abstract

Use of molecular profiles and clinical information can help predict which treatment would give the best outcome and survival for each individual patient, and thus guide optimal therapy, which offers great promise for the future of clinical trials and practice. High prediction accuracy is essential for selecting the best treatment plan. The gold standard for evaluating the prediction models is prospective clinical studies, in which patients are enrolled sequentially. However, there is no statistical method using this sequential feature to adapt the prediction model to the current patient cohort. In this article, we propose a reweighted random forest (RWRF) model, which updates the weight of each decision tree whenever additional patient information is available, to account for the potential heterogeneity between training and testing data. A simulation study and a lung cancer example are used to show that the proposed method can adapt the prediction model to current patients' characteristics, and, therefore, can improve prediction accuracy significantly. We also show that the proposed method can identify important and consistent predictive variables. Compared with rebuilding the prediction model, the RWRF updates a well-tested model gradually, and all of the adaptive procedure/parameters used in the RWRF model are prespecified before patient recruitment, which are important practical advantages for prospective clinical studies.
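The abstract above says the reweighted random forest updates the weight of each decision tree as outcomes for new patients become available, without giving the update formula. The sketch below illustrates the general idea with a simple multiplicative rule that is an assumption for illustration, not the rule from the paper: trees that misclassified the newly observed patient lose weight, and predictions are made by weighted vote. The names `update_tree_weights`, `weighted_vote`, and `penalty` are hypothetical.

```python
def update_tree_weights(weights, tree_predictions, observed_outcome, penalty=0.5):
    """Shrink the weight of every tree that misclassified the new patient.

    Assumption: a fixed multiplicative penalty, followed by renormalisation
    so that the weights sum to 1.
    """
    updated = [
        weight if prediction == observed_outcome else weight * (1.0 - penalty)
        for weight, prediction in zip(weights, tree_predictions)
    ]
    total = sum(updated)
    return [weight / total for weight in updated]


def weighted_vote(weights, tree_predictions):
    """Predict class 1 when trees voting for 1 hold more than half the weight."""
    support = sum(
        weight for weight, prediction in zip(weights, tree_predictions) if prediction == 1
    )
    return 1 if support > 0.5 * sum(weights) else 0


if __name__ == "__main__":
    # Three hypothetical trees with equal starting weights.
    weights = [1 / 3, 1 / 3, 1 / 3]
    # Each patient: (per-tree binary predictions, outcome observed later).
    patients = [([1, 0, 1], 0), ([1, 0, 0], 0), ([0, 0, 1], 0)]
    for predictions, outcome in patients:
        print("prediction:", weighted_vote(weights, predictions))
        weights = update_tree_weights(weights, predictions, outcome)
    print("final weights:", [round(weight, 3) for weight in weights])
```

Because the trees themselves are never refit, the penalty can be fixed before recruitment begins, which matches the abstract's point that the adaptive procedure is prespecified.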

27-Hydroxycholesterol promotes cell-autonomous, ER-positive breast cancer growth.

Potts MB, Kim HS, Fisher KW, Hu Y, Carrasco YP, Bulut GB, Ou YH, Herrera-Herrera ML, Cubillos F, Mendiratta S, Xiao G, Hofree M, Ideker T, Xie Y, Huang LJ, Lewis RE, MacMillan JB, White MA.
November 2013 Cell Reports Volume 5, Issue 3, Pages 637-645

Abstract

To date, estrogen is the only known endogenous estrogen receptor (ER) ligand that promotes ER+ breast tumor growth. We report that the cholesterol metabolite 27-hydroxycholesterol (27HC) stimulates MCF-7 cell xenograft growth in mice. More importantly, in ER+ breast cancer patients, 27HC content in normal breast tissue is increased compared to that in cancer-free controls, and tumor 27HC content is further elevated. Increased tumor 27HC is correlated with diminished expression of CYP7B1, the 27HC metabolizing enzyme, and reduced expression of CYP7B1 in tumors is associated with poorer patient survival. Moreover, 27HC is produced by MCF-7 cells, and it stimulates cell-autonomous, ER-dependent, and GDNF-RET-dependent cell proliferation. Thus, 27HC is a locally modulated, nonaromatized ER ligand that promotes ER+ breast tumor growth.

Using functional signature ontology (FUSION) to identify mechanisms of action for natural products.

Potts MB, Kim HS, Fisher KW, Hu Y, Carrasco YP, Bulut GB, Ou YH, Herrera-Herrera ML, Cubillos F, Mendiratta S, Xiao G, Hofree M, Ideker T, Xie Y, Huang LJ, Lewis RE, MacMillan JB, White MA.
October 2013 Science Signaling, Vol. 6, Issue 297, pp. ra90

Abstract

A challenge for biomedical research is the development of pharmaceuticals that appropriately target disease mechanisms. Natural products can be a rich source of bioactive chemicals for medicinal applications but can act through unknown mechanisms and can be difficult to produce or obtain. To address these challenges, we developed a new marine-derived, renewable natural products resource and a method for linking bioactive derivatives of this library to the proteins and biological processes that they target in cells. We used cell-based screening and computational analysis to match gene expression signatures produced by natural products to those produced by small interfering RNA (siRNA) and synthetic microRNA (miRNA) libraries. With this strategy, we matched proteins and miRNAs with diverse biological processes and also identified putative protein targets and mechanisms of action for several previously undescribed marine-derived natural products. We confirmed mechanistic relationships for selected siRNAs, miRNAs, and compounds with functional roles in autophagy, chemotaxis mediated by discoidin domain receptor 2, or activation of the kinase AKT. Thus, this approach may be an effective method for screening new drugs while simultaneously identifying their targets.

Cytoplasmic TRADD confers a worse prognosis in glioblastoma.

Chakraborty S, Li L, Tang H, Xie Y, Puliyappadamba VT, Raisanen J, Burma S, Boothman DA, Cochran B, Wu J, Habib AA.
August 2013 Neoplasia. 15(8):888-97

Abstract

Tumor necrosis factor receptor 1 (TNFR1)-associated death domain protein (TRADD) is an important adaptor in TNFR1 signaling and has an essential role in nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) activation and survival signaling. Increased expression of TRADD is sufficient to activate NF-κB. Recent studies have highlighted the importance of NF-κB activation as a key pathogenic mechanism in glioblastoma multiforme (GBM), the most common primary malignant brain tumor in adults. We examined the expression of TRADD by immunohistochemistry (IHC) and find that TRADD is commonly expressed at high levels in GBM and is detected in both cytoplasmic and nuclear distribution. Cytoplasmic IHC TRADD scoring is significantly associated with worse progression-free survival (PFS) both in univariate and multivariate analysis but is not associated with overall survival (n = 43 GBMs). PFS is a marker for responsiveness to treatment. We propose that TRADD-mediated NF-κB activation confers chemoresistance and thus a worse PFS in GBM. Consistent with the effect on PFS, silencing TRADD in glioma cells results in decreased NF-κB activity, decreased proliferation of cells, and increased sensitivity to temozolomide. TRADD expression is common in glioma-initiating cells. Importantly, silencing TRADD in GBM-initiating stem cell cultures results in decreased viability of stem cells, suggesting that TRADD may be required for maintenance of GBM stem cell populations. Thus, our study suggests that increased expression of cytoplasmic TRADD is both an important biomarker and a key driver of NF-κB activation in GBM and supports an oncogenic role for TRADD in GBM.

Influence of medical comorbidities on the presentation and outcomes of stage I-III non-small-cell lung cancer.

Ahn DH, Mehta N, Yorio JT, Xie Y, Yan J, Gerber DE.
November 2013 Clinical Lung Cancer Volume 14, Issue 6, Pages 644-650

Abstract

Background

Non-small-cell lung cancer presentation, treatment, and outcomes vary widely according to socioeconomic factors and other patient characteristics. To determine whether medical comorbidities account for these observations, we incorporated a validated medical comorbidity index into an analysis of patients diagnosed with stage I to III NSCLC.

Patients and Methods

We performed a retrospective analysis of consecutive patients diagnosed with stage I to III NSCLC. Demographic, tumor, and comorbidity data were obtained from hospital tumor registries and individual patient records. The association between variables was assessed using multivariate logistic regression and survival analysis.

Results

A total of 454 patients met criteria for analysis. The median age was 65 years, and 51% were men. Individuals with a higher Charlson Comorbidity Index (CCI) were significantly more likely to present with early stage (stage I-II) NSCLC than were patients with lower CCI (odds ratio, 1.72; 95% confidence interval, 1.14-2.63; P = .01), although this association lost statistical significance (P = .21) in a multivariate model. In multivariate logistic regression, overall survival remained associated with all variables: age, sex, race, insurance type, stage, histology, and CCI (P = .0007). The CCI was associated with survival for patients with early stage (P = .02) and locally advanced (P = .02) disease.

Conclusion

In this cohort of patients with stage I to III NSCLC, increasing comorbidity burden had a nonsignificant association with diagnosis at earlier disease stage. Although comorbidity burden was significantly associated with outcome for early stage and locally advanced disease, it did not account for survival differences based on multiple other patient and disease characteristics.

SbacHTS: spatial background noise correction for high-throughput RNAi screening.

Zhong R, Kim MS, White MA, Xie Y, Xiao G.
September 2013 Bioinformatics, Volume 29, Issue 17, Pages 2218–2220

Abstract

Motivation

High-throughput cell-based phenotypic screening has become an increasingly important technology for discovering new drug targets and assigning gene functions. Such experiments use hundreds of 96-well or 384-well plates, to cover whole-genome RNAi collections and/or chemical compound files, and often collect measurements that are sensitive to spatial background noise whose patterns can vary across individual plates. Correcting these position effects can substantially improve measurement accuracy and screening success.

Result

We developed SbacHTS (Spatial background noise correction for High-Throughput RNAi Screening) software for visualization, estimation and correction of spatial background noise in high-throughput RNAi screens. SbacHTS is supported on the Galaxy open-source framework with a user-friendly open access web interface. We find that SbacHTS software can effectively detect and correct spatial background noise, increase signal to noise ratio and enhance statistical detection power in high-throughput RNAi screening experiments.

Availability
http://www.galaxy.qbrc.org/
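
As an illustration of what correcting plate position effects can look like, the sketch below applies a two-way median polish to a small plate of well readings. This is a generic technique and is only assumed here for illustration; it is not a description of the SbacHTS algorithm, and the function name `median_polish` and the toy plate values are hypothetical.

```python
from statistics import median


def median_polish(plate, n_iter=10):
    """Remove additive row and column effects from a plate of well readings.

    `plate` is a list of rows (each a list of numbers). The function
    returns the residuals left after repeatedly subtracting row medians
    and column medians. This is an illustrative stand-in, not SbacHTS.
    """
    resid = [list(row) for row in plate]
    for _ in range(n_iter):
        # Sweep out each row's median.
        for row in resid:
            m = median(row)
            for j in range(len(row)):
                row[j] -= m
        # Sweep out each column's median.
        for j in range(len(resid[0])):
            m = median(row[j] for row in resid)
            for row in resid:
                row[j] -= m
    return resid


# Toy 3x4 plate (made-up numbers): a row gradient, a column gradient,
# and one genuine "hit" of +50 in the well at row 1, column 2.
plate = [
    [100, 102, 104, 106],
    [110, 112, 164, 116],
    [120, 122, 124, 126],
]
residuals = median_polish(plate)
```

On this toy plate the gradients are removed and only the hit well keeps a large residual. Medians are used rather than means so that a few strong hits do not distort the estimated background.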

Distinct transcriptome profiles identified in normal human bronchial epithelial cells after exposure to γ-rays and different elemental particles of high Z and energy.

Ding LH, Park S, Peyton M, Girard L, Xie Y, Minna JD, Story MD.
June 2013 BMC Genomics, 1;14:372. doi: 10.1186/1471-2164-14-372.

Abstract

Background

Ionizing radiation composed of accelerated ions of high atomic number (Z) and energy (HZE) deposits energy and creates damage in cells in a discrete manner as compared to the random deposition of energy and damage seen with low energy radiations such as γ- or x-rays. Such radiations can be highly effective at cell killing, transformation, and oncogenesis, all of which are concerns for the manned space program and for the burgeoning field of HZE particle radiotherapy for cancer. Furthermore, there are differences in the extent to which cells or tissues respond to such exposures that may be unrelated to absorbed dose. Therefore, we asked whether the energy deposition patterns produced by different radiation types would cause different molecular responses. We performed transcriptome profiling using human bronchial epithelial cells (HBECs) after exposure to γ-rays and to two different HZE particles (28Si and 56Fe) with different energy transfer properties to characterize the molecular response to HZE particles and γ-rays as a function of dose, energy deposition pattern, and time post-irradiation.

Results

Clonogenic assay indicated that the relative biological effectiveness (RBE) for 56Fe was 3.91 and for 28Si was 1.38 at 34% cell survival. Unsupervised clustering analysis of gene expression segregated samples according to the radiation species followed by the time after irradiation, whereas dose was not a significant parameter for segregation of radiation response. While a subset of genes associated with p53-signaling, such as CDKN1A, TRIM22 and BTG2 showed very similar responses to all radiation qualities, distinct expression changes were associated with the different radiation species. Gene enrichment analysis categorized the differentially expressed genes into functional groups related to cell death and cell cycle regulation for all radiation types, while gene pathway analysis revealed that the pro-inflammatory Acute Phase Response Signaling was specifically induced after HZE particle irradiation. A 73 gene signature capable of predicting with 96% accuracy the radiation species to which cells were exposed, was developed.

Conclusion

These data suggest that the molecular response to the radiation species used here is a function of the energy deposition characteristics of the radiation species. This novel molecular response to HZE particles may have implications for radiotherapy including particle selection for therapy and risk for second cancers, risk for cancers from diagnostic radiation exposures, as well as NASA's efforts to develop more accurate lung cancer risk estimates for astronaut safety. Lastly, irrespective of the source of radiation, the gene expression changes observed set the stage for functional studies of initiation or progression of radiation-induced lung carcinogenesis.

Detection of epigenetic changes using ANOVA with spatially varying coefficients.

Guanghua X, Xinlei W, Quincey L, Nestler EJ, Xie Y.
March 2013 Journal of Thoracic Oncology Volume 9, Issue 4, Pages 456-463

Abstract

Identification of genome-wide epigenetic changes, the stable changes in gene function without a change in DNA sequence, under various conditions plays an important role in biomedical research. High-throughput epigenetic experiments are useful tools to measure genome-wide epigenetic changes, but the measured intensity levels from these high-resolution genome-wide epigenetic profiling data are often spatially correlated with high noise levels. In addition, it is challenging to detect genome-wide epigenetic changes across multiple conditions, so efficient statistical methodology development is needed for this purpose. In this study, we consider ANOVA models with spatially varying coefficients, combined with a hierarchical Bayesian approach, to explicitly model spatial correlation caused by location-dependent biological effects (i.e., epigenetic changes) and borrow strength among neighboring probes to compare epigenetic changes across multiple conditions. Through simulation studies and applications in drug addiction and depression datasets, we find that our approach compares favorably with competing methods; it is more efficient in estimation and more effective in detecting epigenetic changes. In addition, it can provide biologically meaningful results.

Human lung epithelial cells progressed to malignancy through specific oncogenic manipulations.

Sato M, Larsen JE, Lee W, Sun H, Shames DS, Dalvi MP, Ramirez RD, Tang H, DiMaio JM, Gao B, Xie Y, Wistuba II, Gazdar AF, Shay JW, Minna JD.
June 2013 Mol Cancer Res. 11(6):638-50. doi: 10.1158/1541-7786.MCR-12-0634-T.

Abstract

We used CDK4/hTERT-immortalized normal human bronchial epithelial cells (HBEC) from several individuals to study lung cancer pathogenesis by introducing combinations of common lung cancer oncogenic changes (p53, KRAS, and MYC) and followed the stepwise transformation of HBECs to full malignancy. This model showed that: (i) the combination of five genetic alterations (CDK4, hTERT, sh-p53, KRAS(V12), and c-MYC) is sufficient for full tumorigenic conversion of HBECs; (ii) genetically identical clones of transformed HBECs exhibit pronounced differences in tumor growth, histology, and differentiation; (iii) HBECs from different individuals vary in their sensitivity to transformation by these oncogenic manipulations; (iv) high levels of KRAS(V12) are required for full malignant transformation of HBECs, however, prior loss of p53 function is required to prevent oncogene-induced senescence; (v) overexpression of c-MYC greatly enhances malignancy but only in the context of sh-p53+KRAS(V12); (vi) growth of parental HBECs in serum-containing medium induces differentiation, whereas growth of oncogenically manipulated HBECs in serum increases in vivo tumorigenicity, decreases tumor latency, produces more undifferentiated tumors, and induces epithelial-to-mesenchymal transition (EMT); (vii) oncogenic transformation of HBECs leads to increased sensitivity to standard chemotherapy doublets; (viii) an mRNA signature derived by comparing tumorigenic versus nontumorigenic clones was predictive of outcome in patients with lung cancer. Collectively, our findings show that this HBEC model system can be used to study the effect of oncogenic mutations, their expression levels, and serum-derived environmental effects in malignant transformation, while also providing clinically translatable applications such as development of prognostic signatures and drug response phenotypes.

A 12-gene set predicts survival benefits from adjuvant chemotherapy in non-small cell lung cancer patients.

Tang H, Xiao G, Behrens C, Schiller J, Allen J, Chow CW, Suraokar M, Corvalan A, Mao J, White MA, Wistuba II, Minna JD, Xie Y.
March 2013 Clin Cancer Res. 19(6):1577-86. doi: 10.1158/1078-0432.CCR-12-2321.

Abstract

Purpose

Prospectively identifying who will benefit from adjuvant chemotherapy (ACT) would improve clinical decisions for non-small cell lung cancer (NSCLC) patients. In this study, we aim to develop and validate a functional gene set that predicts the clinical benefits of ACT in NSCLC.

Experimental Design

An 18-hub-gene prognosis signature was developed through a systems biology approach, and its prognostic value was evaluated in six independent cohorts. The 18-hub-gene set was then integrated with genome-wide functional (RNAi) data and genetic aberration data to derive a 12-gene predictive signature for ACT benefits in NSCLC.

Results

Using a cohort of 442 stage I to III NSCLC patients who underwent surgical resection, we identified an 18-hub-gene set that robustly predicted the prognosis of patients with adenocarcinoma in all validation datasets across four microarray platforms. The hub genes, identified through a purely data-driven approach, have significant biological implications in tumor pathogenesis, including NKX2-1, Aurora Kinase A, PRC1, CDKN3, MBIP, and RRM2. The 12-gene predictive signature was successfully validated in two independent datasets (n = 90 and 176). The predicted benefit group showed significant improvement in survival after ACT (UT Lung SPORE data: HR = 0.34, P = 0.017; JBR.10 clinical trial data: HR = 0.36, P = 0.038), whereas the predicted nonbenefit group showed no survival benefit for 2 datasets (HR = 0.80, P = 0.70; HR = 0.91, P = 0.82).

Conclusion

This is the first study to integrate genetic aberration, genome-wide RNAi data, and mRNA expression data to identify a functional gene set that predicts which resectable patients with non-small cell lung cancer will have a survival benefit with ACT.

Consent timing and experience: modifiable factors that may influence interest in clinical research.

Gerber DE, Rasco DW, Skinner CS, Dowell JE, Yan J, Sayne JR, Xie Y.
March 2012 J Oncol Pract. 8(2):91-6. doi: 10.1200/JOP.2011.000335.

Abstract

Purpose

Low rates of participation in cancer clinical trials have been attributed to patient, institutional, and study characteristics. However, few studies have examined factors related to the consent process. We therefore evaluated the impact of consent timing and experience on markers of patient interest in research.

Methods

We performed a retrospective analysis of patients enrolled in a cancer center tissue repository. During enrollment, patients were asked if they were willing to be contacted in the future to provide medical follow-up information and/or to participate in other clinical research. We analyzed the association between patient responses to these questions and consent process factors using univariate analysis and multivariate logistic regression.

Results

Of 922 patients evaluated, 85% agreed to be contacted to provide follow-up information, and 83% agreed to be contacted to participate in future research studies. In univariate analysis, willingness to be contacted for future research was associated with consenter experience (P = .01) and had a trend toward association with the timing of enrollment in relation to diagnosis (P = .08), but it was not associated with patient sex, race, or diagnosis. In multivariate analysis, responses remained associated with consenter experience (P = .02).

Conclusion

Factors related to the consent process, including consenter experience and timing of study enrollment, are significantly associated with or have a trend toward association with markers of patient interest in clinical research. These understudied and potentially modifiable variables warrant further evaluation.

The starvation hormone, fibroblast growth factor-21, extends lifespan in mice.

Zhang Y, Xie Y, Berglund ED, Coate KC, He TT, Katafuchi T, Xiao G, Potthoff MJ, Wei W, Wan Y, Yu RT, Evans RM, Kliewer SA, Mangelsdorf DJ.
October 2012 Elife. 1:e00065. doi: 10.7554/eLife.00065.

Abstract

Fibroblast growth factor-21 (FGF21) is a hormone secreted by the liver during fasting that elicits diverse aspects of the adaptive starvation response. Among its effects, FGF21 induces hepatic fatty acid oxidation and ketogenesis, increases insulin sensitivity, blocks somatic growth and causes bone loss. Here we show that transgenic overexpression of FGF21 markedly extends lifespan in mice without reducing food intake or affecting markers of NAD+ metabolism or AMP kinase and mTOR signaling. Transcriptomic analysis suggests that FGF21 acts primarily by blunting the growth hormone/insulin-like growth factor-1 signaling pathway in liver. These findings raise the possibility that FGF21 can be used to extend lifespan in other species. DOI: http://dx.doi.org/10.7554/eLife.00065.001

A multicenter phase II study of cisplatin, pemetrexed, and bevacizumab in patients with advanced malignant mesothelioma.

Dowell JE, Dunphy FR, Taub RN, Gerber DE, Ngov L, Yan J, Xie Y, Kindler HL.
September 2012 Lung Cancer Volume 77, Issue 3, Pages 567-571

Abstract

Introduction

Malignant mesothelioma (MM) cells express the vascular endothelial growth factor (VEGF) receptor, and VEGF protein expression is detected in a majority of human mesothelioma biopsy specimens. Bevacizumab is a recombinant humanized monoclonal antibody that blocks the binding of VEGF to its receptor. We evaluated the addition of bevacizumab to cisplatin and pemetrexed as first-line treatment in patients with advanced, unresectable MM.

Methods

Previously untreated MM patients with advanced, unresectable disease received cisplatin (75 mg/m²), pemetrexed (500 mg/m²), and bevacizumab (15 mg/kg) intravenously every 21 days for a maximum of 6 cycles. Patients with responsive or stable disease received bevacizumab (15 mg/kg) intravenously every 21 days until progression or intolerance. The primary endpoint was progression-free survival rate at 6 months.

Results

53 patients were enrolled at 4 centers; 52 were evaluable for this analysis. The progression-free survival rate at 6 months was 56% and the median progression-free survival was 6.9 months (95% confidence interval [CI], 5.3-7.8 months). The partial response rate was 40% and 35% of patients had stable disease. Median overall survival was 14.8 months (95% CI, 10.0-17.0 months). Grade 3/4 toxicities included neutropenia in 11%, hypertension in 6%, and venous thromboembolism in 13% of patients.

Conclusion

This trial evaluating the addition of bevacizumab to cisplatin and pemetrexed in patients with previously untreated, advanced MM failed to meet the primary endpoint of a 33% improvement in the progression-free survival rate at 6 months compared with historical controls treated with cisplatin and pemetrexed alone.

Socioeconomic disparities in lung cancer treatment and outcomes persist within a single academic medical center.

Yorio JT, Yan J, Xie Y, Gerber DE.
November 2012 Clinical Lung Cancer Volume 13, Issue 6, Pages 448-457

Abstract

Background

Socioeconomic disparities in treatment and outcomes of non-small-cell lung cancer (NSCLC) are well established. To explore whether these differences are secondary to individual or institutional characteristics, we examined treatment selection and outcome in a diverse population treated at a single medical center.

Patients and Methods

We performed a retrospective analysis of consecutive patients diagnosed with NSCLC stages I-III from 2000 to 2005 at the University of Texas Southwestern Medical Center. Treatment selection was dichotomized as 'standard' (surgery for stage I-II; surgery and/or radiation therapy for stage III) or 'other.' Associations between patient characteristics (including socioeconomic status) and treatment selection were examined using logistic regression; associations between characteristics and overall survival were examined using Cox regression models and Kaplan-Meier survival analysis.

Results

A total of 450 patients were included. Twenty-eight percent of patients had private insurance, 43% had Medicare, and 29% had an indigent care plan. The likelihood of receiving 'standard' therapy was significantly associated with insurance type (indigent plan versus private insurance odds ratio [OR] 0.13, 95% confidence interval [CI] 0.04, 0.43 for stage I-II; OR 0.38, 95% CI 0.14, 1.00 for stage III). For patients with stage I-II NSCLC, survival was associated with age, sex, insurance type (indigent plan versus private insurance hazard ratio for death 1.98; 95% CI 1.16, 3.37), stage, and treatment selection. In stage III NSCLC, survival was associated with treatment selection.

Conclusion

Within a single academic medical center, socioeconomically disadvantaged patients with stage I-III NSCLC are less likely to receive 'standard' therapy. Socioeconomically disadvantaged patients with stage I-II NSCLC have inferior survival independent of therapy.
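
The Kaplan-Meier survival analysis mentioned in the methods above can be sketched in a few lines. This is a minimal, textbook-style estimator written for illustration only; the function name `kaplan_meier` and the follow-up times and event flags below are made up and are not data from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    `times` are follow-up durations; `events` are 1 if the death was
    observed at that time and 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the fraction of at-risk patients surviving past t.
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up in months; 1 = death observed, 0 = censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Patients censored at a given time are counted as still at risk at that time, which is the usual convention. Comparing groups (for example by insurance type) would mean computing one such curve per group and then testing the difference, which this sketch does not do.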

Cell-free formation of RNA granules: bound RNAs identify features and components of cellular assemblies.

Han TW, Kato M, Xie S, Wu LC, Mirzaei H, Pei J, Chen M, Xie Y, Allen J, Xiao G, McKnight SL.
May 2012 Cell Volume 149, Issue 4, Pages 768-779

Abstract

Cellular granules lacking boundary membranes harbor RNAs and their associated proteins and play diverse roles controlling the timing and location of protein synthesis. Formation of such granules was emulated by treatment of mouse brain extracts and human cell lysates with a biotinylated isoxazole (b-isox) chemical. Deep sequencing of the associated RNAs revealed an enrichment for mRNAs known to be recruited to neuronal granules used for dendritic transport and localized translation at synapses. Precipitated mRNAs contain extended 3' UTR sequences and an enrichment in binding sites for known granule-associated proteins. Hydrogels composed of the low complexity (LC) sequence domain of FUS recruited and retained the same mRNAs as were selectively precipitated by the b-isox chemical. Phosphorylation of the LC domain of FUS prevented hydrogel retention, offering a conceptual means of dynamic, signal-dependent control of RNA granule assembly.

A lung cancer molecular prognostic test ready for prime time.

Xie Y, Minna JD.
March 2012 The Lancet, Volume 379, Issue 9819, Pages 785-787

Development of methods for quantitative comparison of pooled shRNAs by mass sequencing.

Hoshiyama H, Tang J, Batten K, Xiao G, Rouillard JM, Shay JW, Xie Y, Wright WE.
February 2012 J Biomol Screen. 17(2):258-65. doi: 10.1177/1087057111423101.

Abstract

Pooled short-hairpin RNA (shRNA) library screening is a powerful tool for identifying a set of genes in biological pathways that require stable expression to produce a desired phenotype. Massive parallel sequencing of half-hairpins has proven highly variable and has not given satisfactory results concerning the relative abundance of different shRNAs before and after selection. Here, the authors describe a method for quantitative comparison of half-hairpins from pooled shRNAs in the mir30-based pGIPZ vector that is analyzed by massive parallel sequencing. Introducing a multiplexing code and refining the sample preparation scheme resulted in the predicted ability to detect twofold enrichments. These improvements should permit half-hairpin sequencing to analyze either dropout screens or selective pooled shRNA screens of limited stringency to analyze phenotypes not accessible in transient experiments.

Incidence of unanticipated difficult airway in obstetric patients in a teaching institution.

Tao W, Edwards JT, Tu F, Xie Y, Sharma SK.
January 2012 Journal of Anesthesia, Volume 26, Issue 3, pp 339–345

Abstract

Purpose

Our aim was to determine the incidence of difficult intubation during pregnancy-related surgery at a high-risk, high-volume teaching institution.

Methods

Airway experience was analyzed among patients who had pregnancy-related surgery under general anesthesia from January 2001 through February 2006. A difficult airway was defined as needing three or more direct laryngoscopy (DL) attempts, use of the additional airway equipment after the DL attempts, or conversion to regional anesthesia due to inability to intubate. Airway characteristics were compared between patients with and without a difficult airway. In addition, pre- and postoperative airway evaluations were compared to identify factors closely related to changes from pregnancy.

Results

In a total of 30,766 operations, 2,158 (7%) were performed with general anesthesia. Among these, 1,026 (47.5%) were for emergency cesarean delivery (CD), 610 (28.3%) for nonemergency CD, and 522 (24.2%) for non-CD procedures. A total of 12 patients (0.56%) were identified as having a difficult airway. Four patients were intubated with further DL attempts; others required mask ventilation and other airway equipment. Two patients were ventilated through a laryngeal mask airway without further intubation attempts. Ten of the 12 difficult airway cases were encountered by residents during their first year of clinical anesthesia training. There were no maternal or fetal complications except one possible aspiration.

Conclusion

Unanticipated difficult airways accounted for 0.56% of all pregnancy-related surgical patients. More than 99.9% of all obstetric patients could be intubated. A difficult airway is more likely to be encountered by anesthesia providers with <1 year of experience. Proper use of airway equipment may help secure the obstetric airway or provide adequate ventilation. Emergency CD did not add an additional level of difficulty over nonemergency CD.

Comparing statistical methods for constructing large scale gene networks.

Allen JD, Xie Y, Chen M, Girard L, Xiao G.
January 2012 PLoS One. 7(1):e29348. doi: 10.1371/journal.pone.0029348.

Abstract

The gene regulatory network (GRN) reveals the regulatory relationships among genes and can provide a systematic understanding of molecular mechanisms underlying biological processes. The importance of computer simulations in understanding cellular processes is now widely accepted; a variety of algorithms have been developed to study these biological networks. The goal of this study is to provide a comprehensive evaluation and a practical guide to aid in choosing statistical methods for constructing large scale GRNs. Using both simulation studies and a real application in E. coli data, we compare different methods in terms of sensitivity and specificity in identifying the true connections and the hub genes, the ease of use, and computational speed. Our results show that these algorithms performed reasonably well, and each method has its own advantages: (1) GeneNet, WGCNA (Weighted Correlation Network Analysis), and ARACNE (Algorithm for the Reconstruction of Accurate Cellular Networks) performed well in constructing the global network structure; (2) GeneNet and SPACE (Sparse PArtial Correlation Estimation) performed well in identifying a few connections with high specificity.
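
To make the evaluation criteria concrete, the sketch below scores a predicted network against a known one by sensitivity and specificity over all possible gene pairs. It is a simplified illustration under the assumption of undirected edges; the function name `edge_metrics` and the four-gene example are hypothetical and do not come from the paper.

```python
from itertools import combinations


def edge_metrics(genes, true_edges, predicted_edges):
    """Sensitivity and specificity of predicted undirected edges.

    Every unordered gene pair counts as one decision: an edge that is
    either present or absent in the true network.
    """
    true_set = {frozenset(e) for e in true_edges}
    pred_set = {frozenset(e) for e in predicted_edges}
    all_pairs = {frozenset(p) for p in combinations(genes, 2)}

    tp = len(true_set & pred_set)              # true edges recovered
    fn = len(true_set - pred_set)              # true edges missed
    fp = len(pred_set - true_set)              # spurious edges
    tn = len(all_pairs - true_set - pred_set)  # correctly absent edges

    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Hypothetical four-gene network: two of three true edges are recovered
# and one spurious edge (A-D) is predicted.
genes = ["A", "B", "C", "D"]
true_edges = [("A", "B"), ("B", "C"), ("C", "D")]
predicted_edges = [("A", "B"), ("B", "C"), ("A", "D")]
sens, spec = edge_metrics(genes, true_edges, predicted_edges)
```

In large sparse networks the number of absent edges dwarfs the number of true edges, so specificity can stay close to 1 even when many spurious edges are predicted; this is one reason both measures are reported together.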

Probe mapping across multiple microarray platforms.

Allen JD, Wang S, Chen M, Girard L, Minna JD, Xie Y, Xiao G.
September 2012 Briefings in Bioinformatics, Volume 13, Issue 5, Pages 547–554

Abstract

Access to gene expression data has become increasingly common in recent years; however, analysis has become more difficult as it is often desirable to integrate data from different platforms. Probe mapping across microarray platforms is the first and most crucial step for data integration. In this article, we systematically review and compare different approaches to map probes across seven platforms from different vendors: U95A, U133A and U133 Plus 2.0 from Affymetrix, Inc.; HT-12 v1, HT-12v2 and HT-12v3 from Illumina, Inc.; and 4112A from Agilent, Inc. We use a unique data set, which contains 56 lung cancer cell line samples, each of which has been measured by two different microarray platforms, to evaluate the consistency of expression measurement across platforms using different approaches. Based on the evaluation from the empirical data set, the BLAST alignment of the probe sequences to a recent revision of the transcriptome generated better results than using annotations provided by vendors or from Bioconductor's Annotate package. However, a combination of all three methods (deemed the 'Consensus Annotation') yielded the most consistent expression measurement across platforms. To facilitate data integration across microarray platforms for the research community, we develop a user-friendly web-based tool, an API and an R package to map data across different microarray platforms from Affymetrix, Illumina and Agilent. Information on all three can be found at http://qbrc.swmed.edu/software/probemapper/.

SMAC mimetic (JP1201) sensitizes non-small cell lung cancers to multiple chemotherapy agents in an IAP-dependent but TNF-α-independent manner.

Greer RM, Peyton M, Larsen JE, Girard L, Xie Y, Gazdar AF, Harran P, Wang L, Brekken RA, Wang X, Minna JD.
December 2011 Cancer Research, Volume 71, Issue 24

Abstract

Inhibitors of apoptosis proteins (IAP) are key regulators of apoptosis and are inhibited by the second mitochondrial activator of caspases (SMAC). Previously, a small subset of TNF-α-expressing non-small cell lung cancers (NSCLC) was found to be sensitive to SMAC mimetics alone. In this study, we determined if a SMAC mimetic (JP1201) could sensitize nonresponsive NSCLC cell lines to standard chemotherapy. We found that JP1201 sensitized NSCLCs to doxorubicin, erlotinib, gemcitabine, paclitaxel, vinorelbine, and the combination of carboplatin with paclitaxel in a synergistic manner at clinically achievable drug concentrations. Sensitization did not occur with platinum alone. Furthermore, sensitization was specific for tumor compared with normal lung epithelial cells, increased in NSCLCs harvested after chemotherapy treatment, and did not induce TNF-α secretion. Sensitization also was enhanced in vivo with increased tumor inhibition and increased survival of mice carrying xenografts. These effects were accompanied by caspase 3, 4, and 9 activation, indicating that both mitochondrial and endoplasmic reticulum stress-induced apoptotic pathways are activated by the combination of vinorelbine and JP1201. Chemotherapies that induce cell death through the mitochondrial pathway required only inhibition of X-linked IAP (XIAP) for sensitization, whereas chemotherapies that induce cell death through multiple apoptotic pathways required inhibition of cIAP1, cIAP2, and XIAP. Therefore, the data suggest that IAP-targeted therapy using a SMAC mimetic provides a new therapeutic strategy for synergistic sensitization of NSCLCs to standard chemotherapy agents, which seems to occur independently of TNF-α secretion.

Image-based genome-wide siRNA screen identifies selective autophagy factors.

Orvedahl A, Sumpter R Jr, Xiao G, Ng A, Zou Z, Tang Y, Narimatsu M, Gilpin C, Sun Q, Roth M, Forst CV, Wrana JL, Zhang YE, Luby-Phelps K, Xavier RJ, Xie Y, Levine B.
December 2011 Nature 480, 113–117

Abstract

Selective autophagy involves the recognition and targeting of specific cargo, such as damaged organelles, misfolded proteins, or invading pathogens for lysosomal destruction. Yeast genetic screens have identified proteins required for different forms of selective autophagy, including cytoplasm-to-vacuole targeting, pexophagy and mitophagy, and mammalian genetic screens have identified proteins required for autophagy regulation. However, there have been no systematic approaches to identify molecular determinants of selective autophagy in mammalian cells. Here, to identify mammalian genes required for selective autophagy, we performed a high-content, image-based, genome-wide small interfering RNA screen to detect genes required for the colocalization of Sindbis virus capsid protein with autophagolysosomes. We identified 141 candidate genes required for viral autophagy, which were enriched for cellular pathways related to messenger RNA processing, interferon signalling, vesicle trafficking, cytoskeletal motor function and metabolism. Ninety-six of these genes were also required for Parkin-mediated mitophagy, indicating that common molecular determinants may be involved in autophagic targeting of viral nucleocapsids and autophagic targeting of damaged mitochondria. Murine embryonic fibroblasts lacking one of these gene products, the C2-domain containing protein, SMURF1, are deficient in the autophagosomal targeting of Sindbis and herpes simplex viruses and in the clearance of damaged mitochondria. Moreover, SMURF1-deficient mice accumulate damaged mitochondria in the heart, brain and liver. Thus, our study identifies candidate determinants of selective autophagy, and defines SMURF1 as a newly recognized mediator of both viral autophagy and mitophagy.

Robust gene expression signature from formalin-fixed paraffin-embedded samples predicts prognosis of non-small-cell lung cancer patients.

Xie Y, Xiao G, Coombes KR, Behrens C, Solis LM, Raso G, Girard L, Erickson HS, Roth J, Heymach JV, Moran C, Danenberg K, Minna JD, Wistuba II.
September 2011 Clinical Cancer Research, Volume 17, Issue 17

Abstract

Purpose

The requirement of frozen tissues for microarray experiments limits the clinical usage of genome-wide expression profiling by using microarray technology. The goal of this study is to test the feasibility of developing lung cancer prognosis gene signatures by using genome-wide expression profiling of formalin-fixed paraffin-embedded (FFPE) samples, which are widely available and provide a valuable rich source for studying the association of molecular changes in cancer and associated clinical outcomes.

Experimental Design

We randomly selected 100 non-small-cell lung cancer (NSCLC) FFPE samples with annotated clinical information from the UT-Lung SPORE Tissue Bank. We microdissected the tumor area from FFPE specimens and used Affymetrix U133 Plus 2.0 arrays to obtain gene expression data. After strict quality control and analysis procedures, a supervised principal component analysis was used to develop a robust prognosis signature for NSCLC. Three independent published microarray datasets were used to validate the prognosis model.

Results

This study showed that the robust gene signature derived from genome-wide expression profiling of FFPE samples is strongly associated with lung cancer clinical outcomes and can be used to refine the prognosis for stage I lung cancer patients, and the prognostic signature is independent of clinical variables. This signature was validated in several independent studies and was refined to a 59-gene lung cancer prognosis signature.

Conclusion

We conclude that genome-wide profiling of FFPE lung cancer samples can identify a set of genes whose expression level provides prognostic information across different platforms and studies, which will allow its application in clinical settings.

Knockdown of oncogenic KRAS in non-small cell lung cancers suppresses tumor growth and sensitizes tumor cells to targeted therapy.

Sunaga N, Shames DS, Girard L, Peyton M, Larsen JE, Imai H, Soh J, Sato M, Yanagitani N, Kaira K, Xie Y, Gazdar AF, Mori M, Minna JD.
February 2011 Molecular Cancer Therapeutics, Volume 10, Issue 2, Pages 336–346

Abstract

Oncogenic KRAS is found in more than 25% of lung adenocarcinomas, the major histologic subtype of non-small cell lung cancer (NSCLC), and is an important target for drug development. To this end, we generated four NSCLC lines with stable knockdown selective for oncogenic KRAS. As expected, stable knockdown of oncogenic KRAS led to inhibition of in vitro and in vivo tumor growth in the KRAS-mutant NSCLC cells, but not in NSCLC cells that have wild-type KRAS (but mutant NRAS). Surprisingly, we did not see large-scale induction of cell death and the growth inhibitory effect was not complete. To further understand the ability of NSCLCs to grow despite selective removal of mutant KRAS expression, we conducted microarray expression profiling of NSCLC cell lines with or without mutant KRAS knockdown and isogenic human bronchial epithelial cell lines with and without oncogenic KRAS. We found that although the mitogen-activated protein kinase pathway is significantly downregulated after mutant KRAS knockdown, these NSCLCs showed increased levels of phospho-STAT3 and phospho-epidermal growth factor receptor, and variable changes in phospho-Akt. In addition, mutant KRAS knockdown sensitized the NSCLCs to p38 and EGFR inhibitors. Our findings suggest that targeting oncogenic KRAS by itself will not be sufficient treatment, but may offer possibilities of combining anti-KRAS strategies with other targeted drugs.

Predictors and impact of second-line chemotherapy for advanced non-small cell lung cancer in the United States: real-world considerations for maintenance therapy.

Gerber DE, Rasco DW, Le P, Yan J, Dowell JE, Xie Y,
February 2011 Journal of Thoracic Oncology, Volume 6, Issue 2, Pages 365–371

Abstract

Introduction

Recent clinical trials incorporating maintenance chemotherapy into the initial treatment of advanced non-small cell lung cancer (NSCLC) have highlighted the benefits of exposing patients to second-line therapies. We, therefore, determined the predictors and impact of second-line chemotherapy administration in a contemporary, diverse NSCLC population.

Methods

We performed a retrospective analysis of consecutive patients diagnosed with stage IV NSCLC from 2000 to 2007 at clinical facilities associated with the University of Texas Southwestern Medical Center. Demographic, disease, treatment, and outcome data were obtained from hospital tumor registries. The association between these variables was assessed using univariate analysis and multivariate logistic regression.

Results

A total of 406 patients in this cohort received first-line chemotherapy and were included in the analysis. Mean age was 59 years, 28% were women, and 59% were white. Among these patients, 197 (49%) received second-line chemotherapy. Among those patients who had not progressed after four to six cycles of first-line chemotherapy, 67% received second-line chemotherapy. Receipt of second-line chemotherapy was significantly associated with patient insurance type (p = 0.007), number of cycles of first-line chemotherapy (p < 0.001), and receipt of prechemotherapy palliative radiation therapy (p = 0.005) but was not associated with patient age, gender, race, histology, or year of diagnosis. In a multivariate model, second-line chemotherapy administration remained associated with insurance type (p = 0.003), number of cycles of first-line chemotherapy (p < 0.001), and receipt of prechemotherapy palliative radiation therapy (p = 0.008). The number of cycles of first-line chemotherapy and administration of second-line chemotherapy were associated with overall survival in both univariate and multivariate analyses.

Conclusion

In this unselected, contemporary, and diverse cohort of patients with advanced NSCLC, 67% of individuals whose disease had not progressed after four to six cycles of first-line chemotherapy eventually received second-line chemotherapy. Markers of socioeconomic status, symptom burden, and response to and tolerance of first-line chemotherapy were associated with receipt of second-line chemotherapy. These factors may assist in the selection of patients most likely to benefit from maintenance chemotherapy.

A novel approach to DNA copy number data segmentation.

Wang S, Wang Y, Xie Y, Xiao G.
February 2011 J Bioinform Comput Biol. 9(1): 131–148.

Abstract

DNA copy number (DCN) is the number of copies of DNA at a region of a genome. The alterations of DCN are highly associated with the development of different tumors. Recently, microarray technologies are being employed to detect DCN changes at many loci at the same time in tumor samples. The resulting DCN data are often very noisy, and the tumor sample is often contaminated by normal cells. The goal of computational analysis of array-based DCN data is to infer the underlying DCNs from raw DCN data. Previous methods for this task do not model the tumor/normal cell mixture ratio explicitly and they cannot output segments with DCN annotations. We developed a novel model-based method using the minimum description length (MDL) principle for DCN data segmentation. Our new method can output underlying DCN for each chromosomal segment, and at the same time, infer the underlying tumor proportion in the test samples. Empirical results show that our method achieves better accuracies on average as compared to three previous methods, namely Circular Binary Segmentation, Hidden Markov Model and Ultrasome.
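The tumor/normal mixture idea in the abstract above can be illustrated with a small sketch. It is not the MDL-based method from the paper; it only assumes the common linear mixing relationship, in which a sample with tumor proportion p and tumor copy number c at a segment has an average copy number of p·c + (1 − p)·2. The function names and the candidate range of 0 to 8 copies are hypothetical.

```python
import math


def expected_log2_ratio(tumor_copies: int, tumor_fraction: float,
                        normal_copies: int = 2) -> float:
    """Expected log2(sample / diploid reference) for one segment.

    Assumes linear mixing of tumor and normal cells (an illustrative
    assumption, not the model specified in the paper).
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    mixed = tumor_fraction * tumor_copies + (1.0 - tumor_fraction) * normal_copies
    if mixed == 0:
        # Homozygous deletion in a pure tumor sample: no DNA at this locus.
        return float("-inf")
    return math.log2(mixed / normal_copies)


def best_copy_number(observed_log2: float, tumor_fraction: float,
                     candidates=range(0, 9)) -> int:
    """Pick the integer tumor copy number whose expected ratio is closest."""
    return min(
        candidates,
        key=lambda c: abs(observed_log2 - expected_log2_ratio(c, tumor_fraction)),
    )
```

With a tumor proportion of 0.6, a single-copy loss gives an average of 1.4 copies, so the expected log2 ratio is log2(0.7) rather than log2(0.5). This is why ignoring normal-cell contamination tends to underestimate the size of copy number changes.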

Nuclear receptor expression defines a set of prognostic biomarkers for lung cancer.

Jeong Y, Xie Y, Xiao G, Behrens C, Girard L, Wistuba II, Minna JD, Mangelsdorf DJ.
December 2010 PLoS Med. 7(12):e1000378. doi: 10.1371/journal.pmed.1000378.

Abstract

Background

The identification of prognostic tumor biomarkers that also would have potential as therapeutic targets, particularly in patients with early stage disease, has been a long sought-after goal in the management and treatment of lung cancer. The nuclear receptor (NR) superfamily, which is composed of 48 transcription factors that govern complex physiologic and pathophysiologic processes, could represent a unique subset of these biomarkers. In fact, many members of this family are the targets of already identified selective receptor modulators, providing a direct link between individual tumor NR quantitation and selection of therapy. The goal of this study, which begins this overall strategy, was to investigate the association between mRNA expression of the NR superfamily and the clinical outcome for patients with lung cancer, and to test whether a tumor NR gene signature provided useful information (over available clinical data) for patients with lung cancer.

Methods and Findings

Using quantitative real-time PCR to study NR expression in 30 microdissected non-small-cell lung cancers (NSCLCs) and their pair-matched normal lung epithelium, we found great variability in NR expression among patients' tumor and non-involved lung epithelium, found a strong association between NR expression and clinical outcome, and identified an NR gene signature from both normal and tumor tissues that predicted patient survival time and disease recurrence. The NR signature derived from the initial 30 NSCLC samples was validated in two independent microarray datasets derived from 442 and 117 resected lung adenocarcinomas. The NR gene signature was also validated in 130 squamous cell carcinomas. The prognostic signature in tumors could be distilled to expression of two NRs, short heterodimer partner and progesterone receptor, as single gene predictors of NSCLC patient survival time, including for patients with stage I disease. Of equal interest, the studies of microdissected histologically normal epithelium and matched tumors identified expression in normal (but not tumor) epithelium of NGFIB3 and mineralocorticoid receptor as single gene predictors of good prognosis.

Conclusion

NR expression is strongly associated with clinical outcomes for patients with lung cancer, and this expression profile provides a unique prognostic signature for lung cancer patient survival time, particularly for those with early stage disease. This study highlights the potential use of NRs as a rational set of therapeutically tractable genes as theragnostic biomarkers, and specifically identifies short heterodimer partner and progesterone receptor in tumors, and NGFIB3 and MR in non-neoplastic lung epithelium, for future detailed translational study in lung cancer. Please see later in the article for the Editors' Summary.

Aldehyde dehydrogenase activity selects for lung adenocarcinoma stem cells dependent on notch signaling.

Sullivan JP, Spinola M, Dodge M, Raso MG, Behrens C, Gao B, Schuster K, Shao C, Larsen JE, Sullivan LA, Honorio S, Xie Y, Scaglioni PP, DiMaio JM, Gazdar AF, Shay JW, Wistuba II, Minna JD.
December 2010 Cancer Res. 70(23):9937-48. doi: 10.1158/0008-5472.CAN-10-0881.

Abstract

Aldehyde dehydrogenase (ALDH) is a candidate marker for lung cancer cells with stem cell-like properties. Immunohistochemical staining of a large panel of primary non-small cell lung cancer (NSCLC) samples for ALDH1A1, ALDH3A1, and CD133 revealed a significant correlation between ALDH1A1 (but not ALDH3A1 or CD133) expression and poor prognosis in patients including those with stage I and N0 disease. Flow cytometric analysis of a panel of lung cancer cell lines and patient tumors revealed that most NSCLCs contain a subpopulation of cells with elevated ALDH activity, and that this activity is associated with ALDH1A1 expression. Isolated ALDH(+) lung cancer cells were observed to be highly tumorigenic and clonogenic as well as capable of self-renewal compared with their ALDH(-) counterparts. Expression analysis of sorted cells revealed elevated Notch pathway transcript expression in ALDH(+) cells. Suppression of the Notch pathway by treatment with either a γ-secretase inhibitor or stable expression of shRNA against NOTCH3 resulted in a significant decrease in ALDH(+) lung cancer cells, commensurate with a reduction in tumor cell proliferation and clonogenicity. Taken together, these findings indicate that ALDH selects for a subpopulation of self-renewing NSCLC stem-like cells with increased tumorigenic potential, that NSCLCs harboring tumor cells with ALDH1A1 expression have inferior prognosis, and that ALDH1A1 and CD133 identify different tumor subpopulations. Therapeutic targeting of the Notch pathway reduces this ALDH(+) component, implicating Notch signaling in lung cancer stem cell maintenance.

Steroid receptor coactivator-3 expression in lung cancer and its role in the regulation of cancer cell survival and proliferation.

Cai D, Shames DS, Raso MG, Xie Y, Kim YH, Pollack JR, Girard L, Sullivan JP, Gao B, Peyton M, Nanjundan M, Byers L, Heymach J, Mills G, Gazdar AF, Wistuba I, Kodadek T, Minna JD.
August 2010 Cancer Res. 70(16):6477-85. doi: 10.1158/0008-5472.CAN-10-0005.

Abstract

Steroid receptor coactivator-3 (SRC-3) is a histone acetyltransferase and nuclear hormone receptor coactivator, located on 20q12, which is amplified in several epithelial cancers and well studied in breast cancer. However, its possible role in lung cancer pathogenesis is unknown. We found SRC-3 to be overexpressed in 27% of non-small cell lung cancer (NSCLC) patients (n = 311) by immunohistochemistry, which correlated with poor disease-free (P = 0.0015) and overall (P = 0.0008) survival. Twenty-seven percent of NSCLCs exhibited SRC-3 gene amplification, and we found that lung cancer cell lines expressed higher levels of SRC-3 than did immortalized human bronchial epithelial cells (HBEC), which in turn expressed higher levels of SRC-3 than did cultured primary human HBECs. Small interfering RNA-mediated downregulation of SRC-3 in high-expressing, but not in low-expressing, lung cancer cells significantly inhibited tumor cell growth and induced apoptosis. Finally, we found that SRC-3 expression is inversely correlated with gefitinib sensitivity and that SRC-3 knockdown results in epidermal growth factor receptor tyrosine kinase inhibitor-resistant lung cancers becoming more sensitive to gefitinib. Taken together, these data suggest that SRC-3 may be an important oncogene and therapeutic target for lung cancer.

Statistical methods for integrating multiple types of high-throughput data.

Xie Y, Ahn C.
2010 Statistical Methods in Molecular Biology pp 511-529

Abstract

Large-scale sequencing, copy number, mRNA, and protein data hold great promise for biomedical research, while posing great challenges to data management and data analysis. Integrating different types of high-throughput data from diverse sources can increase the statistical power of data analysis and provide deeper biological understanding. This chapter uses two biomedical research examples to illustrate why there is an urgent need to develop reliable and robust methods for integrating heterogeneous data. We then introduce and review some recently developed statistical methods for integrative analysis for both statistical inference and classification purposes. Finally, we present some useful public access databases and program code to facilitate integrative analysis in practice.

Looking beyond surveillance, epidemiology, and end results: patterns of chemotherapy administration for advanced non-small cell lung cancer in a contemporary, diverse population.

Rasco DW, Yan J, Xie Y, Dowell JE, Gerber DE.
October 2010 Journal of Thoracic Oncology, Volume 5, Issue 10, Pages 1529–1535

Abstract

Introduction

Chemotherapy prolongs survival without substantially impairing quality of life for medically fit patients with advanced non-small cell lung cancer (NSCLC), but population-based studies have shown that only 20 to 30% of these patients receive chemotherapy. These earlier studies have relied on Medicare-linked Surveillance, Epidemiology, and End Results (SEER) data, thus excluding the 30 to 35% of lung cancer patients younger than 65 years. Therefore, we determined the use of chemotherapy in a contemporary, diverse NSCLC population encompassing all patient ages.

Methods

We performed a retrospective analysis of patients diagnosed with stage IV NSCLC from 2000 to 2007 at the University of Texas Southwestern Medical Center. Demographic, treatment, and outcome data were obtained from hospital tumor registries. The association between these variables was assessed using univariate analysis and multivariate logistic regression.

Results

In all, 718 patients met criteria for analysis. Mean age was 60 years, 58% were men, and 45% were white. Three hundred fifty-three patients (49%) received chemotherapy. In univariate analysis, receipt of chemotherapy was associated with age (53% of patients younger than 65 years versus 41% of patients aged 65 years and older; p = 0.003) and insurance type (p < 0.001). In a multivariate model, age and insurance type remained associated with receipt of chemotherapy. For individuals receiving chemotherapy, median survival was 9.2 months, compared with 2.3 months for untreated patients (p < 0.001).

Conclusion

In a contemporary population representing the full age range of patients with advanced NSCLC, chemotherapy was administered to approximately half of all patients, more than twice the rate reported in some earlier studies. Patient age and insurance type are associated with receipt of chemotherapy.

A Bayesian approach to joint modeling of protein-DNA binding, gene expression and sequence data.

Xie Y, Pan W, Jeong KS, Xiao G, Khodursky AB.
February 2010 Stat Med. 29(4):489-503.

Abstract

The genome-wide DNA-protein-binding data, DNA sequence data and gene expression data represent complementary means to deciphering global and local transcriptional regulatory circuits. Combining these different types of data can not only improve the statistical power, but also provide a more comprehensive picture of gene regulation. In this paper, we propose a novel statistical model to augment protein-DNA-binding data with gene expression and DNA sequence data when available. We specify a hierarchical Bayes model and use Markov chain Monte Carlo simulations to draw inferences. Both simulation studies and an analysis of an experimental data set show that the proposed joint modeling method can significantly improve the specificity and sensitivity of identifying target genes as compared with conventional approaches relying on a single data source.

Lack of host SPARC enhances vascular function and tumor spread in an orthotopic murine model of pancreatic carcinoma.

Arnold SA, Rivera LB, Miller AF, Carbon JG, Dineen SP, Xie Y, Castrillon DH, Sage EH, Puolakkainen P, Bradshaw AD, Brekken RA.
Jan-Feb 2010 Disease Models & Mechanisms, 3(1-2):57-72.

Abstract

Utilizing subcutaneous tumor models, we previously validated SPARC (secreted protein acidic and rich in cysteine) as a key component of the stromal response, where it regulated tumor size, angiogenesis and extracellular matrix deposition. In the present study, we demonstrate that pancreatic tumors grown orthotopically in Sparc-null (Sparc(-/-)) mice are more metastatic than tumors grown in wild-type (Sparc(+/+)) littermates. Tumors grown in Sparc(-/-) mice display reduced deposition of fibrillar collagens I and III, basement membrane collagen IV and the collagen-associated proteoglycan decorin. In addition, microvessel density and pericyte recruitment are reduced in tumors grown in the absence of host SPARC. However, tumors from Sparc(-/-) mice display increased permeability and perfusion, and a subsequent decrease in hypoxia. Finally, we found that tumors grown in the absence of host SPARC exhibit an increase in alternatively activated macrophages. These results suggest that increased tumor burden in the absence of host SPARC is a consequence of reduced collagen deposition, a disrupted vascular basement membrane, enhanced vascular function and an immune-tolerant, pro-metastatic microenvironment.

Lung cancer diagnostic and treatment intervals in the United States: a health care disparity?

Yorio JT, Xie Y, Yan J, Gerber DE.
November 2009 Journal of Thoracic Oncology, Volume 4, Issue 11, Pages 1322–1330

Abstract

Introduction

Lung cancer diagnostic and treatment delays have been described for several patient populations. However, few studies have analyzed these intervals among patients treated in contemporary health care systems in the United States. We therefore studied the timing of lung cancer diagnosis and treatment at a U.S. medical center providing care to a diverse patient population within two different hospital systems.

Results

A total of 482 patients met criteria for analysis. In univariate analyses, the image-treatment interval was significantly associated with race, age, income, insurance type, and hospital type (76 days for public versus 45 days for private; p < 0.0001). In multivariate analysis, only hospital type remained significantly associated with the image-treatment interval; patients in the private hospital setting were more likely to receive timely treatment (hazard ratio 1.85; 95% confidence interval, 1.37-2.50; p < 0.001). In univariate analysis, the image-treatment interval was not associated with disease stage (p = 0.27) or with survival (p = 0.42).

Conclusion

Intervals between suspicion, diagnosis, and treatment of lung cancer vary widely among patients. Health care system factors, such as hospital type, largely account for these discrepancies. In this study, these intervals do not appear to be associated with clinical outcomes.

The impact of consenter characteristics and experience on patient interest in clinical research.

Rasco DW, Xie Y, Yan J, Sayne JR, Skinner CS, Dowell JE, Gerber DE.
May 2009 The Oncologist, 14(5):468-75.

Abstract

Background

To explain the historically low rates of participation in cancer clinical trials, several factors have been studied. These include subject characteristics and attitudes, clinical trial availability and eligibility criteria, and physician attitudes and communication skills. However, the impact of nonphysician research personnel, who often consent patients for studies, is unclear. We therefore evaluated the association between consenter characteristics and subject interest in clinical research.

Methods

We performed a retrospective review of subjects enrolled in a university-based cancer center tissue repository. During enrollment, subjects were asked if they were willing to be contacted in the future to (a) provide medical follow-up information and (b) participate in other clinical research. We analyzed the association between responses to these questions and consenter characteristics using univariate analysis and multivariate logistic regression.

Results

In total, 181 consenters enrolled 922 subjects. The majority of subjects agreed to be contacted for follow-up (84.9%) and future research (83.1%). Subject willingness to be contacted for future research was associated with greater consenter experience in univariate and multivariate analyses. In multivariate analysis, subject willingness to be contacted for future research was associated with discordance between subject and consenter gender, but not with subject gender, race, or income, or consenter gender or race.

Conclusion

Consenter experience and subject-consenter gender discordance were associated with greater subject interest in participating in future research. The role of consenters in clinical research merits future study and should be considered in efforts to increase cancer clinical trial accrual.

The receptor interacting protein 1 inhibits p53 induction through NF-kappaB activation and confers a worse prognosis in glioblastoma.

Park S, Hatanpaa KJ, Xie Y, Mickey BE, Madden CJ, Raisanen JM, Ramnarain DB, Xiao G, Saha D, Boothman DA, Zhao D, Bachoo RM, Pieper RO, Habib AA.
April 2009 Cancer Res. 69(7):2809-16. doi: 10.1158/0008-5472.CAN-08-4079.

Abstract

Nuclear factor-kappaB (NF-kappaB) activation may play an important role in the pathogenesis of cancer and also in resistance to treatment. Inactivation of the p53 tumor suppressor is a key component of the multistep evolution of most cancers. Links between the NF-kappaB and p53 pathways are under intense investigation. In this study, we show that the receptor interacting protein 1 (RIP1), a central component of the NF-kappaB signaling network, negatively regulates p53 tumor suppressor signaling. Loss of RIP1 from cells results in augmented induction of p53 in response to DNA damage, whereas increased RIP1 level leads to a complete shutdown of DNA damage-induced p53 induction by enhancing levels of cellular mdm2. The key signal generated by RIP1 to up-regulate mdm2 and inhibit p53 is activation of NF-kappaB. The clinical implication of this finding is shown in glioblastoma, the most common primary malignant brain tumor in adults. We show that RIP1 is commonly overexpressed in glioblastoma, but not in grades II and III glioma, and increased expression of RIP1 confers a worse prognosis in glioblastoma. Importantly, RIP1 levels correlate strongly with mdm2 levels in glioblastoma. Our results show a key interaction between the NF-kappaB and p53 pathways that may have implications for the targeted treatment of glioblastoma.

Alterations in genes of the EGFR signaling pathway and their relationship to EGFR tyrosine kinase inhibitor sensitivity in lung cancer cell lines.

Gandhi J, Zhang J, Xie Y, Soh J, Shigematsu H, Zhang W, Yamamoto H, Peyton M, Girard L, Lockwood WW, Lam WL, Varella-Garcia M, Minna JD, Gazdar AF.
2009 PLoS One. 4(2):e4576. doi: 10.1371/journal.pone.0004576.

Abstract

Background

Deregulation of EGFR signaling is common in non-small cell lung cancers (NSCLC) and this finding led to the development of tyrosine kinase inhibitors (TKIs) that are highly effective in a subset of NSCLC. Mutations of EGFR (mEGFR) and copy number gains (CNGs) of EGFR (gEGFR) and HER2 (gHER2) have been reported to predict for TKI response. Mutations in KRAS (mKRAS) are associated with primary resistance to TKIs.

Methodology/Principal Findings

We investigated the relationship between mutations, CNGs and response to TKIs in a large panel of NSCLC cell lines. Genes studied were EGFR, HER2, HER3, HER4, KRAS, BRAF and PIK3CA. Mutations were detected by sequencing, while CNGs were determined by quantitative PCR (qPCR), fluorescence in situ hybridization (FISH) and array comparative genomic hybridization (aCGH). IC50 values for the TKIs gefitinib (Iressa) and erlotinib (Tarceva) were determined by MTS assay. For any of the seven genes tested, mutations (39/77, 50.6%), copy number gains (50/77, 64.9%) or either (65/77, 84.4%) were frequent in NSCLC lines. Mutations of EGFR (13%) and KRAS (24.7%) were frequent, while they were less frequent for the other genes. The three techniques for determining CNG were well correlated, and qPCR data were used for further analyses. CNGs were relatively frequent for EGFR and KRAS in adenocarcinomas. While mutations were largely mutually exclusive, CNGs were not. EGFR and KRAS mutant lines frequently demonstrated mutant allele specific imbalance, i.e., the mutant form was usually in great excess compared to the wild type form. On a molar basis, sensitivities to gefitinib and erlotinib were highly correlated. Multivariate analyses led to the following results: 1. mEGFR, gEGFR and gHER2 were independent factors related to gefitinib sensitivity, in descending order of importance. 2. mKRAS was associated with increased in vitro resistance to gefitinib.

Conclusion/Significance

Our in vitro studies confirm and extend clinical observations and demonstrate the relative importance of both EGFR mutations and CNGs and HER2 CNGs in the sensitivity to TKIs.

Statistical methods of background correction for Illumina BeadArray data.

Xie Y, Wang X, Story M.
March 2009 Bioinformatics, Volume 25, Issue 6, Pages 751–757

Abstract

Motivation

Advances in technology have made different microarray platforms available. Among the many, Illumina BeadArrays are relatively new and have captured significant market share. With BeadArray technology, high data quality is generated from low sample input at reduced cost. However, the analysis methods for Illumina BeadArrays are far behind those for Affymetrix oligonucleotide arrays, and so need to be improved.

Results

In this article, we consider the problem of background correction for BeadArray data. One distinct feature of BeadArrays is that for each array, the noise is controlled by over 1000 bead types conjugated with non-specific oligonucleotide sequences. We extend the robust multi-array analysis (RMA) background correction model to incorporate the information from negative control beads, and consider three commonly used approaches for parameter estimation, namely, non-parametric, maximum likelihood estimation (MLE) and Bayesian estimation. The proposed approaches, as well as the existing background correction methods, are compared through simulation studies and a data example. We find that the maximum likelihood and Bayes methods seem to be the most promising.
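As a rough illustration of the idea described above, the sketch below applies the standard RMA convolution model (observed intensity = exponential signal + normal background noise) and estimates the noise parameters directly from the negative control beads. The abstract does not give the estimators, so the plug-in moment estimates here are an assumption, and the function name `rma_background_correct` is hypothetical. It is closest in spirit to the non-parametric approach and is not the MLE or Bayesian procedure from the article.

```python
from statistics import NormalDist, mean, stdev


def rma_background_correct(intensities, negative_controls):
    """Return E[signal | observed] for each observed intensity.

    Model: observed = signal + noise, with signal ~ Exponential(rate)
    and noise ~ Normal(mu, sigma^2). Under this model the posterior of
    the signal is a normal distribution truncated at zero.
    """
    # Assumed plug-in estimates: noise mean and SD from negative controls.
    mu = mean(negative_controls)
    sigma = stdev(negative_controls)
    # Mean signal estimated as the overall mean intensity minus the noise mean.
    signal_mean = mean(intensities) - mu
    if sigma <= 0 or signal_mean <= 0:
        raise ValueError("cannot estimate model parameters from these data")
    rate = 1.0 / signal_mean

    std_normal = NormalDist()  # standard normal, for pdf and cdf
    corrected = []
    for observed in intensities:
        a = observed - mu - sigma ** 2 * rate
        z = a / sigma
        # Mean of Normal(a, sigma^2) truncated to positive values.
        # Note: for intensities far below mu, cdf(z) can underflow to zero.
        corrected.append(a + sigma * std_normal.pdf(z) / std_normal.cdf(z))
    return corrected
```

Because the corrected value is the mean of a distribution truncated at zero, it stays positive even when the observed intensity falls below the estimated background, which avoids the negative values produced by simple background subtraction.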

Supplementary Information

Supplementary data are available at Bioinformatics online.

Histone deacetylase inhibitor romidepsin enhances anti-tumor effect of erlotinib in non-small cell lung cancer (NSCLC) cell lines.

Zhang W, Peyton M, Xie Y, Soh J, Minna JD, Gazdar AF, Frenkel EP.
February 2009 Journal of Thoracic Oncology, Volume 4, Issue 2, Pages 161–166

Abstract

Introduction

Most epidermal growth factor receptor (EGFR) mutant non-small cell lung cancers (NSCLCs) are sensitive to EGFR tyrosine kinase inhibitors (TKIs) such as erlotinib or gefitinib, but many EGFR wild type NSCLCs are resistant to TKIs. In this study, we examined the effects of the histone deacetylase inhibitor, romidepsin, in combination with erlotinib, in NSCLC cell lines and xenografts.

Methods

For in vitro studies, nine NSCLC cell lines with varying mutation status and histology were treated with erlotinib and romidepsin alone or in combination. 3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium assays were performed to determine the concentration that inhibits 50% (IC50) value of each drug or the combination. For in vivo studies, NCI-H1299 xenografts were inoculated subcutaneously into athymic nude mice. Romidepsin and/or erlotinib were injected intraperitoneally after tumors developed and tumor sizes were measured.

Results

We found that romidepsin increased the sensitivity of erlotinib synergistically in all nine NSCLC cell lines including EGFR and KRAS wild type cell lines, KRAS mutant cell lines, and TKI resistant EGFR mutant cell lines. This effect was partially due to enhanced apoptosis. Furthermore, cotreatment of erlotinib and romidepsin inhibited NCI-H1299 xenograft growth in athymic nude mice.

Conclusion

These observations support a role for the combination of a histone deacetylase inhibitor and a TKI in the treatment of NSCLCs.

Cytoglobin, the newest member of the globin family, functions as a tumor suppressor gene.

Shivapurkar N, Stastny V, Okumura N, Girard L, Xie Y, Prinsen C, Thunnissen FB, Wistuba II, Czerniak B, Frenkel E, Roth JA, Liloglou T, Xinarianos G, Field JK, Minna JD, Gazdar AF.
September 2008 Cancer Res. 68(18):7448-56. doi: 10.1158/0008-5472.

Abstract

Cytoglobin (CYGB) is a recently discovered vertebrate globin distantly related to myoglobin with unknown function. CYGB is assigned to chromosomal region 17q25, which is frequently lost in multiple malignancies. Previous studies failed to detect evidence for mutations in the CYGB gene. Recent studies provided preliminary evidence for increased methylation of the gene in lung cancer. Our study was aimed at investigating the role of CYGB as a tumor suppressor gene. By nested methylation-specific DNA sequencing analysis of lung and breast cancer cell lines and bronchial and mammary epithelial cell lines, we identified that methylation of a 110-bp CpG-rich segment of the CYGB promoter was correlated with gene silencing. We specifically targeted this sequence and developed a quantitative methylation-specific PCR assay, suitable for high-throughput analysis. We showed the tumor specificity of CYGB methylation in discriminating patients with and without lung cancer, using biopsies and sputum samples. We further showed the tumor specificity of this assay with multiple other epithelial and hematologic malignancies. To show tumor suppressor activity of CYGB, we performed the following: (a) RNA interference-mediated knockdown of CYGB gene on colony formation in a CYGB expression-positive lung cancer cell line, resulting in increased colony formation; (b) enforced gene expression in CYGB expression-negative lung and breast cancer cell lines, reducing colony formation; and (c) identification of potential proximate targets down-stream of the CYGB genes. Our data constitute the first direct functional evidence for CYGB, the newest member of the globin family, as a tumor suppressor gene.

Enhanced identification and biological validation of differential gene expression via Illumina whole-genome expression arrays through the use of the model-based background correction methodology.

Ding LH, Xie Y, Park S, Xiao G, Story MD.
June 2008 Nucleic Acids Research, Volume 36, Issue 10, Page e58

Abstract

Despite the tremendous growth of microarray usage in scientific studies, there is a lack of standards for background correction methodologies, especially in single-color microarray platforms. Traditional background subtraction methods often generate negative signals and thus cause large amounts of data loss. Hence, some researchers prefer to avoid background corrections, which typically result in the underestimation of differential expression. Here, by utilizing nonspecific negative control features integrated into Illumina whole genome expression arrays, we have developed a method of model-based background correction for BeadArrays (MBCB). We compared the MBCB with a method adapted from the Affymetrix robust multi-array analysis algorithm and with no background subtraction, using a mouse acute myeloid leukemia (AML) dataset. We demonstrated that differential expression ratios obtained by using the MBCB had the best correlation with quantitative RT-PCR. MBCB also achieved better sensitivity in detecting differentially expressed genes with biological significance. For example, we demonstrated that the differential regulation of Tnfr2, Ikk and NF-kappaB, the death receptor pathway, in the AML samples, could only be detected by using data after MBCB implementation. We conclude that MBCB is a robust background correction method that will lead to more precise determination of gene expression and better biological interpretation of Illumina BeadArray data.

Differential methylation of a short CpG-rich sequence within exon 1 of TCF21 gene: a promising cancer biomarker assay.

Shivapurkar N, Stastny V, Xie Y, Prinsen C, Frenkel E, Czerniak B, Thunnissen FB, Minna JD, Gazdar AF.
April 2008 Cancer Epidemiol Biomarkers Prev. 17(4):995-1000. doi: 10.1158/1055-9965.

Abstract

Detection of cancer cells at early stages could potentially increase survival rates in cancer patients. Aberrant promoter hypermethylation is a major mechanism for silencing tumor suppressor genes in many kinds of human cancers. A recent report from our laboratory described the use of quantitative methylation-specific PCR assays for discriminating patients with lung cancer from those without lung cancer using lung biopsies as well as sputum samples. TCF21 is known to be essential for differentiation of epithelial cells adjacent to mesenchyme. Using restriction landmark genomic scanning, a recent study identified TCF21 as candidate tumor suppressor at 6q23-q24 that is epigenetically inactivated in lung and head and neck cancers. Using DNA sequencing technique, we narrowed down a short CpG-rich segment (eight specific CpG sites in the CpG island within exon 1) of the TCF21 gene, which was unmethylated in normal lung epithelial cells but predominantly methylated in lung cancer cell lines. We specifically targeted this short CpG-rich sequence and developed a quantitative methylation-specific PCR assay suitable for high-throughput analysis. We showed the usefulness of this assay in discriminating patients with lung cancer from those without lung cancer using biopsies and sputum samples. We further showed similar applications with multiple other malignancies. Our assay might have important implications in early detection and surveillance of multiple malignancies.

Software

We have developed online analysis tools that allow users to explore and analyze gene expression data for lung cancer and germ cell tumors. PIPECLIP Galaxy is also provided to help biologists identify the most likely cross-linking sites.

Online Software



Lung Cancer Explorer

The Lung Cancer Explorer is an online analysis tool which allows users to explore and analyze gene expression data from dozens of public lung cancer datasets.


PIPECLIP Galaxy

PIPECLIP provides a pipeline for both bioinformaticians and biologists to identify the most likely cross-linking sites from PAR-CLIP, HITS-CLIP and iCLIP sequencing data.


Germ Cell Tumor Explorer

Germ Cell Tumor Explorer is an online analysis tool which allows users to explore and analyze gene expression data from dozens of public Germ Cell Tumor datasets.


Software Packages


HITS-CLIP Analysis

We developed a model-based approach to detect RNA-protein binding sites in HITS-CLIP data. The two-stage model is fitted to all the sequencing reads to investigate binding sites at single base-pair resolution. This toolbox provides the essential MATLAB functions to implement our model, which identifies binding sites using heterogeneous logit models via semi-supervised learning.

PAR-CLIP HMM

Photoactivatable ribonucleoside enhanced cross-linking immunoprecipitation (PAR-CLIP) has been increasingly used for the global mapping of RNA-protein interaction sites. This package provides an integrative model to establish a joint distribution of read and mutation counts. To pinpoint the interaction sites at single base-pair resolution, we adopt non-homogeneous hidden Markov models that incorporate the nucleotide sequence.

dCLIP

dCLIP is written in Perl for discovering differential binding regions in two CLIP-Seq (HITS-CLIP or PAR-CLIP) experiments. It is appropriate for experiments where the common binding regions that are significantly enriched in both conditions tend to have similar binding strength, and where researchers are more interested in the difference in binding strength than in the binary question of whether a binding site is common or not.

Bayesian Joint Analysis

Identifying which genes are differentially expressed (DE) and which gene sets are altered under two experimental conditions are both key questions in microarray analysis. This package implements a Bayesian joint modeling approach that addresses the two questions in parallel: it incorporates functional annotation information into expression data analysis while inferring the enrichment of functional groups.
Reference: Wang X, Chen M, Khodursky AB and Xiao G, Bayesian Joint Analysis of Gene Expression Data and Gene Functional Annotations, Statistics in Biosciences. 2012 Nov; 4(2): 300-318

DecoRNAi

High-throughput RNAi screening has been widely used in a spectrum of biomedical research and made it possible to study functional genomics. However, a challenge for authentic biological interpretation of large-scale siRNA or shRNA-mediated loss-of-function studies is the biological pleiotropy resulting from multiple modes of action of siRNA and shRNA reagents. A major confounding feature of these reagents is the microRNA-like translational quelling that can result from short regions (~6 nucleotides) of oligonucleotide complementarity to many different mRNAs. To help identify and correct miRNA-mimic off-target effects, we have developed DecoRNAi (deconvolution analysis of RNAi screening data) for automated quantitation and annotation of microRNA-like off-target effects in primary RNAi screening data sets. DecoRNAi can effectively identify and correct off-target effects from primary screening data and provide data visualization for study and publication. DecoRNAi contains pre-computed seed sequence families for 3 commonly employed commercial siRNA libraries. For custom collections, the tool will compute seed sequence membership from a user-supplied reagent sequence table. All parameters are tunable and output files include global data visualization, the identified seed family associations, the siRNA pools containing off-target seed families, corrected z-scores and the potential miRNAs with phenotypes of interest.
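
As an illustration of the seed family idea described above, the following is a minimal Python sketch of grouping reagents by a shared seed sequence. It is not DecoRNAi code: the function name `seed_families`, the choice of positions 2-7 of the supplied strand as the 6-nucleotide seed, and the example sequences are all assumptions made for this sketch.

```python
from collections import defaultdict


def seed_families(reagents, start=1, length=6):
    """Group reagent names by seed sequence.

    `reagents` maps a reagent name to its sequence (5' to 3').
    The seed is taken as `length` nucleotides beginning at 0-based
    index `start` (positions 2-7 by default, an assumed convention).
    """
    families = defaultdict(list)
    for name, seq in reagents.items():
        # Normalize case and write DNA-style T as RNA-style U.
        normalized = seq.upper().replace("T", "U")
        seed = normalized[start:start + length]
        families[seed].append(name)
    return dict(families)


# Hypothetical reagent table: si1 and si2 share a seed, si3 does not.
example_reagents = {
    "si1": "UAGCUUAUCAGACUGAUGU",
    "si2": "CAGCUUAGGGACUGAUGUU",
    "si3": "UUGGCAAUCAGACUGAUGU",
}
example_families = seed_families(example_reagents)
```

Reagents that land in the same family are candidates for sharing a microRNA-like off-target effect, so a phenotype that tracks with a family rather than with a single target gene deserves a closer look.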

Probemapper

Connects to QBRC’s EntrezToProbe engine system to handle mappings between probes and genes and provide access to information about probes and genes.
References: Allen JD, Wang S, Chen M, Girard L, Minna J, Xie Y, Xiao G*. Probe mapping across multiple microarray platforms, Briefings in Bioinformatics, 2012 Sep;13(5):547-54. doi: 10.1093/bib/bbr076. PMID: 22199380

SbacHTS

Genome-wide RNAi screening experiments are customarily carried out on hundreds of 96-well or 384-well plates in order to study gene functions and discover novel drug targets. Spatial background noise, however, often blurs the interpretation of experimental results by distorting the distinct spatial patterns between different plates. It is therefore important to identify and correct spatial background noise when analyzing RNAi screening data. Here, we developed an algorithm, SbacHTS (Spatial background correction for High-Throughput RNAi Screening), for visualization, estimation and correction of spatial background noise in RNAi screening results. SbacHTS can effectively detect and correct spatial background noise, leading to a higher signal-to-noise ratio and improved hit discovery in RNAi screening experiments. The only input required by the algorithm is the raw reads from the replicate plates.

MBCB

This package provides a model-based background correction method, which incorporates the negative control beads to pre-process Illumina BeadArray data.
References: Xie Y, Wang X, Story M. Statistical Methods of Background Correction for Illumina BeadArray. Bioinformatics, 2009, Mar 15;25(6):751-7. doi: 10.1093/bioinformatics/btp040. PMID: 19193732
Allen JD, Chen M, Xie Y (2009) Model-Based Background Correction (MBCB): R methods and GUI for Illumina Bead-array Data. J Canc Sci Ther 1: 025-027. doi:10.4172/1948-5956.1000004
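
To make the idea concrete, the following is a minimal Python sketch of a normal-exponential background correction in which the noise parameters come from the negative control beads. It is an illustration under stated assumptions, not the MBCB package code: the function name `normexp_correct`, the moment-style estimates (noise mean and standard deviation taken from the controls, signal mean taken as the average intensity minus the noise mean), and the example numbers are all hypothetical. The model assumes observed intensity = signal + noise, with exponential signal and normal noise, and reports the expected signal given the observed intensity.

```python
import math
import statistics


def _normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_cdf(z):
    # erfc keeps precision in the lower tail, where 1 + erf(z) would round to 0.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def normexp_correct(intensities, negative_controls):
    """Return E[signal | observed] for each observed intensity.

    Assumed estimates (illustrative only):
      mu, sigma : mean and standard deviation of the negative controls
      alpha     : mean signal = mean(intensities) - mu
    """
    mu = statistics.fmean(negative_controls)
    sigma = statistics.stdev(negative_controls)
    alpha = max(statistics.fmean(intensities) - mu, 1e-8)

    corrected = []
    for x in intensities:
        # Given x, the signal follows a normal(a, sigma) truncated at zero.
        a = x - mu - sigma ** 2 / alpha
        z = a / sigma
        corrected.append(a + sigma * _normal_pdf(z) / _normal_cdf(z))
    return corrected


# Hypothetical bead intensities and negative control readings.
example_controls = [95.0, 100.0, 105.0, 98.0, 102.0]
example_intensities = [90.0, 100.0, 150.0, 500.0, 2000.0]
example_corrected = normexp_correct(example_intensities, example_controls)
```

Unlike plain subtraction of the control mean, the corrected values stay positive even when an observed intensity falls below the noise mean, so no probes are lost to negative signals.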

Ensemble Network Aggregation (ENA)

Ensemble network aggregation is an approach which leverages the inverse-rank-product (IRP) method to combine networks. This package provides the capabilities to use IRP to bootstrap a dataset using a single method, to aggregate the networks produced by multiple methods, or to aggregate the networks produced on different datasets. Additionally, it offers convenience functions for converting between adjacency lists and matrices, and computing discrete graphs based on the Rank-Product method.
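
The aggregation step can be sketched in a few lines of Python. This is a simplified illustration of combining networks by the inverse of the product of edge ranks, not the ENA package itself: the function names, the requirement that every network scores the same set of edges, and the tie handling (ties broken by sort order) are assumptions made for this sketch.

```python
from math import prod


def rank_edges(scores):
    """Map each edge to its rank within one network (1 = highest score)."""
    ordered = sorted(scores, key=lambda edge: scores[edge], reverse=True)
    return {edge: position + 1 for position, edge in enumerate(ordered)}


def inverse_rank_product(networks):
    """Combine edge scores from several networks.

    `networks` is a list of dicts mapping an edge (a tuple of two node
    names) to a confidence score. Every dict is assumed to contain the
    same edges. Larger output values mean stronger combined support.
    """
    ranked = [rank_edges(network) for network in networks]
    edges = ranked[0].keys()
    return {edge: 1.0 / prod(r[edge] for r in ranked) for edge in edges}


# Two hypothetical networks inferred over the same three edges.
example_networks = [
    {("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "C"): 0.5},
    {("A", "B"): 0.7, ("A", "C"): 0.1, ("B", "C"): 0.8},
]
example_combined = inverse_rank_product(example_networks)
```

Working with ranks rather than raw scores means networks produced by different methods, whose scores are on different scales, can be combined without rescaling.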

Members

Staff

Ling Cai

Data Scientist

Ruichen Rong

Data Scientist

Zhiqun Xie

Computational Biologist I

Bo Yao

Scientific Programmer II

Rong Lu

Biostatistical Consultant III

Lin Zhong

Data Scientist

Donghan Yang

Data Scientist

ShinYi Lin

Scientific Programmer

Kenian Chen

Computational Biologist II

Danni Luo

Scientific Programmer

He Zhang

Computational Biologist III

PhD Students

Bo Ci

PhD

Minzhe Zhang

PhD

Shidan Wang

PhD

Xinyi Zhang

PhD

Alumni

Beibei Chen

Computational Biologist I

Yunyun Zhou

Computational Biologist II

Faliu Yi

Postdoctoral Researcher

Jonghyun Yun

Postdoctoral Researcher (2012-2014)

Tang Hao

Postdoctoral Researcher
Assistant Professor

Donghyeon Yu

Postdoctoral Researcher (2012-2014)

Jungsik Noh

Postdoctoral Researcher (2012-2014)

Tao Wang

Ph.D. Student (2011-2015)

Rui Zhong

Ph.D. Student (2009-2014)

Jichen Yang

Postdoctoral Researcher

Sangin Lee

Postdoctoral Researcher

Gaoxiang Jia

PhD

Qiwei Li

Postdoctoral Researcher

About PI

Yang Xie

Associate Professor & Director, Pediatric Cancer Data Commons, Quantitative Biomedical Research Center, UT Southwestern Medical Center
  •  214-648-5178
  •  214-648-1663
  •  Yang.Xie@utsw.edu
  •  Suite NC8.512, 5323 Harry Hines Blvd. Dallas, TX 75390


Biography


I am the founding director of the Quantitative Biomedical Research Center and the Pediatric Cancer Data Commons (PCDC) at UT Southwestern Medical Center. My training was in biostatistics, medicine and epidemiology. My primary statistical expertise is in integrated analysis of high-dimensional data, preprocessing and analysis of high-throughput data, prediction model building, and validation. My research interests are translational research, medical informatics, developing predictive and prognostic biomarkers, and precision medicine. I also have extensive experience with the design and analysis of clinical trials and epidemiological studies and the development and maintenance of comprehensive databases. I have served as a member of the NIH Biodata Management and Analysis Study Section [BDMA].

In addition, our team has extensive experience in developing and maintaining user-friendly software and comprehensive databases/web portals, including disease-specific web portals with online analytic tools for cancer:
http://lce.biohpc.swmed.edu/lungcancer/
https://qbrc.swmed.edu/projects/kidneyspore/index.php

Academic Position


  • 2015-Present
    Founding Director
    Bioinformatics Core Facility, UTSW Medical Center
  • 2013-Present
    Associate Professor (with Tenure)
    Department of Clinical Sciences, UTSW Medical Center
  • 2011-Present
    Founding Director
    Simmons Cancer Center Bioinformatics Shared Resources, UTSW Medical Center
  • 2010-Present
    Founding Director
    Quantitative Biomedical Research Center, UTSW Medical Center
  • 2006-2013
    Assistant Professor
    Department of Clinical Sciences, UTSW Medical Center

Education & Training


  • PhD 2006
    Biostatistics
    University of Minnesota, Minneapolis, MN, USA
  • MS 2003
    Biostatistics
    University of Minnesota, Minneapolis, MN, USA
  • MS 2000
    Epidemiology
    Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
  • MD 1997
    Medicine
    Peking University Health Science Center, Beijing, China

Awards & Grants


  • 2013
    Best Performing Team (Team leader)
    Our team won the "Best Performing Team" award in "The NIEHS-NCATS-UNC DREAM Toxicogenetics Challenge" (both sub-challenges), NIEHS, NCATS & DREAM organization
  • 2012
    Best Performing Team (Team leader)
    Our team won the "Best Performing Team" award in the "NCI-DREAM Drug Sensitivity Prediction Challenge," National Cancer Institute & DREAM organization
  • 2008
    American Association for Cancer Research (AACR) Cancer Biostatistics Workshop Travel Award
  • 2008
    Scholarship for International Biometric Society Young Researcher Workshop
  • 2008
    Bayesian Conference Travel Award
  • 2006
    Jacob E. Bearman Student Achievement Award
  • 2005-2006
    Doctoral Dissertation Fellowship
    I obtained a Doctoral Dissertation Fellowship from the University of Minnesota Graduate School
  • 1997
    Outstanding Student of Beijing City
  • 1996-1997
    Guanghua Excellent Students Award (First Prize)
  • 1992-1995
    Scholarship for Excellent Student (First Prize)
  • 2012-2015
    Grant Review