We developed DisHet, a Bayesian hierarchical model for dissecting heterogeneous bulk tumors, to evaluate the tumor microenvironment in renal cell carcinoma (RCC). DisHet separates the normal, tumor, and immune/stromal components of RNA-sequencing (RNA-seq) data. DisHet analyses uncovered 610 genes not previously linked to the RCC tumor microenvironment and showed that half of the previously designated immune signature genes are not expressed in the RCC tumor microenvironment. We termed the RCC-specific immune signature genes defined by DisHet analyses eTME. Combined with data from The Cancer Genome Atlas, the DisHet and eTME analyses characterized a highly inflamed RCC subtype (termed IS) that was enriched for regulatory T cells, natural killer cells, Th1 cells, neutrophils, macrophages, B cells, and CD8+ T cells. The IS subtype was associated with aggressive disease, including BAP1-deficient clear-cell RCC and type 2 papillary tumors, and predicted poor survival in patients with RCC. These findings provide a missing link between tumor cells, the tumor microenvironment, and systemic factors.
We developed SCINA, a semi-supervised cell type assignment tool for single-cell RNA-seq and CyTOF/FACS data. What distinguishes SCINA from previously used approaches is that it treats prior knowledge as a form of supervision. The prior knowledge takes the form of a list of signature genes for each cell type. SCINA searches for a partition of the pool of profiled cells such that the cells assigned to each type highly express the signature genes specified by the researcher for that type. Cells that do not highly express any of the signature genes are designated as cells of unknown type. SCINA is also general and can be applied to other data of similar format, such as patient bulk RNA-seq data. In our validation datasets, SCINA outperformed unsupervised approaches such as t-SNE and K-means clustering. Overall, SCINA, which represents a “signature-to-category” approach, addresses a critical research need that has previously been neglected. It also complements traditional unsupervised “category-to-signature” approaches.
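The "signature-to-category" idea can be illustrated with a deliberately simplified Python sketch. It is not SCINA's actual model or interface: the function name `assign_cell_types`, the mean-minus-background score, the `min_score` threshold, and the toy gene signatures are all assumptions made for illustration. Each cell is given the type whose signature genes it expresses most strongly, and cells that express no signature strongly enough are labeled unknown.

```python
from statistics import mean


def assign_cell_types(expression, signatures, min_score=1.0):
    """Assign each cell the type whose signature genes it expresses most highly.

    expression: dict mapping cell id -> {gene: expression value}
    signatures: dict mapping cell type -> list of signature genes
    min_score:  hypothetical threshold; a cell whose best score does not
                exceed it is labeled "unknown"
    """
    assignments = {}
    for cell, profile in expression.items():
        # Use the cell's average expression over all genes as a crude background.
        background = mean(profile.values())
        best_type, best_score = "unknown", min_score
        for cell_type, genes in signatures.items():
            measured = [profile[g] for g in genes if g in profile]
            if not measured:
                continue  # none of this type's signature genes were profiled
            # Score = how far the signature genes sit above the background.
            score = mean(measured) - background
            if score > best_score:
                best_type, best_score = cell_type, score
        assignments[cell] = best_type
    return assignments


# Toy data (made up for illustration only).
expression = {
    "cell1": {"CD3E": 5.0, "CD8A": 4.0, "CD19": 0.0, "MS4A1": 0.0},
    "cell2": {"CD3E": 0.0, "CD8A": 0.0, "CD19": 6.0, "MS4A1": 5.0},
    "cell3": {"CD3E": 1.0, "CD8A": 1.0, "CD19": 1.0, "MS4A1": 1.0},
}
signatures = {"T cell": ["CD3E", "CD8A"], "B cell": ["CD19", "MS4A1"]}

result = assign_cell_types(expression, signatures)
print(result)
```

In this toy input, the first two cells each express one signature well above their background, while the third expresses all genes uniformly and is therefore left unassigned.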
Lack of responsiveness to checkpoint inhibitors is a central problem in the modern era of cancer immunotherapy. Tumor neoantigens are critical mediators of immunotherapy efficacy. Current studies of neoantigens focus almost entirely on total neoantigen load, which has been linked with treatment response and prognosis in some studies but not in others. We developed state-of-the-art bioinformatics pipelines to detect neoantigens from patient tumors with a high level of sensitivity. We then developed CSiN, a novel modeling strategy applied to the neoantigen data profiled by these pipelines. Based on a derivation of the Cauchy-Schwarz inequality, CSiN characterizes the degree to which immunogenic neoantigens are concentrated in truncal mutations. Using the clinical responses of 501 immunotherapy-treated patients and the overall survival of 1,978 patients, we showed that CSiN scores predict treatment response to checkpoint inhibitors and prognosis.
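The notion of neoantigens being "concentrated" in truncal mutations can be sketched numerically. The text above does not give the CSiN formula, so the following Python example is an assumption throughout: it uses variant allele frequency (VAF) as a stand-in for how truncal a mutation is, and the hypothetical function `concentration_score` takes the log of the mean product of mean-normalized VAF and mean-normalized per-mutation neoantigen count. That mean product equals 1 plus the covariance of the two quantities divided by the product of their means, so the score is positive when high-VAF mutations carry more neoantigens, zero when there is no co-variation, and negative when neoantigens sit mostly on low-VAF mutations.

```python
import math
from statistics import mean


def concentration_score(vafs, loads):
    """Illustrative score; not necessarily the published CSiN definition.

    vafs:  variant allele frequency of each mutation (assumed proxy for
           how truncal the mutation is)
    loads: number of predicted neoantigens arising from each mutation
    """
    if len(vafs) != len(loads) or not vafs:
        raise ValueError("vafs and loads must be non-empty and equal in length")
    mean_vaf, mean_load = mean(vafs), mean(loads)
    if mean_vaf <= 0 or mean_load <= 0:
        raise ValueError("mean VAF and mean load must be positive")
    # Mean of the normalized products: 1 + cov(vaf, load) / (mean_vaf * mean_load).
    mean_product = mean((v / mean_vaf) * (l / mean_load)
                        for v, l in zip(vafs, loads))
    if mean_product <= 0:
        raise ValueError("score undefined: no mutation has both VAF and load > 0")
    return math.log(mean_product)


# Toy data (made up): two high-VAF and two low-VAF mutations.
vafs = [0.5, 0.5, 0.1, 0.1]
concentrated = concentration_score(vafs, [4, 4, 1, 1])  # neoantigens on high-VAF mutations
dispersed = concentration_score(vafs, [1, 1, 4, 4])     # neoantigens on low-VAF mutations
uniform = concentration_score(vafs, [2, 2, 2, 2])       # same load on every mutation

print(concentrated, dispersed, uniform)
```

Unlike total neoantigen load, a score of this kind depends on which mutations carry the neoantigens rather than only on how many there are: the first two toy tumors have the same total load but scores of opposite sign.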