UT Southwestern Medical Center

Software Packages

scSplitter

Tool for splitting 10x scRNA-seq FASTQ data by individual cell. Splitting the reads this way allows subsequent mutation analyses to be carried out at the single-cell level.
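To illustrate the idea of splitting reads by cell, here is a minimal Python sketch. It is not scSplitter's implementation. It assumes the common 10x 3' layout, where read 1 starts with a 16-base cell barcode followed by the UMI and read 2 carries the cDNA sequence. The function names, the barcode length constant, and the toy reads are all hypothetical, chosen for illustration only.

```python
import io
from collections import defaultdict
from itertools import islice

BARCODE_LEN = 16  # assumed cell-barcode length (10x 3' chemistry); adjust as needed


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a FASTQ text handle."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:  # end of file (or a truncated final record)
            return
        header, seq, _plus, qual = record
        yield header, seq, qual


def split_by_barcode(r1_handle, r2_handle, barcode_len=BARCODE_LEN):
    """Group read-2 records by the barcode at the start of the paired read 1."""
    cells = defaultdict(list)
    for (_, r1_seq, _), r2_record in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        cells[r1_seq[:barcode_len]].append(r2_record)
    return dict(cells)


def make_fastq(records):
    """Build toy FASTQ text from (name, sequence) pairs with dummy qualities."""
    return "".join(f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in records)


# Toy paired reads: barcode (16 bases) + UMI (10 bases) in R1, cDNA in R2.
r1_text = make_fastq([
    ("read1", "AAAACCCCGGGGTTTT" + "ACGTACGTAC"),
    ("read2", "TTTTGGGGCCCCAAAA" + "ACGTACGTAC"),
    ("read3", "AAAACCCCGGGGTTTT" + "TTTTTTTTTT"),
])
r2_text = make_fastq([
    ("read1", "ACGTACGT"),
    ("read2", "GGGGGGGG"),
    ("read3", "TTTTACGT"),
])

cells = split_by_barcode(io.StringIO(r1_text), io.StringIO(r2_text))
for barcode, reads in cells.items():
    print(barcode, len(reads))
```

A real tool would also need to handle barcode error correction, whitelists, and compressed input, none of which this sketch attempts.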

QBRC mutation calling pipeline

The QBRC mutation calling pipeline is a flexible and comprehensive pipeline that combines many commonly used software tools and data processing steps for mutation calling. The software it incorporates includes sambamba, speedseq, varscan, shimmer, strelka, manta, and lofreq_star. It identifies somatic and germline variants from whole exome sequencing (WXS), RNA sequencing, and deep sequencing data. It can be used for human, PDX, and mouse data, with FASTQ or BAM files as input.

QBRC neoantigen calling pipeline

The QBRC neoantigen calling pipeline is a comprehensive and user-friendly neoantigen calling pipeline for human genomics samples. It requires the somatic mutation calling results from the QBRC mutation calling pipeline, the tumor/normal exome-seq data for HLA typing, and, optionally, RNA-seq data for filtering neoantigens called from the exome-seq data. It profiles both MHC class I and class II-binding neoantigens. The calculation of CSiN (Cauchy-Schwarz index of Neoantigens), which describes neoantigen clonal balance, is embedded in the pipeline.

Please contact Tianshi.lu@utsouthwestern.edu for genome reference files.

dCLIP

dCLIP is a Perl program for discovering differential binding regions between two CLIP-Seq (HITS-CLIP or PAR-CLIP) experiments. It is appropriate for experiments where the common binding regions that are significantly enriched in both conditions tend to have similar binding strength, and where researchers are more interested in the difference in binding strength than in the binary question of whether a binding site is common or not.

DisHet

An R package for estimating the gene expression levels and component proportions of the normal, stroma (immune), and tumor components of bulk tumor samples. The DisHet package also documents a series of gene signatures for tumor-infiltrating immune cells, which are defined with empirical evidence gained from DisHet analysis of RNA-Seq data from 35 RCC trios.

SCINA

SCINA is a semi-supervised algorithm for identifying cell types in single-cell profiling data. It automatically exploits prior knowledge of cell type-specific signatures as a form of supervision. It also works for disease subtyping at the patient level, or in other scenarios where data of a similar format are available.
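To make the notion of signatures as supervision concrete, here is a deliberately naive Python sketch. It is not SCINA's algorithm. It substitutes a simple rule: average the expression of each signature's genes per cell and pick the highest-scoring cell type. The function name, the `min_score` threshold, the "unknown" fallback, and the toy expression values are all assumptions made for illustration.

```python
from statistics import mean


def assign_cell_types(expression, signatures, min_score=1.0):
    """Label each cell with the signature whose genes have the highest mean expression.

    expression: {cell_id: {gene: expression_value}}
    signatures: {cell_type: [marker genes]}
    Cells whose best score falls below min_score are labeled "unknown".
    """
    labels = {}
    for cell, profile in expression.items():
        # Genes missing from a cell's profile are treated as unexpressed (0.0).
        scores = {
            cell_type: mean(profile.get(gene, 0.0) for gene in genes)
            for cell_type, genes in signatures.items()
        }
        best = max(scores, key=scores.get)
        labels[cell] = best if scores[best] >= min_score else "unknown"
    return labels


# Illustrative marker lists and made-up expression values.
signatures = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["CD79A", "MS4A1"],
}
expression = {
    "cell_1": {"CD3D": 5.0, "CD3E": 4.0, "CD79A": 0.0, "MS4A1": 0.0},
    "cell_2": {"CD3D": 0.0, "CD3E": 0.0, "CD79A": 3.0, "MS4A1": 6.0},
    "cell_3": {"CD3D": 0.2, "CD3E": 0.1, "CD79A": 0.0, "MS4A1": 0.3},
}

labels = assign_cell_types(expression, signatures)
print(labels)
```

The only point carried over from the description above is that the user supplies the signatures up front instead of annotating clusters afterwards. The scoring rule itself is a stand-in.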