TissueEnrich: A tool to calculate tissue-specific gene enrichment

Analysis of RNA-Sequencing data results in lists of genes that may have similar function, based on differential gene expression analysis or co-expression network analysis. Multiple tools have been developed that use gene ontologies to identify biological processes that are enriched in these gene sets. While these tools provide insights into the biological processes, they give no information about the tissue specificity of the genes, which is important when studying human disease. Therefore, we developed TissueEnrich, a tool that calculates tissue-specific gene enrichment in an input gene set. We demonstrated that TissueEnrich robustly identifies the lineage of single-cell clusters and differentiated embryonic stem cells. TissueEnrich is available as a user-friendly, interactive web application and as an R package (Bioconductor) that allows additional flexibility in usage.
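As a rough illustration of what "enrichment in an input gene set" can mean, the sketch below computes a one-sided hypergeometric p-value in Python. The description above does not say which statistical test is used, so the hypergeometric test is an assumption here (it is a common choice for gene set enrichment), and the function name, parameter names, and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, tissue_specific, input_size, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    Assumption: enrichment is modelled as drawing `input_size` genes without
    replacement from `total_genes`, of which `tissue_specific` are specific
    to the tissue of interest. X is the number of tissue-specific genes drawn.
    """
    upper = min(tissue_specific, input_size)
    denom = comb(total_genes, input_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(tissue_specific, k) * comb(total_genes - tissue_specific, input_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denom


# Hypothetical numbers: 20,000 genes in total, 500 specific to one tissue,
# an input set of 100 genes, 15 of which are tissue-specific.
p_value = hypergeom_enrichment_p(20000, 500, 100, 15)
print(p_value)
```

In practice the p-values would be computed once per tissue and corrected for multiple testing; that step is left out to keep the sketch short.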

Shiny application for IPL T20 cricket data analysis

I developed a Shiny application to visualize and analyze IPL T20 cricket data as part of my STAT585 class. I used IPL data from 2008 to 2016. The raw data, available from the Cricsheet website, consists of ball-by-ball details for every match, with one file per match (577 files). The processed data was downloaded from Kaggle. We further processed the data and added venue locations for visualization. The resulting Shiny application lets users analyze the data by plotting it and changing parameters including year, team, and player. More details about the project are on the GitHub page.

GBEER Analysis Pipeline

This is a semi-automated pipeline for running the GBEER tool. GBEER, which is being developed by the Friedberg lab, quantifies and visualizes the evolutionary changes that occur in gene blocks.


SPINNER

SPINNER stands for Seeded Protein Interaction Network Neighborhood Expansion and Ranking Tool. It is an automated software tool that ranks and compares genes or proteins from constructed phenotype-specific biomolecular interaction networks. Given a user-supplied list of phenotype-specific genes, the tool automatically queries the STRING protein-protein interaction database to retrieve interactions among the input genes, with user-specified network expansion levels, and constructs a phenotype-specific network. All sub-networks are ranked and evaluated statistically to obtain a P-value for their index of aggregation before subsequent analysis. To compare the contribution of each protein, we consider its node degree of connectivity, the quality of its interactions with surrounding partners (both directly and indirectly connected partners, through iterations), the protein's significance in both the unfiltered global network and the phenotype-specific network, and other network characteristics. The tool also provides the PubMed citation count of each gene or protein for the specific phenotype to help users evaluate the ranked proteins. A family-wise adjusted P-value of all significant ranks against randomized topology-preserving networks is also provided to help assess the rank.
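The seed expansion and degree-based comparison described above can be sketched in Python as follows. This is a simplified illustration, not SPINNER's scoring method: it uses a small hypothetical edge list instead of querying STRING, and it ranks by node degree alone, ignoring interaction quality, global-network significance, and the statistical evaluation. All function names and gene labels are made up for the example.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def expand_seeds(graph, seeds, levels):
    """Return the seeds plus all nodes within `levels` hops of any seed."""
    selected = set(seeds)
    frontier = set(seeds)
    for _ in range(levels):
        # Neighbours of the current frontier that are not yet selected.
        frontier = {n for node in frontier for n in graph.get(node, ())} - selected
        selected |= frontier
    return selected


def rank_by_degree(graph, nodes):
    """Rank nodes by their degree inside the sub-network induced by `nodes`."""
    degree = {n: len(graph.get(n, set()) & nodes) for n in nodes}
    # Highest degree first; ties are broken alphabetically for stable output.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical interactions; a real run would retrieve these from STRING.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D")]
graph = build_graph(edges)
network = expand_seeds(graph, seeds=["A"], levels=2)
ranking = rank_by_degree(graph, network)
print(ranking)
```

Restricting the degree count to the induced sub-network keeps the ranking specific to the phenotype network rather than to the global one.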

GEMINE (Gene Expression Mutation Interaction Neighborhood Exploration)

We developed a database that stores the significance of genes in cancers based on their gene expression and mutations. The raw data were taken from COSMIC and TCGA, and the differentially expressed genes were filtered and prioritized by their SPINNER rank. We also calculate the similarity between genes from TCGA gene expression data and COSMIC mutation data. In addition, we developed a web application, built with PHP and JavaScript, for browsing the database.
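One simple way to score expression-based similarity between two genes is the Pearson correlation of their expression profiles across samples, sketched below in Python. The description above does not name the similarity measure, so Pearson correlation is an assumption made for illustration, and the expression values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for two genes across five samples.
gene_a = [2.1, 3.4, 1.8, 5.0, 4.2]
gene_b = [1.9, 3.9, 2.0, 4.7, 4.4]
similarity = pearson(gene_a, gene_b)
print(similarity)
```

Values close to 1 indicate genes whose expression rises and falls together across samples, and values close to -1 indicate opposite patterns.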

Vocab World

"VOCAB WORLD" is an English vocabulary Android application published on Google Play. The application is built around a database containing a large pool of words. It offers Antonym and Synonym quizzes, each consisting of 10 multiple-choice questions with a time limit of 2 minutes. The final score, based on the number of correct answers, is displayed at the end and saved in the history. Users can also review their performance, with correct answers highlighted in green and wrong answers highlighted in red. The review section also shows each word’s meaning.