anurag19997/Genomics_project

CITE-seq is a technique that measures both gene expression and surface protein levels in individual cells, giving insight into each cell's functional state. Accurately predicting protein levels from gene expression values is difficult, however, because protein levels are regulated by multiple mechanisms beyond gene expression alone. We aim to build a model that best predicts surface protein levels from gene expression data.

We implemented four regression models, two encoder-decoder neural network models, and one deep learning Conv1D encoder-decoder model. Of these, the Conv1D encoder-decoder model gave the best predictions of surface protein levels from gene expression data, with an R-squared value of 0.2064.
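For readers unfamiliar with the metric, here is a minimal sketch of how an R-squared value can be computed from measured and predicted values. It is not taken from the project code: the function name `r_squared` and the toy numbers are assumptions made for illustration, and the project may have computed its score differently (for example, with a library routine or averaged across proteins).

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Toy values, invented for illustration only (not project data).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
score = r_squared(measured, predicted)
print(f"R^2 = {score:.4f}")
```

A score of 1 means perfect prediction, and a score of 0 means the model does no better than always predicting the mean of the measured values.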

About

Code for the Genomics Project (kaggle competition: https://www.kaggle.com/competitions/open-problems-multimodal)
