Large-scale Machine Learning for gene expression prediction based on genotype only

kevinVervier/SLINGER

SLINGER: Short and Long-range INfluencers of Gene Expression Regulation

Work on large-scale ML for gene expression prediction based on genotype only.

Introduction

This project aims to provide a machine-learning-based tool for predicting gene expression from genotype data alone.

Installation

This tool requires R to be installed.

Dependencies

No additional dependencies: only the base R packages installed by default are needed.

Tutorial

After cloning the repository, you may want to run the get_prediction.R script on the sample test data to make sure that everything works properly.

First, move into the src directory: cd <github-clone-location>/src.

Then run predictions on the sample data: Rscript get_prediction.R ../test/sample.txt ../test/test_output.txt.

This script takes two arguments:

    1. the genotype matrix, with allele count values from 0 to 2 for individuals (rows) and SNPs (columns).
    2. the path to the output file where the estimated gene expression will be stored.
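
Before running the script on your own data, it may help to confirm that the genotype file holds only allele counts. The sketch below is one possible pre-flight check, not part of SLINGER: check_genotypes and the toy file are hypothetical names, and it assumes a whitespace-delimited file of integer counts (0, 1, or 2) with no header row and no row-name column, none of which this README confirms. Compare against ../test/sample.txt and adjust if the real layout differs.

```shell
# Hypothetical helper: succeed only if every field in the file is 0, 1, or 2.
# Assumption: whitespace-delimited values, no header row, no row-name column.
check_genotypes() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i != "0" && $i != "1" && $i != "2") { bad = 1; exit }
  }
  END { exit bad }' "$1"
}

# Toy matrix for illustration: 3 individuals (rows) x 4 SNPs (columns).
toy=$(mktemp)
printf '0 1 2 0\n1 1 0 2\n2 0 0 1\n' > "$toy"

if check_genotypes "$toy"; then
  echo "genotype file looks valid"
else
  echo "genotype file contains values outside 0-2"
fi
```

The check stops at the first unexpected value, so it stays cheap even on wide matrices with many SNP columns.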

It should create a new file called test_output.txt in the test directory.
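
To confirm that the run produced its output, a small check along these lines can be run from the src directory. This is a minimal sketch: check_output is a hypothetical helper, and it only tests that the file exists and is non-empty, since the output format is not described here.

```shell
# Hypothetical helper: report whether a prediction file exists and is non-empty.
check_output() {
  if [ -s "$1" ]; then
    echo "found $1 ($(wc -l < "$1") lines)"
  else
    echo "missing or empty: $1" >&2
    return 1
  fi
}

# Path taken from the tutorial command, relative to the src directory.
check_output ../test/test_output.txt || echo "re-run get_prediction.R and check for errors"
```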
