nickzren/human-gene-pathway

 
 


Human Gene Pathway

This repository processes and combines human gene-pathway data from PathwayCommons and WikiPathways, creating a unified dataset for genomic analysis.

Execution

conda env create -f environment.yml 

conda activate human-gene-pathway

bash download.sh

python process.py

Input

  • wikipathways-Homo_sapiens.gmt
    • Downloaded from the WikiPathways database, this file contains curated biological pathways for Homo sapiens. Each entry is a gene set associated with a specific biological process or disease.
  • PathwayCommons12.All.hgnc.gmt
    • Downloaded from Pathway Commons (version 12, as the file name indicates), this file provides pathway gene sets aggregated across its source databases, with genes identified by HGNC symbols.
  • Homo_sapiens.gene_info.gz
    • Downloaded from NCBI, this gzip-compressed file contains per-gene annotation for Homo sapiens, such as gene identifiers, symbols, and synonyms.
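
For readers unfamiliar with the GMT format used by the two pathway files, here is a minimal sketch of how such a file can be read. It assumes the standard GMT layout, in which each line holds a gene-set name, a description, and then the gene symbols, all separated by tabs. The function name `parse_gmt` and the sample records are hypothetical and are not taken from process.py.

```python
import io


def parse_gmt(handle):
    """Yield (name, description, genes) for each line of a GMT text stream.

    Assumes the standard GMT layout: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and lines without any genes
        name, description, *genes = fields
        # Drop empty fields that can result from trailing tabs.
        yield name, description, [gene for gene in genes if gene]


# Illustrative in-memory input; a real run would open the downloaded .gmt file instead.
sample = io.StringIO(
    "PATHWAY_A\tillustrative description\tTP53\tBRCA1\n"
    "PATHWAY_B\tillustrative description\tEGFR\n"
)
records = list(parse_gmt(sample))
```

Passing a file opened with `open(path, encoding="utf-8")` in place of `sample` should work the same way, provided the file follows the layout assumed above.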

Output

  • human-gene-pathway.tsv
    • A tab-separated file containing the combined human gene-pathway data from PathwayCommons and WikiPathways.
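
One way to picture the combining step is sketched below: gene sets from each source are flattened into unique gene-pathway rows and written as tab-separated text. This is an illustration only. The column names (`gene`, `pathway`, `source`), the helper names, and the sample data are assumptions, and the actual layout of human-gene-pathway.tsv produced by process.py may differ.

```python
import csv
import io


def combine_gene_pathways(sources):
    """Flatten {source_name: [(pathway, genes), ...]} into sorted, unique rows."""
    rows = set()
    for source_name, pathways in sources.items():
        for pathway, genes in pathways:
            for gene in genes:
                rows.add((gene, pathway, source_name))  # the set removes duplicates
    return sorted(rows)


def write_tsv(rows, handle):
    """Write rows as tab-separated text under a hypothetical header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["gene", "pathway", "source"])  # assumed column names
    writer.writerows(rows)


# Illustrative input; real gene sets would come from the two GMT files.
sources = {
    "PathwayCommons": [("PATHWAY_A", ["TP53", "BRCA1", "TP53"])],
    "WikiPathways": [("PATHWAY_B", ["TP53"])],
}
rows = combine_gene_pathways(sources)
buffer = io.StringIO()
write_tsv(rows, buffer)
```

Collecting rows in a set keeps a gene that is listed twice in one pathway from appearing twice in the output, and sorting makes the result reproducible between runs.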

License

All original content in this repository is released under CC0 1.0 (public domain). WikiPathways data is licensed under CC BY 3.0. Reactome data is licensed under CC BY 4.0. PID data is in the public domain.

About

A compilation of pathway gene sets
