Is process active in a tissue or cell #1

Open
Chris-Evelo opened this issue Aug 19, 2019 · 4 comments
Labels
BridgeDb: Uses BridgeDb (IMS) for ID mapping
NextProt: Uses NextProt as a datasource
Python: Uses Python or Python notebook wrappers
question: Further information is requested
WikiPathways: Uses WikiPathways as a datasource

Comments

@Chris-Evelo
Collaborator

For a process (a WikiPathways pathway, GO term, or pathway ontology term), find all proteins involved and see how many are expressed in a specific tissue or cell type, using NextProt as the source for expression data.

@egonw added the question label Aug 19, 2019
@Chris-Evelo
Collaborator Author

Tissue/cell combinations to try: liver/hepatocyte, fat (adipose tissue)/adipocyte.

@Chris-Evelo added the BridgeDb, NextProt, and WikiPathways labels Aug 19, 2019
@Chris-Evelo
Collaborator Author

Chris-Evelo commented Aug 20, 2019

For this one we decided to try simple SPARQL queries on the relevant resources and combine them in Python.
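The combining step could look roughly like the sketch below. It is not the code used in this project: `expressed_fraction`, `fake_lookup`, and the `NX_P0000x` identifiers are hypothetical names introduced for illustration, and the lookup function stands in for whatever SPARQL or REST call returns the tissues in which a protein is highly expressed.

```python
from typing import Callable, Dict, Iterable, Set


def expressed_fraction(
    process_proteins: Iterable[str],
    tissues_for_protein: Callable[[str], Set[str]],
    target_tissue: str,
) -> Dict[str, object]:
    """Count how many proteins of one process are highly expressed in a tissue.

    `process_proteins` would come from a pathway/GO query, and
    `tissues_for_protein` from an expression query; both are assumptions here.
    """
    proteins = sorted(set(process_proteins))  # de-duplicate, keep a stable order
    expressed = [p for p in proteins if target_tissue in tissues_for_protein(p)]
    total = len(proteins)
    return {
        "total": total,
        "expressed": expressed,
        "fraction": len(expressed) / total if total else 0.0,
    }


# Hypothetical stand-in data; real values would come from the expression source.
_FAKE_EXPRESSION = {
    "NX_P00001": {"liver", "kidney"},
    "NX_P00002": {"adipose tissue"},
    "NX_P00003": {"liver"},
}


def fake_lookup(protein_id: str) -> Set[str]:
    """Stand-in for a real expression lookup."""
    return _FAKE_EXPRESSION.get(protein_id, set())


summary = expressed_fraction(
    ["NX_P00001", "NX_P00002", "NX_P00003", "NX_P00001"], fake_lookup, "liver"
)
print(summary)
```

Passing the lookup in as a function keeps the counting logic independent of which endpoint supplies the expression data, so the same helper can be reused for tissues and for cell types.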

@Chris-Evelo added the Python label Aug 20, 2019
@AlasdairGray
Member

Query for discovering the cells in which a given protein is highly expressed:
https://github.com/openphacts/FederatedPhacts/blob/master/nextprot-proteinHighlyExpressedInCell.rq
Available as a REST interface that takes the NextProt identifier for the protein as an argument:
http://grlc.io/api/openphacts/FederatedPhacts#/nextprot/get_nextprot_proteinHighlyExpressedInCell

Query for discovering the tissues in which a given protein is highly expressed:
https://github.com/openphacts/FederatedPhacts/blob/master/nextprot-highlyExpressed.rq
Available as a REST interface that takes the NextProt identifier for the protein as an argument:
http://grlc.io/api/openphacts/FederatedPhacts#/nextprot/get_nextprot_highlyExpressed
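A request URL for such a REST interface might be built as in the sketch below. Several things here are assumptions rather than facts from this thread: the endpoint path (`ASSUMED_ENDPOINT`), the parameter name (`entry`), and the IRI form of the protein identifier. Check the Swagger page linked above for the real values before using it.

```python
from urllib.parse import urlencode

# ASSUMPTION: the callable endpoint path is not stated in the thread.
ASSUMED_ENDPOINT = (
    "http://grlc.io/api/openphacts/FederatedPhacts/nextprot-highlyExpressed"
)
# ASSUMPTION: IRI prefix used to turn an accession into an entry IRI.
ASSUMED_ENTRY_PREFIX = "http://nextprot.org/rdf/entry/"


def build_request_url(
    accession: str,
    endpoint: str = ASSUMED_ENDPOINT,
    param_name: str = "entry",  # ASSUMPTION: parameter name is a guess
) -> str:
    """Return a GET URL that passes one protein identifier as a query parameter."""
    entry_iri = ASSUMED_ENTRY_PREFIX + accession
    return endpoint + "?" + urlencode({param_name: entry_iri})


# Example accession used purely for illustration.
url = build_request_url("NX_P01308")
print(url)
```

Keeping the endpoint and parameter name as arguments makes it easy to correct either one once the actual interface has been checked.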

@AlasdairGray
Member

In the query to discover tissues in which a given protein is highly expressed, we do not filter for things that are declared to be of type nextprot:Tissue, since that type is not used consistently in the NextProt data and filtering on it eliminates too much data.

The query would then have been:

select distinct ?tissue ?tisslab where {
  ?_entry_iri :isoform ?iso.
  # get all expression
  ?iso :expression ?e1.
  # highly expressed
  ?e1 :evidence/:expressionLevel :High.
  ?e1 :term ?tissue .
  ?tissue rdfs:label ?tisslab .
  # Return things that are tissues
  ?tissue :childOf cv:TS-2090 .
  # Eliminate things that are cells
  filter not exists { ?tissue :childOf cv:TS-2035 }
}
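If an endpoint returns the result of this query in the standard SPARQL 1.1 JSON results format, the `?tissue` and `?tisslab` bindings could be extracted as in the sketch below. The helper name `tissue_rows` and the sample payload (including the `example.org` IRIs) are hypothetical and only illustrate the shape of the data.

```python
import json
from typing import List, Tuple


def tissue_rows(sparql_json: str) -> List[Tuple[str, str]]:
    """Extract (tissue IRI, label) pairs from a SPARQL JSON results document."""
    doc = json.loads(sparql_json)
    rows = []
    for binding in doc["results"]["bindings"]:
        tissue = binding.get("tissue", {}).get("value")
        label = binding.get("tisslab", {}).get("value")
        # Skip rows where either variable is unbound.
        if tissue is not None and label is not None:
            rows.append((tissue, label))
    return rows


# Hypothetical sample payload in the SPARQL JSON results layout.
SAMPLE = json.dumps(
    {
        "head": {"vars": ["tissue", "tisslab"]},
        "results": {
            "bindings": [
                {
                    "tissue": {"type": "uri", "value": "http://example.org/term/T1"},
                    "tisslab": {"type": "literal", "value": "liver"},
                },
                {
                    "tissue": {"type": "uri", "value": "http://example.org/term/T2"},
                    "tisslab": {"type": "literal", "value": "adipose tissue"},
                },
            ]
        },
    }
)

rows = tissue_rows(SAMPLE)
print(rows)
```

The set of labels from such rows is what a per-protein lookup would return when counting how many proteins of a process are highly expressed in a given tissue.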
