Test converting a slice of Reactome BioPAX #3

Open
dustine32 opened this issue Jul 27, 2021 · 7 comments

@dustine32
Contributor

@deustp01 URL for testing?

@dustine32 dustine32 self-assigned this Jul 27, 2021
@dustine32
Contributor Author

Thanks @deustp01! I'll run this BioPAX through the existing pathways2go converter and report back on how it goes.

@dustine32
Contributor Author

@deustp01 The BioPAX above produced 13 pathway model files. Is this expected?

Here's the full log output:

1 of 13 Pathway:[Pyrimidine salvage]
defining pathway Pyrimidine salvage false true R-HSA-73614
Before sparql inference -  triples: 3185
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 191
Total evidence nodes 120
removed 0
After sparql inference -  triples: 2100
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	0	[]
If enabler then MF rule	0	[]
Occurs In Rule	10	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 2100 models/R-HSA-73614.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 4318
reseting for next pathway...
2 of 13 Pathway:[Nucleotide catabolism]
defining pathway Nucleotide catabolism false true R-HSA-8956319
Before sparql inference -  triples: 504
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 34
Total evidence nodes 20
removed 1
After sparql inference -  triples: 416
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	1	[http://model.geneontology.org/R-HSA-8956319/R-HSA-8866601_GO_0009264_individual]
If enabler then MF rule	0	[]
Occurs In Rule	1	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 416 models/R-HSA-8956319.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 945
reseting for next pathway...
3 of 13 Pathway:[Nucleotide salvage]
defining pathway Nucleotide salvage false true R-HSA-8956321
Before sparql inference -  triples: 32
No occurs in
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 1
Total evidence nodes 0
removed 1
After sparql inference -  triples: 32
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	0	[]
If enabler then MF rule	0	[]
Occurs In Rule	0	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 32 models/R-HSA-8956321.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 15
reseting for next pathway...
4 of 13 Pathway:[Purine salvage]
defining pathway Purine salvage false true R-HSA-74217
Before sparql inference -  triples: 4054
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 241
Total evidence nodes 152
removed 0
After sparql inference -  triples: 2647
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	0	[]
If enabler then MF rule	0	[]
Occurs In Rule	12	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 2647 models/R-HSA-74217.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 5369
reseting for next pathway...
5 of 13 Pathway:[Metabolism of nucleotides, Nucleotide metabolism]
defining pathway Nucleotide metabolism false true R-HSA-15869
Before sparql inference -  triples: 30
No occurs in
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 1
Total evidence nodes 0
removed 1
After sparql inference -  triples: 30
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	0	[]
If enabler then MF rule	0	[]
Occurs In Rule	0	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 30 models/R-HSA-15869.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 15
reseting for next pathway...
6 of 13 Pathway:[Pyrimidine biosynthesis]
defining pathway Pyrimidine biosynthesis false true R-HSA-500753
Before sparql inference -  triples: 2477
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 144
Total evidence nodes 91
removed 0
After sparql inference -  triples: 1602
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	0	[]
If enabler then MF rule	0	[]
Occurs In Rule	7	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 1602 models/R-HSA-500753.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 3203
reseting for next pathway...
7 of 13 Pathway:[Interconversion of nucleotide di- and triphosphates]
defining pathway Interconversion of nucleotide di- and triphosphates false true R-HSA-499943
Before sparql inference -  triples: 11917
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 770
Total evidence nodes 482
removed 0
After sparql inference -  triples: 8462
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	8	[http://model.geneontology.org/R-HSA-499943/R-HSA-499943]
If enabler then MF rule	0	[]
Occurs In Rule	34	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 8462 models/R-HSA-499943.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 20790
reseting for next pathway...
8 of 13 Pathway:[Nucleotide biosynthesis]
defining pathway Nucleotide biosynthesis false true R-HSA-8956320
Before sparql inference -  triples: 32
No occurs in
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 1
Total evidence nodes 0
removed 1
After sparql inference -  triples: 32
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	0	[]
If enabler then MF rule	0	[]
Occurs In Rule	0	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 32 models/R-HSA-8956320.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 15
reseting for next pathway...
9 of 13 Pathway:[Purine ribonucleoside monophosphate biosynthesis]
defining pathway Purine ribonucleoside monophosphate biosynthesis false true R-HSA-73817
Before sparql inference -  triples: 6685
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 451
Total evidence nodes 280
removed 0
After sparql inference -  triples: 5040
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	10	[http://model.geneontology.org/R-HSA-73817/R-HSA-73817]
If enabler then MF rule	0	[]
Occurs In Rule	17	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 5040 models/R-HSA-73817.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 12496
reseting for next pathway...
10 of 13 Pathway:[Phosphate bond hydrolysis by NTPDase proteins]
defining pathway Phosphate bond hydrolysis by NTPDase proteins false true R-HSA-8850843
Before sparql inference -  triples: 4429
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 265
Total evidence nodes 168
removed 0
After sparql inference -  triples: 2889
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	0	[]
If enabler then MF rule	0	[]
Occurs In Rule	12	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 2889 models/R-HSA-8850843.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 5937
reseting for next pathway...
11 of 13 Pathway:[Purine catabolism]
defining pathway Purine catabolism false true R-HSA-74259
Before sparql inference -  triples: 5845
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 380
Total evidence nodes 237
removed 0
After sparql inference -  triples: 4215
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	5	[http://model.geneontology.org/R-HSA-74259/R-HSA-74259]
If enabler then MF rule	0	[]
Occurs In Rule	16	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 4215 models/R-HSA-74259.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 9867
reseting for next pathway...
12 of 13 Pathway:[Phosphate bond hydrolysis by NUDT proteins]
defining pathway Phosphate bond hydrolysis by NUDT proteins false true R-HSA-2393930
Before sparql inference -  triples: 6241
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 384
Total evidence nodes 242
removed 0
After sparql inference -  triples: 4201
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	2	[http://model.geneontology.org/R-HSA-2393930/R-HSA-2393930]
If enabler then MF rule	0	[]
Occurs In Rule	17	[]
Provides Input For Rule	0	[]
Transporter Rule	0	[]

writing....
writing n triples: 4201 models/R-HSA-2393930.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 9023
reseting for next pathway...
13 of 13 Pathway:[Pyrimidine catabolism]
defining pathway Pyrimidine catabolism false true R-HSA-73621
Before sparql inference -  triples: 5407
Starting delete locations
Eliminated 'located in' assertions
Starting unconnected node cleanup.  Total nodes 341
Total evidence nodes 213
removed 0
After sparql inference -  triples: 3758
Rule results:
Entity Regulation Rule 1. 	0	[]
Entity Regulation Rule 3	0	[]
Entity Regulator Rule	1	[http://model.geneontology.org/R-HSA-73621/R-HSA-73621]
If enabler then MF rule	0	[]
Occurs In Rule	17	[]
Provides Input For Rule	0	[]
Transporter Rule	4	[http://model.geneontology.org/R-HSA-73621/R-HSA-73621]

writing....
writing n triples: 3758 models/R-HSA-73621.ttl
done writing...
GO-CAM model is consistent, Total triples in validated model including tbox: 7906
reseting for next pathway...
done with file source/15869.owl
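
For comparing the 13 models at a glance, the log can be reduced to one line per pathway. This is only a minimal sketch: `summarize` and the regular expressions are hypothetical helpers written against the line formats visible in this log, not part of pathways2go.

```python
import re

# A few lines copied from the log above (pathways 2 and 3 of 13); lines that the
# parser ignores are omitted here to keep the sample short.
LOG_EXCERPT = """\
2 of 13 Pathway:[Nucleotide catabolism]
defining pathway Nucleotide catabolism false true R-HSA-8956319
Before sparql inference -  triples: 504
After sparql inference -  triples: 416
writing n triples: 416 models/R-HSA-8956319.ttl
3 of 13 Pathway:[Nucleotide salvage]
defining pathway Nucleotide salvage false true R-HSA-8956321
Before sparql inference -  triples: 32
After sparql inference -  triples: 32
writing n triples: 32 models/R-HSA-8956321.ttl
"""

# Patterns are assumptions based on the log lines shown in this issue.
HEADER = re.compile(r"^\d+ of \d+ Pathway:\[(.+)\]$")
BEFORE = re.compile(r"^Before sparql inference -\s+triples: (\d+)$")
AFTER = re.compile(r"^After sparql inference -\s+triples: (\d+)$")
WRITTEN = re.compile(r"^writing n triples: \d+ (\S+)$")


def summarize(log_text):
    """Return one dict per pathway with its name, triple counts, and output file."""
    records = []
    for line in log_text.splitlines():
        line = line.strip()
        if m := HEADER.match(line):
            records.append({"name": m.group(1)})
        elif not records:
            continue  # skip anything before the first pathway header
        elif m := BEFORE.match(line):
            records[-1]["before"] = int(m.group(1))
        elif m := AFTER.match(line):
            records[-1]["after"] = int(m.group(1))
        elif m := WRITTEN.match(line):
            records[-1]["file"] = m.group(1)
    return records


for rec in summarize(LOG_EXCERPT):
    print(rec["file"], rec["before"], "->", rec["after"], rec["name"])
```

Models whose triple count does not change between the two inference steps (32 before and after, for example) are the ones logged with "No occurs in" and a single node.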

@dustine32
Contributor Author

All 13 models were reported as logically (OWL) consistent. I had confused myself into thinking that the consistency checks in the pathways2go code automatically included ShEx. Apparently ShEx is not checked there, or I haven't yet found where it happens in the code.

Either way, I can just run the ShEx validator on these 13 models separately to get those reports.

@dustine32
Contributor Author

@deustp01 The products folder here contains the ShEx reports. The first thing to note is that only 3 out of 13 models were shex_valid:

$ cut -f1,2,9 products/main_report.txt
filename	model_title	shex_valid
R-HSA-73621.ttl	Pyrimidine catabolism - imported from: Reactome	false
R-HSA-74217.ttl	Purine salvage - imported from: Reactome	false
R-HSA-74259.ttl	Purine catabolism - imported from: Reactome	false
R-HSA-73817.ttl	Purine ribonucleoside monophosphate biosynthesis - imported from: Reactome	false
R-HSA-73614.ttl	Pyrimidine salvage - imported from: Reactome	false
R-HSA-8956319.ttl	Nucleotide catabolism - imported from: Reactome	false
R-HSA-500753.ttl	Pyrimidine biosynthesis - imported from: Reactome	false
R-HSA-8956320.ttl	Nucleotide biosynthesis - imported from: Reactome	true
R-HSA-15869.ttl	Nucleotide metabolism - imported from: Reactome	true
R-HSA-8956321.ttl	Nucleotide salvage - imported from: Reactome	true
R-HSA-2393930.ttl	Phosphate bond hydrolysis by NUDT proteins - imported from: Reactome	false
R-HSA-499943.ttl	Interconversion of nucleotide di- and triphosphates - imported from: Reactome	false
R-HSA-8850843.ttl	Phosphate bond hydrolysis by NTPDase proteins - imported from: Reactome	false
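
The same three columns can also be tallied programmatically. A minimal sketch, assuming the report is tab-separated (as the `cut -f` call implies); `tally` is a hypothetical helper and the rows are copied from the output above.

```python
import csv
import io
from collections import Counter

# The three columns shown above (output of `cut -f1,2,9`), tab-separated.
REPORT = (
    "filename\tmodel_title\tshex_valid\n"
    "R-HSA-73621.ttl\tPyrimidine catabolism - imported from: Reactome\tfalse\n"
    "R-HSA-74217.ttl\tPurine salvage - imported from: Reactome\tfalse\n"
    "R-HSA-74259.ttl\tPurine catabolism - imported from: Reactome\tfalse\n"
    "R-HSA-73817.ttl\tPurine ribonucleoside monophosphate biosynthesis - imported from: Reactome\tfalse\n"
    "R-HSA-73614.ttl\tPyrimidine salvage - imported from: Reactome\tfalse\n"
    "R-HSA-8956319.ttl\tNucleotide catabolism - imported from: Reactome\tfalse\n"
    "R-HSA-500753.ttl\tPyrimidine biosynthesis - imported from: Reactome\tfalse\n"
    "R-HSA-8956320.ttl\tNucleotide biosynthesis - imported from: Reactome\ttrue\n"
    "R-HSA-15869.ttl\tNucleotide metabolism - imported from: Reactome\ttrue\n"
    "R-HSA-8956321.ttl\tNucleotide salvage - imported from: Reactome\ttrue\n"
    "R-HSA-2393930.ttl\tPhosphate bond hydrolysis by NUDT proteins - imported from: Reactome\tfalse\n"
    "R-HSA-499943.ttl\tInterconversion of nucleotide di- and triphosphates - imported from: Reactome\tfalse\n"
    "R-HSA-8850843.ttl\tPhosphate bond hydrolysis by NTPDase proteins - imported from: Reactome\tfalse\n"
)


def tally(report_text):
    """Count shex_valid values and list the files that passed."""
    rows = list(csv.DictReader(io.StringIO(report_text), delimiter="\t"))
    counts = Counter(row["shex_valid"] for row in rows)
    valid = [row["filename"] for row in rows if row["shex_valid"] == "true"]
    return counts, valid


counts, valid = tally(REPORT)
print(dict(counts))
print(valid)
```

The models that pass are the same ones that the converter log shows with a single node and no evidence nodes, so there may simply be nothing in them for the shapes to reject.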

I believe we can dig through products/explanations.txt to figure out what exactly is invalid about these models.

Tagging @kltm @ukemi @vanaukenk

@ukemi

ukemi commented Jul 28, 2021

@deustp01 @dustine32 This is interesting. We need to have a look. Out of curiosity, I wonder whether the ShEx issues are a result of what we did or of changes to the ShEx. Would it be worthwhile to run an experiment on the BioPAX of one of the models that hasn't changed? Peter, I will put this on the agenda for one of our meetings.
PS. I find it a bit weird that Nucleotide biosynthesis passes but Pyrimidine biosynthesis doesn't. Shouldn't it be included in the Nucleotide biosynthesis pathway?

@ukemi

ukemi commented Jul 28, 2021

Curious: All seem to be anatomical entity violations.
Looking at the last line:
R-HSA-8850843.ttl Phosphate bond hydrolysis by NTPDase proteins - imported from: Reactome http://model.geneontology.org/R-HSA-8850843 gomodel:R-HSA-8851234 [GO:0017111] BFO:0000066 [obo:go/shapes/AnatomicalEntity] gomodel:reaction_R-HSA-8851234_location_lociGO_0000139 [GO:0000139] []

This reaction represents a nucleotide being hydrolyzed in the Golgi lumen. Not sure why this is failing.
Here is a snippet of the ShEx:
@ AND EXTRA a {
a ( @ OR @ ) {1};
enabled_by: ( @ OR @ ) {0,1};
part_of: @ *;
has_part: @ *;
occurs_in: @ {0,1};

@ AND EXTRA a {
a ( @ OR @ );
part_of: @ {0,1};
location_of: ( @ OR @ ) {0,1};
} // rdfs:comment "an anatomical entity"

IRI @ AND EXTRA rdfs:subClassOf {
rdfs:subClassOf [ GoAnatomicalEntity: ];
}

GoAnatomicalEntity: http://purl.obolibrary.org/obo/CARO_0000000

Are we missing the GO_CC CARO bridge here somewhere?
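
One way to picture the question: the class shape in the snippet only accepts a class that has an rdfs:subClassOf path to CARO_0000000, so GO:0000139 can pass only if some axiom links the GO cellular component branch to CARO. The toy sketch below illustrates that reachability check. The edges are illustrative assumptions (the real GO hierarchy is collapsed into a single step), not data read from the ontology or from these models, and `is_subclass_of` is a hypothetical helper.

```python
def is_subclass_of(cls, target, edges):
    """Return True if `target` is reachable from `cls` via subClassOf edges."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(edges.get(current, ()))
    return False


# Assumed, simplified subClassOf edges: the chain from GO:0000139 up to the
# cellular component root (GO:0005575) is collapsed into one edge.
WITHOUT_BRIDGE = {"GO:0000139": ["GO:0005575"]}

# Same edges plus a hypothetical bridge axiom from the GO root to CARO.
WITH_BRIDGE = {**WITHOUT_BRIDGE, "GO:0005575": ["CARO:0000000"]}

print(is_subclass_of("GO:0000139", "CARO:0000000", WITHOUT_BRIDGE))
print(is_subclass_of("GO:0000139", "CARO:0000000", WITH_BRIDGE))
```

If the validator's class closure lacks such a bridging edge, every occurs_in target drawn from GO would fail the anatomical entity shape in the same way, which would be consistent with all of the violations being of one kind.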
