Script: pvload.sh
- vload - a "provenance-free" shell script wrapper to Virtuoso's isql-vt.
- Naming sparql service description's sd:NamedGraph, so we can name a SPARQL endpoint's named graph.
- Named graphs that know where they came from, which discusses provenance modeling of named graphs.
This page describes how to use pvload.sh to capture the provenance of loading named graphs into a SPARQL triple store.
$ pvload.sh --help
usage: pvload.sh [--help] [-n] <url> [-ng <named-graph>] [--separate-provenance [--into (<prov-graph> | 'one')]]
-n : dry run - do not download or load into named graph.
<url> : the URL to retrieve and load into a named graph.
-ng <named-graph> : the named graph to place 'url'. (if not provided, -ng == 'url').
--separate-provenance [ --into <prov_graph> ] :
store the provenance of loading 'url' in a separate named graph, not in '-ng'.
if <prov_graph> is the value 'one', choose a global graph name.
(Setting envvar CSV2RDF4LOD_CONVERT_DEBUG_LEVEL=finest will leave temporary files after invocation.)
(See https://github.com/timrdf/csv2rdf4lod-automation/wiki/Script:-pvload.sh)
- CSV2RDF4LOD_BASE_URI is used to create URIs for instances of provenance.
  - In the examples below, this is set to http://provenanceweb.org.
- CSV2RDF4LOD_PUBLISH_VIRTUOSO_SPARQL_ENDPOINT is the forward-facing URL for the SPARQL endpoint.
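As a rough sketch, the two variables might be exported before running the examples. The base URI is the value stated on this page. The endpoint URL is a hypothetical placeholder, because this page does not give the value used for the examples.

```shell
# Base URI used to create URIs for instances of provenance
# (the value this page says is used in the examples).
export CSV2RDF4LOD_BASE_URI='http://provenanceweb.org'

# Forward-facing URL of the SPARQL endpoint.
# ASSUMPTION: placeholder only; substitute your own endpoint's public URL.
export CSV2RDF4LOD_PUBLISH_VIRTUOSO_SPARQL_ENDPOINT='http://example.org/sparql'
```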
When http://provenanceweb.org/source/same.ttl contains one triple,
$ pvload.sh http://provenanceweb.org/source/same.ttl
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
--> (Named Graph) http://provenanceweb.org/source/same.ttl
--> (PROV Graph) http://provenanceweb.org/source/same.ttl
results in 130 triples (the one loaded triple plus 129 provenance triples, stored in the same graph) from:
select distinct count(*)
where {
graph <http://provenanceweb.org/source/same.ttl> {?s ?p ?o}
}
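The same counting query is reused for every graph on this page, so it can be convenient to generate it. The following is a minimal sketch; `count_query` is a hypothetical helper and is not part of pvload.sh.

```shell
# Hypothetical helper (not part of pvload.sh): print the triple-count
# query used on this page for the named graph given as $1.
count_query() {
  printf 'select distinct count(*)\nwhere {\n  graph <%s> {?s ?p ?o}\n}\n' "$1"
}

count_query 'http://provenanceweb.org/source/same.ttl'
```

The printed query can be pasted into the endpoint's query form. Assuming the endpoint accepts SPARQL Protocol POST requests, it could also be sent with `curl --data-urlencode "query=$(count_query <graph>)" "$CSV2RDF4LOD_PUBLISH_VIRTUOSO_SPARQL_ENDPOINT"`.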
When http://provenanceweb.org/source/same.ttl contains one triple, it can be loaded into a graph named http://example.org/pvload-test-2:
$ pvload.sh http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-2
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
--> (Named Graph) http://example.org/pvload-test-2
--> (PROV Graph) http://example.org/pvload-test-2
results in 130 triples (the one loaded triple plus 129 provenance triples, stored in the same graph) from:
select distinct count(*)
where {
graph <http://example.org/pvload-test-2> {?s ?p ?o}
}
When http://provenanceweb.org/source/same.ttl contains one triple,
$ pvload.sh http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-3 --separate-provenance
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
--> (Named Graph) http://example.org/pvload-test-3
--> (PROV Graph) http://provenanceweb.org/graph-prov/example.org/pvload-test-3
results in one triple from:
select distinct count(*)
where {
graph <http://example.org/pvload-test-3> {?s ?p ?o}
}
and 129 triples from:
select distinct count(*)
where {
graph <http://provenanceweb.org/graph-prov/example.org/pvload-test-3> {?s ?p ?o}
}
Adding the --into <prov_graph> argument lets you control which graph to put the provenance into.
$ pvload.sh http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-4 --separate-provenance --into http://example.org/put-my-provenance-here
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
--> (Named Graph) http://example.org/pvload-test-4
--> (PROV Graph) http://example.org/put-my-provenance-here
results in one triple from:
select distinct count(*)
where {
graph <http://example.org/pvload-test-4> {?s ?p ?o}
}
and 129 triples from:
select distinct count(*)
where {
graph <http://example.org/put-my-provenance-here> {?s ?p ?o}
}
If you don't want to name the separate provenance graph yourself, pass the keyword one to --into; the provenance then goes into a single global graph at the path /graph-prov (http://provenanceweb.org/graph-prov in this example).
$ pvload.sh http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-5 --separate-provenance --into one
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
--> (Named Graph) http://example.org/pvload-test-5
--> (PROV Graph) http://provenanceweb.org/graph-prov
results in one triple from:
select distinct count(*)
where {
graph <http://example.org/pvload-test-5> {?s ?p ?o}
}
and 129 triples from:
select distinct count(*)
where {
graph <http://provenanceweb.org/graph-prov> {?s ?p ?o}
}
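The five examples above differ only in which graph receives the provenance. The sketch below summarizes that choice as reported in the INFO output. `prov_graph_for` is a hypothetical helper, not part of pvload.sh, and the rule for the default separate graph (dropping the scheme from the named graph and appending the rest under /graph-prov/) is an assumption inferred from the single pvload-test-3 example.

```shell
# Hypothetical helper (not part of pvload.sh): print the PROV graph name
# that the examples on this page report for a combination of arguments.
#   $1 = named graph (-ng)
#   $2 = 'yes' if --separate-provenance was given
#   $3 = value of --into: empty, a graph name, or the keyword 'one'
# The body runs in a subshell so its variables do not leak to the caller.
prov_graph_for() (
  ng="$1"
  separate="${2:-no}"
  into="${3:-}"
  base="${CSV2RDF4LOD_BASE_URI:-}"
  base="${base%/}"                     # drop a trailing slash, if any

  if [ "$separate" != "yes" ]; then
    echo "$ng"                         # provenance shares the named graph
  elif [ "$into" = "one" ]; then
    echo "$base/graph-prov"            # single global provenance graph
  elif [ -n "$into" ]; then
    echo "$into"                       # caller-chosen provenance graph
  else
    # ASSUMPTION: inferred from one example on this page.
    echo "$base/graph-prov/${ng#*://}"
  fi
)

export CSV2RDF4LOD_BASE_URI='http://provenanceweb.org'
prov_graph_for 'http://example.org/pvload-test-3' yes
# prints http://provenanceweb.org/graph-prov/example.org/pvload-test-3
```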
- Script: cache-queries.sh can be used to capture the provenance of querying a SPARQL endpoint.