Script: pvload.sh

Tim L edited this page Feb 4, 2014 · 43 revisions

What is first

What we will cover

This page describes how to use pvload.sh to load a URL into a named graph of a SPARQL triple store and capture the provenance of that load.

Let's get to it!

Usage

$ pvload.sh --help
usage: pvload.sh [--help] [-n] url [-ng named_graph]
  -n  : dry run - do not download or load into named graph.
  url : the URL to retrieve and load into a named graph.
  -ng : the named graph to place 'url'. (if not provided, -ng == 'url').

  (Setting envvar CSV2RDF4LOD_CONVERT_DEBUG_LEVEL=finest will leave temporary files after invocation.)
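Since -n performs a dry run, one way to use it is to preview a load before running it for real. The following is a minimal sketch, not part of pvload.sh: the helper name preview_then_load and the PVLOAD variable are hypothetical, and it assumes pvload.sh exits with a non-zero status when something is wrong.

```shell
#!/bin/sh
# Hypothetical wrapper around pvload.sh; only the -n flag from the usage
# text above is relied on.
PVLOAD="${PVLOAD:-pvload.sh}"   # command to invoke (override for testing)

preview_then_load() {
  # Dry run first: per the usage text, -n does not download or load anything.
  "$PVLOAD" -n "$@" || return 1
  # Assumption: a zero exit status means the dry run looked fine,
  # so repeat the call with the same arguments, this time for real.
  "$PVLOAD" "$@"
}

# Example (not executed here):
#   preview_then_load http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-2
```

The arguments are passed through unchanged, so anything accepted after the URL (such as -ng) works in both calls.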

Environment variables that matter

Loading a URL into a graph with the same name

$ pvload.sh http://provenanceweb.org/source/same.ttl
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
                   --> (Named Graph) http://provenanceweb.org/source/same.ttl
                   --> (PROV Graph)  http://provenanceweb.org/source/same.ttl

Loading a URL into a graph with a different name

Let's load one triple into the graph named http://example.org/pvload-test-2:

$ pvload.sh http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-2
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
                   --> (Named Graph) http://example.org/pvload-test-2
                   --> (PROV Graph)  http://example.org/pvload-test-2

When this is done on opendap.tw.rpi.edu, a summary of the named graph can be found at http://opendap.tw.rpi.edu/graph/http/example.org/pvload-test-2. The file that we loaded contains only 1 triple and the named graph ends up with 128, so pvload.sh added 127 triples of provenance.
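The summary URL above appears to follow the pattern <host>/graph/<scheme>/<rest of the graph name>. The sketch below reproduces that pattern. It is inferred from this single example, so treat it as an assumption; graph_summary_url is a hypothetical helper, not something pvload.sh provides.

```shell
#!/bin/sh
# Hypothetical helper: build the summary page URL for a named graph,
# following the pattern seen in the example above (an inferred pattern,
# not a documented one).
graph_summary_url() {
  host="$1"                    # e.g. http://opendap.tw.rpi.edu
  graph="$2"                   # e.g. http://example.org/pvload-test-2
  scheme="${graph%%://*}"      # the part before "://", e.g. http
  rest="${graph#*://}"         # the part after "://", e.g. example.org/pvload-test-2
  printf '%s/graph/%s/%s\n' "$host" "$scheme" "$rest"
}

graph_summary_url http://opendap.tw.rpi.edu http://example.org/pvload-test-2
# prints http://opendap.tw.rpi.edu/graph/http/example.org/pvload-test-2
```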

Loading the provenance of the load into a separate named graph, specific to the graph loaded

$ pvload.sh http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-3 --separate-provenance
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
                   --> (Named Graph) http://example.org/pvload-test-3
                   --> (PROV Graph)  http://provenanceweb.org/graph-prov/example.org/pvload-test-3

results in one triple from:

select distinct count(*)
where { 
  graph <http://example.org/pvload-test-3> {?s ?p ?o}
}

and 129 triples from:

select distinct count(*)
where { 
  graph <http://provenanceweb.org/graph-prov/example.org/pvload-test-3> {?s ?p ?o}
}
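In the example above, the name of the graph-specific provenance graph looks like a base URI, then /graph-prov/, then the named graph with its scheme removed. The sketch below mirrors that mapping. It is inferred from the INFO output only; prov_graph_name is a hypothetical helper, and how pvload.sh determines the base URI is not stated on this page.

```shell
#!/bin/sh
# Hypothetical helper: derive the graph-specific PROV graph name the way the
# INFO output above suggests (an assumption based on that one example).
prov_graph_name() {
  base="$1"            # e.g. http://provenanceweb.org (assumed site base URI)
  named_graph="$2"     # e.g. http://example.org/pvload-test-3
  # Strip everything up to and including "://" from the named graph.
  printf '%s/graph-prov/%s\n' "$base" "${named_graph#*://}"
}

prov_graph_name http://provenanceweb.org http://example.org/pvload-test-3
# prints http://provenanceweb.org/graph-prov/example.org/pvload-test-3
```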

Loading the provenance of the load into a separate named graph, with a different name

Adding the --into <prov_graph> argument lets you control which graph to put the provenance into.

$ pvload.sh http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-4 --separate-provenance --into http://example.org/put-my-provenance-here
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
                   --> (Named Graph) http://example.org/pvload-test-4
                   --> (PROV Graph)  http://example.org/put-my-provenance-here

results in one triple from:

select distinct count(*)
where { 
  graph <http://example.org/pvload-test-4> {?s ?p ?o}
}

and 129 triples from:

select distinct count(*)
where { 
  graph <http://example.org/put-my-provenance-here> {?s ?p ?o}
}
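The count queries on this page can also be run from the command line. The following is a rough sketch using curl and the standard SPARQL protocol. The endpoint URL is a placeholder assumption, count_query and count_triples are hypothetical helpers, and the query uses the SPARQL 1.1 aggregate form rather than the shorthand shown above.

```shell
#!/bin/sh
# Assumed endpoint; replace with the SPARQL endpoint of your triple store.
SPARQL_ENDPOINT="${SPARQL_ENDPOINT:-http://localhost:8890/sparql}"

# Print a query that counts the triples in the given named graph.
count_query() {
  printf 'select (count(*) as ?count) where { graph <%s> { ?s ?p ?o } }\n' "$1"
}

# Send the query to the endpoint and ask for CSV results.
count_triples() {
  curl -sS -G "$SPARQL_ENDPOINT" \
       -H 'Accept: text/csv' \
       --data-urlencode "query=$(count_query "$1")"
}

# Show the query that would be sent (no network access needed):
count_query http://example.org/put-my-provenance-here

# Example (requires a running endpoint, so not executed here):
#   count_triples http://example.org/pvload-test-4
```

Running count_triples once for the named graph and once for the provenance graph gives the two numbers quoted above, assuming the endpoint serves the store that pvload.sh loaded into.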

Loading the provenance of the load into a separate, shared, named graph

If you don't want to name the separate provenance graph yourself, pass the keyword 'one' to --into. The provenance then goes into a single shared graph at the path /graph-prov (here, http://provenanceweb.org/graph-prov).

$ pvload.sh http://provenanceweb.org/source/same.ttl -ng http://example.org/pvload-test-5 --separate-provenance --into one
INFO: pvload.sh: (URL) http://provenanceweb.org/source/same.ttl
                   --> (Named Graph) http://example.org/pvload-test-5
                   --> (PROV Graph)  http://provenanceweb.org/graph-prov

results in one triple from:

select distinct count(*)
where { 
  graph <http://example.org/pvload-test-5> {?s ?p ?o}
}

and 129 triples from:

select distinct count(*)
where { 
  graph <http://provenanceweb.org/graph-prov> {?s ?p ?o}
}
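A shared provenance graph is most useful when several loads write into it. The sketch below loads a list of URLs, each into a graph named after itself, and sends all provenance to the shared graph. load_all_shared_prov and the PVLOAD variable are hypothetical. The argument order copies the invocation shown above, because this page does not say whether the flags can be reordered or -ng omitted in this form.

```shell
#!/bin/sh
# Hypothetical batch wrapper; uses only the flags shown in the example above.
PVLOAD="${PVLOAD:-pvload.sh}"   # command to invoke (override for testing)

load_all_shared_prov() {
  for url in "$@"; do
    # -ng is given explicitly (set to the URL itself) to mirror the
    # documented argument order: url, -ng, --separate-provenance, --into.
    "$PVLOAD" "$url" -ng "$url" --separate-provenance --into one || return 1
  done
}

# Example (not executed here):
#   load_all_shared_prov http://provenanceweb.org/source/same.ttl
```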

What is next
