Generating a sample conversion using only a subset of data
timrdf edited this page Jan 28, 2011 · 27 revisions
When developing enhancement parameters, it helps to see the effect of each parameter as it is added. This iterative process can be sped up by converting only a portion of a large CSV file. Since a sample subset is already created as part of the conversion, all we need to do is turn off the "full" conversion using the CSV2RDF4LOD_CONVERT_EXAMPLE_SUBSET_ONLY environment variable.
First, check to see what its current value is:
bash-3.2$ cr-vars.sh
--
CSV2RDF4LOD_HOME /Users/lebot/afrl/information_management/m4rker/domain_instances/tw-data-gov/csv2rdf4lod
CSV2RDF4LOD_BASE_URI http://logd.tw.rpi.edu
CSV2RDF4LOD_BASE_URI_OVERRIDE (not required, $CSV2RDF4LOD_BASE_URI will be used.)
--
CSV2RDF4LOD_CONVERT_NUMBER_EXAMPLE_ROWS (will default to: 2)
CSV2RDF4LOD_CONVERT_EXAMPLE_SUBSET_ONLY false
How? Set the variable in your shell environment:
CSV2RDF4LOD_CONVERT_EXAMPLE_SUBSET_ONLY="yes"
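As a minimal sketch, assuming a bash session and assuming the conversion scripts read the variable from the environment (so it needs to be exported rather than only assigned), the setting can be applied and checked like this. The "before"/"after" labels are only illustrative output.

```shell
# Show the current value, or a placeholder if the variable is not set.
echo "before: ${CSV2RDF4LOD_CONVERT_EXAMPLE_SUBSET_ONLY:-(unset)}"

# Request that only the example subset be converted.
# 'export' makes the value visible to child processes such as conversion scripts.
export CSV2RDF4LOD_CONVERT_EXAMPLE_SUBSET_ONLY="yes"

# Confirm the value that child processes will inherit.
echo "after:  ${CSV2RDF4LOD_CONVERT_EXAMPLE_SUBSET_ONLY}"
```

Running cr-vars.sh again afterwards should report the new value. To go back to the full conversion, set the variable back to the value shown earlier (false).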
The sample subset is written to the output file whose name ends in .sample.ttl
TODO: "example" vs. "sample". One is explicitly annotated in the enhancement parameters; the other is just the first N rows. Is the naming consistent? Check the Java params, the conversion: params, and the environment variables.