
Logging

timrdf edited this page Sep 30, 2011 · 32 revisions

conversion cockpit's doc/logs/*.txt

A log file is created every time the conversion trigger is pulled. It is placed in the conversion cockpit's doc/logs/ directory.

$CSV2RDF4LOD_HOME/bin/convert.sh and $CSV2RDF4LOD_HOME/bin/convert-aggregate.sh always log messages to:

CSV2RDF4LOD_LOG="doc/logs/csv2rdf4lod_log_e${eID}_`date +%Y-%m-%dT%H_%M_%S`.txt"
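For concreteness, here is how that naming pattern expands in the shell (eID is the enhancement identifier; the value 1 below is only an example):

```shell
# Expand the log-file naming pattern used by convert.sh (eID=1 is illustrative).
eID=1
CSV2RDF4LOD_LOG="doc/logs/csv2rdf4lod_log_e${eID}_$(date +%Y-%m-%dT%H_%M_%S).txt"
echo "$CSV2RDF4LOD_LOG"
# e.g. doc/logs/csv2rdf4lod_log_e1_2011-03-29T13_37_26.txt
```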

The number of logs in this directory is asserted as conversion:num_invocation_logs in the aggregated data dump Turtle file publish/<dataset-id>-<version-id>.ttl.

Trimming logs

Although keeping the logs around is useful, they can get big. We can trim them down so they take up less space while still indicating the amount of effort put into enhancing the dataset.

When we are at the [data root](csv2rdf4lod automation data root):

$ cr-pwd.sh 
source/

We can run $CSV2RDF4LOD_HOME/bin/util/cr-trim-logs.sh to skim through the current size of each log and the size it will become if trimmed:

$ cr-trim-logs.sh
...
...
========== source/data-gov/1554/version/2011-Jan-12 ========================================

319M doc/logs total
doc/logs/csv2rdf4lod_log_e1_2011-03-29T13_37_26.txt   24 -> 12
doc/logs/csv2rdf4lod_log_e1_2011-03-29T13_39_16.txt   24 -> 12
doc/logs/csv2rdf4lod_log_e1_2011-03-29T13_42_29.txt   28 -> 12
doc/logs/csv2rdf4lod_log_e1_2011-03-29T13_42_44.txt   328 -> 12
doc/logs/csv2rdf4lod_log_e1_2011-03-29T13_44_41.txt   24 -> 12
doc/logs/csv2rdf4lod_log_e1_2011-03-29T13_44_48.txt   328 -> 12
doc/logs/csv2rdf4lod_log_e1_2011-03-29T13_59_12.txt   19344 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_00_30.txt   19344 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_12_31.txt   19344 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_14_59.txt   28 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_17_32.txt   28 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_19_19.txt   28 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_21_26.txt   28 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_24_47.txt   28 -> 12
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_25_10.txt   28 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_34_23.txt   28 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_35_41.txt   19904 -> 4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_41_59.txt   19908 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_44_37.txt   4
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_51_12.txt   19908 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-03-29T14_52_55.txt   19916 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-03-29T15_08_18.txt   4304 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-03-29T15_08_38.txt   19916 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-03-29T15_11_07.txt   40 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-03-29T15_12_19.txt   40 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-03-29T15_14_22.txt   40 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-03-29T15_56_35.txt   44 -> 8
doc/logs/csv2rdf4lod_log_e1_2011-04-12T08_47_21.txt   44 -> 8
doc/logs/csv2rdf4lod_log_raw_2011-03-21T14_20_41.txt   40 -> 12

Note: did not trim logs. Use cr-trim-logs.sh -w to modify doc/logs/*.txt
...
...

$ cr-trim-logs.sh | grep total
604K doc/logs total
116K doc/logs total
319M doc/logs total
216K doc/logs total
136K doc/logs total
16K doc/logs total
12K doc/logs total
24K doc/logs total
24K doc/logs total
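The idea behind trimming can be sketched as follows. This is NOT the actual cr-trim-logs.sh implementation, just an illustration of shrinking a log while keeping enough of its head and tail to evidence the invocation ("big.log" is a hypothetical file name):

```shell
# Illustrative trimming sketch: keep the first and last few lines of a log.
log=big.log
seq 1 1000 > "$log"                                       # stand-in for a 1000-line log
{ head -n 4 "$log"; echo '...'; tail -n 4 "$log"; } > "$log.trimmed"
mv "$log.trimmed" "$log"
wc -l < "$log"                                            # 9 lines remain
```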

Loading and deleting a graph

These logs live under $CSV2RDF4LOD_HOME/tmp/ because Virtuoso needs permission to write there.

$CSV2RDF4LOD_HOME/bin/util/virtuoso/vload stores logs to $CSV2RDF4LOD_HOME/tmp/vload-tmp/input-files/*.log with the latest at $CSV2RDF4LOD_HOME/tmp/vload-tmp/input-files/latest.log.

$CSV2RDF4LOD_HOME/bin/util/virtuoso/vdelete stores logs to $CSV2RDF4LOD_HOME/tmp/vdelete-tmp/*.log with the latest at $CSV2RDF4LOD_HOME/tmp/vdelete-tmp/latest.log.
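The "latest" convention above can be sketched like this. The directory layout follows the vload path from the text, but the log content and the symlink mechanism are assumptions for illustration; vload's actual implementation may differ:

```shell
# Sketch of a timestamped log plus a "latest.log" symlink (hypothetical content).
logdir=tmp/vload-tmp/input-files
mkdir -p "$logdir"
log="$logdir/$(date +%Y-%m-%dT%H_%M_%S).log"
echo "load complete" > "$log"
ln -sf "$(basename "$log")" "$logdir/latest.log"          # latest.log tracks newest run
cat "$logdir/latest.log"
```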

populate-endpoint.sh

populate-endpoint.sh needs to be generalized beyond LOGD. It loads metadata from all conversions into a named graph and caches query results to static files to reduce endpoint load when supporting a web site. Its logs are written to:

${CSV2RDF4LOD_HOME}/log/populate-endpoint.sh/*.log

Turning on the converter's logging

In debugging situations, I might have you turn this on. It should rarely be needed.

The Java implementation uses java.util.logging to log.

Logging is controlled by the CSV2RDF4LOD_CONVERT_DEBUG_LEVEL environment variable and takes effect within $CSV2RDF4LOD_HOME/bin/convert.sh:

javaprops="-Djava.util.logging.config.file=$CSV2RDF4LOD_HOME/bin/logging/finest.properties"
#javaprops=""

So,

$ export CSV2RDF4LOD_CONVERT_DEBUG_LEVEL=finer

Valid values are fine, finer, and finest.

$CSV2RDF4LOD_HOME/bin/logging/ contains fine.properties, finer.properties, and finest.properties.
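A simplified sketch of how the environment variable could select the properties file (the real logic lives in convert.sh; the fallback install path below is an assumption for runnability):

```shell
# Map CSV2RDF4LOD_CONVERT_DEBUG_LEVEL to a java.util.logging config file.
CSV2RDF4LOD_HOME=${CSV2RDF4LOD_HOME:-/opt/csv2rdf4lod}    # assumed install path
CSV2RDF4LOD_CONVERT_DEBUG_LEVEL=finer
if [ -n "$CSV2RDF4LOD_CONVERT_DEBUG_LEVEL" ]; then
  javaprops="-Djava.util.logging.config.file=$CSV2RDF4LOD_HOME/bin/logging/${CSV2RDF4LOD_CONVERT_DEBUG_LEVEL}.properties"
else
  javaprops=""                                            # logging stays off
fi
echo "$javaprops"
```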

(If you REALLY want to get your hands dirty, add your.properties to $CSV2RDF4LOD_HOME/bin/logging/ and set CSV2RDF4LOD_CONVERT_DEBUG_LEVEL to your.)
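For reference, a minimal custom properties file might look like the following. These are standard java.util.logging configuration keys; the specific handler and levels are only one reasonable choice, not what the converter's bundled files necessarily contain:

```properties
# Hypothetical your.properties: send FINER-level records to the console.
handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level=FINER
java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter
.level=FINER
```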
