
Shell Script

elisasibarani edited this page Jun 15, 2016 · 1 revision

The easiest way to run Krextor is through its shell script. The syntax is:

krextor IN..OUT FILE.xml

where IN is one of the supported input formats and OUT is one of the supported output formats, each referenced by its ID (see Documentation).

Example:

krextor xhtml-rdfa..turtle homepage.xhtml
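When several files share the same format pair, the call above can be wrapped in a small loop. The following is only a sketch: the function name `krextor_batch` is hypothetical, it assumes `krextor` is on the `PATH` and writes its result to standard output, and naming each output file after the output format ID is an illustrative choice, not a Krextor convention.

```shell
# Hypothetical helper: run krextor with one IN..OUT pair over several files.
# Usage: krextor_batch IN OUT FILE...
krextor_batch() {
    in_format=$1
    out_format=$2
    shift 2
    for file in "$@"; do
        # Assumption: krextor prints the extracted RDF on standard output.
        # The result is stored next to the input, with the extension replaced
        # by the output format ID (e.g. homepage.xhtml -> homepage.turtle).
        krextor "${in_format}..${out_format}" "$file" \
            > "${file%.*}.${out_format}" || return 1
    done
}
```

For example, `krextor_batch xhtml-rdfa turtle homepage.xhtml about.xhtml` would run the extraction once per file and stop at the first failure.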

If no TransformationFile exists for the given combination, one that bundles the desired input and output formats is generated on the fly.

Source

source of the script

Pretty-printed output

The output of the shell script is not easy to read, as it is intended for machine consumption rather than for human readers. It can be reformatted, though, with the help of an RDF converter such as Jena's rdfcat.

Suppose rdfcat is set up as a shell alias for

java -cp $JENA_DIR/jena.jar:$JENA_DIR/antlr-2.7.5.jar:$JENA_DIR/commons-logging-api-1.1.jar:$JENA_DIR/xercesImpl.jar:$JENA_DIR/iri.jar:$JENA_DIR/icu4j_3_4.jar jena.rdfcat

Then one can get pretty-printed output as follows:

krextor IN..ntriples FILE | rdfcat -out N3 -t -
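In bash, aliases are not expanded in non-interactive scripts by default, so for scripted use the same setup can be written as shell functions. This is a minimal sketch under stated assumptions: `JENA_DIR` points at the directory holding the jars listed above, `krextor` is on the `PATH`, and the wrapper name `krextor_pretty` is hypothetical.

```shell
# Function equivalent of the rdfcat alias; extra arguments are passed through.
# Assumption: JENA_DIR is set to the directory containing the Jena jars.
rdfcat() {
    java -cp "$JENA_DIR/jena.jar:$JENA_DIR/antlr-2.7.5.jar:$JENA_DIR/commons-logging-api-1.1.jar:$JENA_DIR/xercesImpl.jar:$JENA_DIR/iri.jar:$JENA_DIR/icu4j_3_4.jar" jena.rdfcat "$@"
}

# Hypothetical wrapper around the pipeline shown above.
# Usage: krextor_pretty IN FILE
krextor_pretty() {
    # Extract N-Triples with Krextor, then let rdfcat re-serialise them as N3.
    krextor "$1..ntriples" "$2" | rdfcat -out N3 -t -
}
```

With these definitions, `krextor_pretty xhtml-rdfa homepage.xhtml` runs the same pipeline as the one-liner above.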

This has the advantage over Krextor's Turtle output that RDF data structures are rendered readably. Namespace prefixes are still not used, however, because that information is lost earlier, during Krextor's extraction.
