dcat:distribution, model fix, inspect API updates #911

@canwaf

With the (since yanked) csvcubed 0.5.0 release, we adopted the following change to the object model:

<4g-coverage.csv#dataset> <http://purl.org/dc/terms/description> "4G coverage in the UK by geographic area" ;
	<http://purl.org/dc/terms/title> "4G Coverage in the UK" ;
	<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/linked-data/cube#Attachable>, <http://purl.org/linked-data/cube#DataSet>, <http://www.w3.org/2000/01/rdf-schema#Resource>, <http://www.w3.org/ns/dcat#Distribution>, <http://www.w3.org/ns/dcat#Resource> .

This affects csvcubed's inspect command, which runs the query in https://github.com/GSS-Cogs/csvcubed/blob/main/src/csvcubed/inspect/sparql_handler/sparql_queries/select_catalog_metadata.sparql. That query primarily looks for the dcat:Dataset:

        SELECT DISTINCT ?dataset
        WHERE {
            GRAPH ?someGraph {
                ?dataset a dcat:Dataset.
            }
        }

The dcat:Dataset is no longer present, but it should be. Consider the application profile in which the CSV-W is the distribution. This leads us to the following:

<4g-coverage.csv#csvqb> a <http://purl.org/linked-data/cube#Attachable>, <http://purl.org/linked-data/cube#DataSet>, <http://www.w3.org/2000/01/rdf-schema#Resource>, <http://www.w3.org/ns/dcat#Distribution>, <http://www.w3.org/ns/dcat#Resource> ;
    <http://www.w3.org/ns/dcat#isDistributionOf> <4g-coverage.csv#dataset> .
<4g-coverage.csv#dataset> <http://purl.org/dc/terms/description> "4G coverage in the UK by geographic area" ;
	<http://purl.org/dc/terms/title> "4G Coverage in the UK" .

So the catalogue metadata stays attached to the dataset, while the CSV-W's primary subject is now the resource typed as qb:Attachable, qb:DataSet, dcat:Distribution, etc.

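This should allow the SPARQL query to remain unchanged, provided <4g-coverage.csv#dataset> is also typed as a dcat:Dataset (that type is not shown in the snippet above).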
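
As a rough illustration of why the query result changes between the two models, here is a minimal plain-Python sketch. It is not csvcubed code: the `select_datasets` helper and the tuple-based triples are hypothetical stand-ins for the SPARQL query, the GRAPH clause is ignored, and only a subset of the triples from the Turtle above is reproduced. It also assumes that `<4g-coverage.csv#dataset>` carries an explicit dcat:Dataset type in the proposed model.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT = "http://www.w3.org/ns/dcat#"
QB = "http://purl.org/linked-data/cube#"
DCT = "http://purl.org/dc/terms/"


def select_datasets(triples):
    """Hypothetical stand-in for: SELECT DISTINCT ?dataset WHERE { ?dataset a dcat:Dataset }."""
    return sorted(
        {s for (s, p, o) in triples if p == RDF_TYPE and o == DCAT + "Dataset"}
    )


# Subset of the 0.5.0 object model: nothing is typed dcat:Dataset.
model_0_5_0 = {
    ("4g-coverage.csv#dataset", RDF_TYPE, QB + "DataSet"),
    ("4g-coverage.csv#dataset", RDF_TYPE, DCAT + "Distribution"),
    ("4g-coverage.csv#dataset", DCT + "title", "4G Coverage in the UK"),
}

# Subset of the proposed model.
# ASSUMPTION: #dataset is explicitly typed dcat:Dataset.
proposed_model = {
    ("4g-coverage.csv#csvqb", RDF_TYPE, QB + "DataSet"),
    ("4g-coverage.csv#csvqb", RDF_TYPE, DCAT + "Distribution"),
    ("4g-coverage.csv#csvqb", DCAT + "isDistributionOf", "4g-coverage.csv#dataset"),
    ("4g-coverage.csv#dataset", RDF_TYPE, DCAT + "Dataset"),
    ("4g-coverage.csv#dataset", DCT + "title", "4G Coverage in the UK"),
}

print(select_datasets(model_0_5_0))     # no match in the 0.5.0 model
print(select_datasets(proposed_model))  # the dataset is found again
```

With the 0.5.0 triples the lookup comes back empty, which is what breaks inspect; with the proposed triples it returns the `#dataset` subject, so the existing query would not need to change.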

The metadata attached to the dcat:Distribution should be at most the following. (Note: these are not requirements. We should add whatever we can fill in from information we already have; nothing new, please.)

classDiagram

class Distribution["Distribution a dcat:Distribution"] {
    +dcterms:identifier ∋ rdfs:Literal as xsd:string
    +dcterms:created ∋ rdfs:Literal as xsd:dateTime
    +dcterms:creator ∋ foaf:Agent
    +dcterms:issued ∋ rdfs:Literal as xsd:dateTime
    +prov:wasDerivedFrom ∋ [prov:Entity]
    +prov:wasGeneratedBy ∋ prov:Activity
    +dcat:downloadURL ∋ rdfs:Resource
    +dcat:byteSize ∋ rdfs:Literal as xsd:nonNegativeInteger
    +dcat:mediaType ∋ dcterms:MediaType
    +wdrs:describedBy ∋ rdfs:Resource
    +spdx:checksum ∋ spdx:Checksum
}
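
Some of these properties can be derived from the CSV file itself, which fits the "only what we already have" constraint. The following is a hedged sketch, not csvcubed's implementation: the function name, the dictionary layout, the choice of SHA-256, and the use of the IANA `text/csv` media type URI are all assumptions made for illustration.

```python
import hashlib


def describe_distribution_bytes(data: bytes) -> dict:
    """Hypothetical helper: derive distribution metadata from the raw CSV bytes."""
    return {
        # dcat:byteSize is the size of the distribution in bytes.
        "dcat:byteSize": len(data),
        # ASSUMPTION: the CSV is described with the IANA text/csv media type.
        "dcat:mediaType": "https://www.iana.org/assignments/media-types/text/csv",
        # ASSUMPTION: SHA-256 is used as the checksum algorithm.
        "spdx:checksum": {
            "algorithm": "sha256",
            "checksumValue": hashlib.sha256(data).hexdigest(),
        },
    }


# Example with in-memory bytes; a real caller would read the CSV file instead.
example = describe_distribution_bytes(b"a,b\n1,2\n")
print(example["dcat:byteSize"])
print(example["spdx:checksum"]["checksumValue"])
```

Properties such as dcterms:creator or prov:wasDerivedFrom cannot be computed this way and would only be filled in where the user has already supplied them.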

tl;dr: the main subject of the CSV-W metadata file should be <dataset.csv#csvqb>, which is dcat:isDistributionOf the dcat:Dataset. The dcat:Dataset is the resource that should have the catalogue metadata attached to it.
