Skip to content

generate_sql_schema - redundant type column #77

@rokf

Description

@rokf

Hello,

There's something that I've noticed while looking at the DCAT-AP Agent model. It's got two type columns:

  • Type, which is unique (?) to it
  • type, which is present in every model and is always a single option enum (must be set to Agent in this case)

I've already opened an issue for the Agent case at smart-data-models/dataModel.DCAT-AP#11 but there's a general question related to the type column which the issue above made me think of and since the SQL schema generator is implemented in this repository (generate_sql_schema) I've decided to open an issue here.

Does it even make sense to keep the type column in the generated SQL schema files since they always result in a single value enum that is expected to always have the same value? I.e. for the model Agent the value in the type column will always be Agent. The table name will also be Agent for the Agent model, therefore one can already infer the type from the table name.

Is there a reason why it was kept in the SQL schema? Perhaps so that raw JSON files (i.e. "denormalised" NGSI-LD) can be directly ingested into the table without preprocessing? In some cases (tools, databases, libraries - i.e. DuckDB) one would have to remove or explicitly ignore the type property to import the JSON into a table without the type column.

Thank you 🙂


Here's just a DuckDB example related to the last paragraph above in case that someone is dealing with similar issues 👇

insert into Agent by name select * exclude (type) from read_json('input.json', auto_detect=true);

Let's forget about the properties of the DCAT-AP Agent model in this case. I just wanted to show how you can ignore a type column that's implicitly created by DuckDB as it's ingesting a JSON file with a type property in case that a table (named Agent here) doesn't have that column. You can exclude multiple columns 🙂

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions