-
Notifications
You must be signed in to change notification settings - Fork 60
Description
Hello,
There's something that I've noticed while looking at the DCAT-AP Agent
model. It's got two type columns:
Type
, which is unique (?) to ittype
, which is present in every model and is always a single option enum (must be set toAgent
in this case)
I've already opened an issue for the Agent
case at smart-data-models/dataModel.DCAT-AP#11 but there's a general question related to the type column which the issue above made me think of and since the SQL schema generator is implemented in this repository (generate_sql_schema
) I've decided to open an issue here.
Does it even make sense to keep the type
column in the generated SQL schema files since they always result in a single value enum that is expected to always have the same value? I.e. for the model Agent
the value in the type
column will always be Agent
. The table name will also be Agent
for the Agent
model, therefore one can already infer the type from the table name.
Is there a reason why it was kept in the SQL schema? Perhaps so that raw JSON files (i.e. "denormalised" NGSI-LD) can be directly ingested into the table without preprocessing? In some cases (tools, databases, libraries - i.e. DuckDB) one would have to remove or explicitly ignore the type
property to import the JSON into a table without the type
column.
Thank you 🙂
Here's just a DuckDB example related to the last paragraph above in case that someone is dealing with similar issues 👇
insert into Agent by name select * exclude (type) from read_json('input.json', auto_detect=true);
Let's forget about the properties of the DCAT-AP Agent
model in this case. I just wanted to show how you can ignore a type
column that's implicitly created by DuckDB as it's ingesting a JSON file with a type
property in case that a table (named Agent
here) doesn't have that column. You can exclude multiple columns 🙂