id generation #779

matko · 2021-11-17T11:31:49Z

matko
Nov 17, 2021
Maintainer

The (proposed) idgen path algorithm

The structure of an ID

Ids are used to refer to (sub)documents. An id is built out of the following components:

A base prefix: the base prefix of the root document, that is, the document that is not a subdocument.
A sequence of path components. Each of these can be
-- a type
-- a property
-- an index
A content descriptor, describing the (sub)document. This can be
-- a random string
-- the concatenated key fields of the (sub)document
-- a hash of the concatenated key fields of the (sub)document
-- a valuehash, which is the hash of all fields of the (sub)document, concatenated as if they were all key fields.

The final id is generated as follows:

All path components are concatenated with slashes between them and a final slash at the end.
To this, the content descriptor is appended.
The result is appended to the base prefix.
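
The three steps above can be sketched in Python roughly as follows. This is only an illustration of the conceptual assembly, not the actual idgen code: the function name `build_id`, the example base prefix, and the example path values are all hypothetical, and the sketch assumes that an empty path contributes nothing to the id.

```python
from typing import Sequence, Union

# A path component is a type name, a property name, or an index.
PathComponent = Union[str, int]


def build_id(base_prefix: str, path: Sequence[PathComponent], descriptor: str) -> str:
    """Assemble an id from its three parts (hypothetical helper)."""
    # Writing a slash after every component gives slashes between
    # components plus the final slash at the end.
    path_part = "".join(f"{component}/" for component in path)
    # The content descriptor is appended to the path, and the result
    # is appended to the base prefix.
    return base_prefix + path_part + descriptor


# Hypothetical values, purely for illustration.
root_id = build_id("base/", ["Person"], "jane")
sub_id = build_id("base/", ["Person", "address", 0, "Address"], "a1b2c3")
```

The order of type, property, and index components in the second call is an assumption made for the example; the text above only says which kinds of components can occur.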

While this is conceptually what happens, actual idgen may be implemented more efficiently by recursively descending into a document and extending an id string each time a new path component is descended into.
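
A minimal sketch of that recursive variant, under several assumptions that are not part of the proposal: documents are nested Python dicts with an `@type` key, subdocuments are dict values or dicts inside lists, path components appear in the order type, property, index, and the content descriptor comes from a caller-supplied `describe` function. The name `assign_ids` is hypothetical.

```python
from typing import Any, Callable, Dict, Iterator, Tuple

Doc = Dict[str, Any]


def assign_ids(doc: Doc, prefix: str, describe: Callable[[Doc], str]) -> Iterator[Tuple[str, Doc]]:
    """Yield (id, (sub)document) pairs, extending the prefix while descending."""
    # Descending into this (sub)document adds its type as a path component.
    here = f"{prefix}{doc['@type']}/"
    yield here + describe(doc), doc

    for prop, value in doc.items():
        if prop == "@type":
            continue
        if isinstance(value, dict):
            # A single subdocument: the property becomes a path component.
            yield from assign_ids(value, f"{here}{prop}/", describe)
        elif isinstance(value, list):
            # A list of subdocuments: property and index become path components.
            for index, item in enumerate(value):
                if isinstance(item, dict):
                    yield from assign_ids(item, f"{here}{prop}/{index}/", describe)


# Hypothetical document; the descriptor here is simply the "name" field.
example = {
    "@type": "Person",
    "name": "jane",
    "address": {"@type": "Address", "name": "home"},
    "pets": [{"@type": "Pet", "name": "rex"}],
}
ids = [doc_id for doc_id, _ in assign_ids(example, "base/", lambda d: d["name"])]
```

The point of the sketch is only that the prefix string is built once per level and shared by everything below it, instead of re-concatenating the full path for every (sub)document.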

Type information

Each type carries with it the following properties that are relevant for its id:

A base prefix, where documents are to be placed. If omitted, this is assumed to be the active base prefix of the schema. This in turn implies that any time the active base prefix of the schema changes, this counts as a type change for every type that doesn't specify an explicit base prefix. This is only relevant for Documents, as Subdocuments will be placed under a path.
Type name info, namely:
-- A type prefix. This is the type uri this type lives at, minus the type name. Id path components for subdocuments will be compressed relative to this prefix.
-- A type name. This is the type uri without the type prefix.
-- These two properties are not currently stored separately. Instead we store the type uri, from which it is unclear exactly where the prefix ends and the name begins. We'll either need to fix this in all existing schemas, or use a heuristic. For now, we can probably assume that the type prefix is anything up to and including the last '#' in the type uri, or failing that, an empty string. The type name is the rest of the string.
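
The heuristic described in the last point could look something like this in Python. The function name `split_type_uri` and the example uris are hypothetical; the splitting rule itself is the one stated above.

```python
from typing import Tuple


def split_type_uri(type_uri: str) -> Tuple[str, str]:
    """Split a type uri into (type prefix, type name) using the '#' heuristic."""
    # rfind returns -1 when there is no '#', so cut becomes 0 and the
    # type prefix falls back to the empty string.
    cut = type_uri.rfind("#") + 1
    return type_uri[:cut], type_uri[cut:]


# Hypothetical uris, purely for illustration.
with_hash = split_type_uri("http://example.com/schema#Person")
without_hash = split_type_uri("Person")
```

With this rule, a type uri without any '#' is treated entirely as the type name.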

~~ not quite done ~~
