Explore alternative record architecture without "key indexes" #60

pospi · 2019-09-02T08:39:31Z

When I ran through the first pass of inter-DNA linking, we stored "base" entries (*) holding the address of the first version of an entry. This was mostly to keep record IDs consistent between entry updates and across networks. After implementing the second pass, which does not use "base" entries as link targets but instead writes metadata about the target link as a JSON-based entry, storing such consistent record hashes seems less necessary.

(*) These have since been renamed to "key indexes"; please substitute as appropriate when you see the older terminology.

It may be possible to link directly between entries whilst always referring to them by their consistent initial hash without incurring any additional storage overhead. For example, we would no longer need the consistently-identified EVENT_BASE_ENTRY_TYPE linking to the underlying EVENT_ENTRY_TYPE that has a roaming hash.

- When creating new records via create_record, instead of storing the base entry address, just return the initial hash coming from commit_entry.
- When calling update_record there would no longer be any need to dereference the entry; however, it may be necessary to read the entry metadata in order to determine the most recent version hash. The initial hash (rather than the most recent entry hash) would be returned from this method as an identifier for the record that remains consistent between updates.
  - Another option is for update_record to accept a revision ID (read: actual entry hash) rather than a record ID (read: hash of first entry), which would necessitate returning this record metadata in responses (see Retrieve full revision history in all record responses #40). This method would also be better for avoiding undesirable update conflicts.
- delete_record may have the same revision ID / record ID concern as update, with the addition that there is no longer any base entry to delete.
- read_record_entry takes the record ID initially returned from commit_entry and automatically follows the update metadata through to the latest version of the entry; there is no longer any reason to dereference the base entry. We may optionally wish to validate that no previous versions of the provided entry address exist, to ensure that revision IDs cannot be incorrectly used as record locators.
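
The record operations above can be sketched with a toy in-memory store. This is only an illustration under stated assumptions, not the Holochain HDK API: the `Store` type, its fields, and the counter-based addresses are all hypothetical stand-ins, and `update_record` here follows the "revision ID" option rather than the "record ID" one.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an entry address (a real DHT would use content hashes).
type Hash = String;

/// Toy in-memory store; every name here is an assumption made for illustration.
#[derive(Default)]
struct Store {
    entries: HashMap<Hash, String>, // address -> entry content
    replaced_by: HashMap<Hash, Hash>, // update metadata: old revision -> next revision
    replaces: HashMap<Hash, Hash>,  // reverse metadata: revision -> previous revision
    next_id: u64,
}

impl Store {
    /// Stores an entry and returns a fake address for it.
    fn commit_entry(&mut self, content: &str) -> Hash {
        self.next_id += 1;
        let address = format!("entry-{}", self.next_id);
        self.entries.insert(address.clone(), content.to_string());
        address
    }

    /// The record ID is simply the address of the first version; no base entry is stored.
    fn create_record(&mut self, content: &str) -> Hash {
        self.commit_entry(content)
    }

    /// Follows the update metadata from any address to the most recent revision.
    fn latest_revision(&self, address: &Hash) -> Hash {
        let mut current = address.clone();
        while let Some(next) = self.replaced_by.get(&current) {
            current = next.clone();
        }
        current
    }

    /// Updates against an explicit revision ID and rejects stale revisions.
    /// Returns the new revision ID.
    fn update_record(&mut self, revision_id: &Hash, content: &str) -> Result<Hash, String> {
        if !self.entries.contains_key(revision_id) {
            return Err("unknown revision".to_string());
        }
        if self.replaced_by.contains_key(revision_id) {
            return Err("stale revision: a newer version already exists".to_string());
        }
        let new_revision = self.commit_entry(content);
        self.replaces.insert(new_revision.clone(), revision_id.clone());
        self.replaced_by.insert(revision_id.clone(), new_revision.clone());
        Ok(new_revision)
    }

    /// Reads the latest version of a record, refusing revision IDs as locators.
    fn read_record_entry(&self, record_id: &Hash) -> Result<&String, String> {
        if self.replaces.contains_key(record_id) {
            return Err("revision ID used as record locator".to_string());
        }
        let latest = self.latest_revision(record_id);
        self.entries.get(&latest).ok_or_else(|| "record not found".to_string())
    }
}
```

In this sketch a caller keeps the address returned by `create_record` as the stable record ID, and passes the most recent revision ID to `update_record`; an update against an older revision is rejected rather than silently creating a conflicting branch.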

Aside from restructuring the zome link! definitions to remove the indirection, I don't think anything needs to change in the linking API. Provided all links continue to use the initial version of an entry, they should all still be readable in a single query for field traversals. It'll just be different link type names.

pospi · 2019-09-27T01:56:28Z

Other considerations and potential patterns to explore:

- Is shadowing link data in the entry fields advisable, in order to speed up reads? This would also mean updates to link field data were reflected in the entry, so one could just follow the entry changelog to see when updates occurred (rather than also having to inspect related link entries). Of course, this would not apply to "indirect indexes" which use an intermediary entry to hold a compound key value.
- What is optimal?
  - read methods that crawl links for each version of an initial entry; or
  - linking everything to the initial entry (as currently implemented); or
  - duplicating all links alongside the new version of the entry?
  - ...probably the last? It is consistent with the idea of versioning "carrying over" unmodified data into the new version, and results in "always complete", easily reconstructable version data (rather than the complex link-traversal logic that some of the other configurations would need).
- Should we actually be doing the "stable ID" thing, or does this cause issues with update logic & network partitions? (CAS updates are easy to do conflict detection on since the exact version is specified in each update.) If we don't, we need to build index synchronisation logic to clean up old versions linked in other DNAs, or manage cascading updates so that records in remote DNAs can make accurate inferences about the number of external records linking in.
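
The "duplicating all links alongside the new version" option could look roughly like the sketch below. It is a toy model under stated assumptions: `LinkStore`, `carry_over`, and the string addresses and link type names are hypothetical and do not correspond to Holochain's link API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an entry address.
type Hash = String;

/// Links stored per revision: revision address -> (link type -> target addresses).
#[derive(Default)]
struct LinkStore {
    links: HashMap<Hash, HashMap<String, Vec<Hash>>>,
}

impl LinkStore {
    /// Adds a single link from `base` to `target` under `link_type`.
    fn add_link(&mut self, base: &str, link_type: &str, target: &str) {
        self.links
            .entry(base.to_string())
            .or_default()
            .entry(link_type.to_string())
            .or_default()
            .push(target.to_string());
    }

    /// Copies every link of the previous revision onto the new revision,
    /// so that each revision carries a complete set of links.
    fn carry_over(&mut self, previous: &str, next: &str) {
        let copied = self.links.get(previous).cloned().unwrap_or_default();
        let destination = self.links.entry(next.to_string()).or_default();
        for (link_type, targets) in copied {
            destination.entry(link_type).or_default().extend(targets);
        }
    }

    /// Reads links for one revision directly, with no traversal of older versions.
    fn get_links(&self, revision: &str, link_type: &str) -> Vec<Hash> {
        self.links
            .get(revision)
            .and_then(|by_type| by_type.get(link_type))
            .cloned()
            .unwrap_or_default()
    }
}
```

The trade-off in this sketch is that reads of any revision need a single lookup, at the cost of storing each unchanged link again for every new revision.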

pospi · 2019-11-14T07:11:17Z

Another nod towards "shadowing the link data in the entry fields": valueflows/vf-apps#5 (comment)

For constraints like "if the event action is not defined as input and/or output, it should not be related to a process", not including the link field data in the entry actually makes validation logic far more difficult.
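
As a rough illustration of why shadowed link fields help here, the constraint can be checked from the entry alone when the process links are present as fields. The types below are hypothetical simplifications (the `ProcessRole` enum, the `Action` and `EconomicEvent` structs, and the `input_of` / `output_of` field names are assumptions for this sketch), and a real validation callback would also need the action definition to be available.

```rust
/// Hypothetical marker for how an action relates to a process.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProcessRole {
    Input,
    Output,
}

/// Simplified action definition; `process_role` is None when the action
/// is not defined as an input and/or output.
struct Action {
    id: String,
    process_role: Option<ProcessRole>,
}

/// Simplified event entry with the link data shadowed into its fields.
struct EconomicEvent {
    action: Action,
    input_of: Option<String>,  // address of the process this event is an input of
    output_of: Option<String>, // address of the process this event is an output of
}

/// Rejects events that reference a process although their action has no process role.
fn validate_event(event: &EconomicEvent) -> Result<(), String> {
    let relates_to_process = event.input_of.is_some() || event.output_of.is_some();
    if event.action.process_role.is_none() && relates_to_process {
        return Err(format!(
            "action '{}' is not an input or output, so the event must not reference a process",
            event.action.id
        ));
    }
    Ok(())
}
```

Without the shadowed fields, the same check would have to query the links attached to the entry during validation, which is the extra difficulty described above.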

We also need to ensure support for calling in to bridged DNAs during validation calls in order to fulfill constraints like the above. (CC @pdaoust)

pospi · 2020-03-12T11:19:12Z

Further reflections to be had RE https://infocentral.org/drafts/PrinciplesDraft.html

pospi · 2022-04-27T07:06:54Z

Closing this one as superseded, given the newer approach of keeping indexes separate from CRUD entries, not to mention the new Holochain architecture of using headers for updates & deletes.

It does seem we are on the final pass of indexing cleanup, and with #84 (comment) and #264 being addressed we should be in a good place for an MVP.

pospi added the enhancement New feature or request label Sep 2, 2019

pospi added this to the Production-ready core components milestone Sep 2, 2019

pospi modified the milestones: Production-ready core components, DNA code quality, polish and cleanup Nov 7, 2019

pospi mentioned this issue Nov 27, 2019

Optimal architecture for DHT record logic #3

Closed

pospi changed the title ~~Explore alternative record architecture without "base entries"~~ Explore alternative record architecture without "key indexes" Jan 9, 2020

pospi mentioned this issue Mar 12, 2020

Entry updates refactoring holochain-devcamp/learning-pathways#4

Open

pospi modified the milestones: DNA code quality, polish and cleanup, Holochain core stabilising Feb 17, 2022

pospi closed this as completed Apr 27, 2022
