Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retrieve full revision history in all record responses #40

Open
2 of 4 tasks
pospi opened this issue Aug 16, 2019 · 25 comments
Open
2 of 4 tasks

Retrieve full revision history in all record responses #40

pospi opened this issue Aug 16, 2019 · 25 comments
Assignees
Labels
enhancement New feature or request graphql-api Something at the layer of what is and isn't implemented that affects the graphql api

Comments

@pospi
Copy link
Member

pospi commented Aug 16, 2019

@pospi pospi added the enhancement New feature or request label Aug 16, 2019
@pospi pospi added this to the Multi-DNA system core milestone Aug 16, 2019
@pospi pospi self-assigned this Aug 16, 2019
@pospi
Copy link
Member Author

pospi commented Aug 16, 2019

Should be aware of #9.

@fosterlynn
Copy link

@fosterlynn this will need some discussion in vf-graphql as to what to call & how to organise these fields.

OK. I'm good adding a creating user and date-time to all records. I think those will be different than the small number of created that are defined where we know we need them functionally. For one thing, created will refer to an actual agent; what you are talking about would be a user, more for debugging or tracking back what happened when there is a problem, or do I misunderstand?

If you are also thinking it will be a user, then yes we need to figure out how to make that more universal in terms of the graphql, let's discuss.

@pospi
Copy link
Member Author

pospi commented Aug 21, 2019

I'm thinking it's either a user or a group agent- basically some (REA) agent to which the record might be considered a record of interest, in that they somewhat have stewardship over it, given that they created it. They might have edit permissions by default, for example (at least for a short duration).

I concede that this is somewhat of an antiquated notion in distributed systems design. Does inScopeOf cover this / could it be used for the same purpose? Or should there be something synonymous with stewardingAgents as an additional field with the additional implication?

And all of this is getting into permissions & notifications territory...

@fosterlynn
Copy link

To try to sort out the agents involved:

  • Agents have relationships with each other in roles, which should give some basis for at least some of the authorization needs, as well as default visibility
  • If a flow, then there are provider and receiver agents
  • The inScopeOf generally will be a group agent (or at least has been so far), and that will cover some of the "record of interest" thing - agents related to the group in certain ways could have certain access to records in scope of the group

I'm thinking it's either a user or a group agent- basically some (REA) agent to which the record might be considered a record of interest, in that they somewhat have stewardship over it, given that they created it

I'd like to keep "user" and "agent" separated conceptually. Group agents don't have user credentials, and some person agents tend to have many user credentials, even within one technical infrastructure or application. But I think you probably do mean agents in this case?

agent to which the record might be considered a record of interest, in that they somewhat have stewardship over it, given that they created it

I was thinking of this more as just informational, if you need to know who created something to track something down or whatever, this was standard database practice back in the day. Not sure how that relates to distributed stuff. But seems like generally a good idea, irrespective of the functional requirements of HoloREA and agents.

Or should there be something synonymous with stewardingAgents as an additional field with the additional implication?

Is there something not covered in the list at the top of this comment? If so, let's take a look at it.

@pospi
Copy link
Member Author

pospi commented Aug 22, 2019

Agents have relationships with each other in roles, which should give some basis for at least some of the authorization needs, as well as default visibility

Check. What's unknown there is which data to anchor such roles & access control to. Perhaps it's the network ID, and anyone with an edit_records capability on that network is free to do what they want. But I think it's more fine-grained, and that the context of who originally authored the record is probably an important anchor. edit_own_records (or _commitments, or whatever) sounds like a desirable capability to me.

Maybe we just don't know yet until we start getting user-level requirements coming in.

We certainly don't have to explicitly store the authoring agent address, since Holochain does that by itself. But it might be worth having it injected anyway; as this prevents other agents from being able to mess with things that have been entered by others (they can't pass someone else's agent key).

@fosterlynn
Copy link

I think it's more fine-grained

Agree.

the context of who originally authored the record is probably an important anchor

Agree also.

BTW, I'm not arguing we shouldn't have created date and created by agent or user (let's figure that out), I think that is a good idea, although I'd also like to understand what HC gives us for free. And then think about how to put that into graphql in a fairly universal way.

OT, seems also like we need to chat about what is an agent, even sticking just with a person for now. Although group agent plays in here also. And also perhaps what our requirements might be for the ocaps permissions.

@pospi
Copy link
Member Author

pospi commented Aug 24, 2019

My thinking at present is that maybe the metadata should be returned as its own object, a sub-record of the main VF records. That way, those details only need to be resolved and returned if the UI is specifically interested in returning them; and we avoid clouding the main records with a bunch of repeated fields.

It might not just be author & date stuff, I can see us adding previous revision IDs and updating user IDs down the track to enable querying of a full provenance of each record. That would be a good springboard to locate archival versions for retrieval, which is something we need to do eventually. In any event this is all data that Holochain records for us automatically, so implementation is confined to reads.

The other part of this which I don't think I quite elaborated on properly in that last post is getting uniqueness of new records to avoid accidental conflicts. For example, I create an event which is just {action: receive}, it subsequently gets deleted, then someone else creates {action: receive} and we have a conflict. I suspect the minimum that is required to avoid this is to have author ID and creation time actually be stored in the record entry, which would prevent users being able to create conflicts on other users' entered data (they can't fake user ID) and on their own data (they can fake creation time but would only be hurting themselves and under normal operation this creates uniqueness between multiple distinct records with the same content). Does that make sense?

@fosterlynn
Copy link

My thinking at present is that maybe the metadata should be returned as its own object

That seems like a good idea. I like the idea of being able to expand on the data too.

The other part of this which I don't think I quite elaborated on properly in that last post is getting uniqueness of new records to avoid accidental conflicts.

Not sure I understand this. Won't every record have a unique identifier (a hash)? Or is this about getting the same hash on records that have say identical action, time, provider, etc.? If so, then yes, having the creation time would certainly help that.

Also a side note: we do have more required fields than we have documented, because we have several "either or" things. Should I document those somewhere?

they can fake creation time

We could fix that by not putting it on mutations, and just saving the server time when we save any record.

@pospi
Copy link
Member Author

pospi commented Aug 26, 2019

You've got it RE hashes- records will share the same hash if their content is the same; so we should probably ensure it is always different.

Should I document those somewhere?

Yes please! Ideally log a new issue for "add validation for 'either or' fields" or something.

just saving the server time when we save any record.

Unfortunately we can't- there is no "server time"; only node time. And the user's machine can set that however it wants whether the data comes from the browser or from the DNA. But this is a minor issue if providing the wrong time only hurts the caller (which it will, if agent hash is also injected).

@fosterlynn
Copy link

Unfortunately we can't- there is no "server time"; only node time. And the user's machine can set that however it wants whether the data comes from the browser or from the DNA. But this is a minor issue if providing the wrong time only hurts the caller (which it will, if agent hash is also injected).

Then sounds like node time makes sense. OK with you?

And we use the user token?

I'm not sure if we want it in the graphql reference or not. Do you think it is something that should be standardized across implementation technologies?

@pospi
Copy link
Member Author

pospi commented Aug 27, 2019

Maybe we could just standardise parts of it: I think created_by, modified_by, date_created & date_modified are lowest common denominator.

We can have the core VF schema extended with custom fields by the Holochain implementation; and the structure of the codebase will make it clear what of that is part of the VF spec and what is additional metadata for this implementation specifically. Would be good to proof that setup, too- lots of others will want examples to follow.

@pospi
Copy link
Member Author

pospi commented Dec 6, 2019

Prior to #75, this might be a good flow for using "trusted time", i.e. the time of the write operation: https://forum.holochain.org/t/how-to-provide-time-of-entry-authoring-automatically/1410/7

@pospi
Copy link
Member Author

pospi commented Dec 14, 2019

Figured out the simple flow now, it looks like this:

  • new request hits the zome API
  • entry is constructed and written
  • entry header is read to retrieve timestamp
  • timestamp is written to a separate entry and linked to the main entry. Timestamp is part of the link tag.
  • timestamp is injected into response data as creation_time or similar
  • response data is sent; request ends

It would necessitate an extra link read to provide creation_time when reading the record later, but I think one additional DHT operation is pretty minimal- you can skip reading the entry itself if the timestamp is written to the link tag.

@pospi
Copy link
Member Author

pospi commented Feb 23, 2022

Can probably ignore the above, the HDK has changed significantly since this was written and there will be new (probably native) ways to retrieve this metadata in HDK3.x.

Something related which now has to be done, now that the protocol spec has stabilised enough to proceed-

  • Enable the history module in the DEFAULT_VF_MODULES of @valueflows/vf-graphql-holochain

@Connoropolous Connoropolous added the graphql-api Something at the layer of what is and isn't implemented that affects the graphql api label Mar 4, 2022
@pospi
Copy link
Member Author

pospi commented Mar 25, 2022

Note also that this should be implemented in a transparent way within hdk_records and will probably affect most of the record_helpers.rs function signatures, since we should be returning this data in associated tuples with other record details such as ID and Entry data; and probably want to read it from native Holochain Element fields so that it can't be tampered with.

@Connoropolous
Copy link
Contributor

@pospi since you just mentioned this elsewhere that you were working on it, and consequently I was just reviewing this issue I just 'assigned' it to you, based on our chat about ways of working, if that's ok.

@pospi
Copy link
Member Author

pospi commented Apr 12, 2022

Ah, thankyou for being on top of it! :)

@pospi
Copy link
Member Author

pospi commented May 18, 2022

Work on this started in feature/40-record-revisions-metadata, but pushing to lower priority and will loop back once some of the higher-priority MMR work is completed.

Note that the complicated parts of the implementation have been done, most of the work remaining will be the gruntwork of adding appropriate API endpoints and record metadata to all the existing datatypes.

@Connoropolous
Copy link
Contributor

@pospi is the work for creation times and author metadata considered one-and-the-same as the 'revision history' work? Is it possible to get creation times and author metadata without going full on revision history?

@pospi
Copy link
Member Author

pospi commented May 23, 2022

Yeah. We can split this if you like or I can just do those quickly, actually they might just be this conversion. So we could merge that branch (feature/40-record-revisions-metadata) and import RevisionMeta without having to integrate/test RecordMeta as well (wasm-opt will definitely prune any dead code, if this doesn't happen earlier in the build process...)

@Connoropolous
Copy link
Contributor

I think I like that path, I recommend proceeding that way. There's a lot of utility in just having 'who created and when' at a basic level, before having, "and who updated and when"

@Connoropolous
Copy link
Contributor

@pospi do we consider this work done?

@pospi
Copy link
Member Author

pospi commented Aug 5, 2022

I don't think we have anything logged separately for full revision metadata, so I'll update the issue title to reflect what's needed.

@pospi pospi changed the title Add creation times & author metadata to all new records, and return on read Return previous revision metadata with records to allow UIs to recurse backwards through record versions Aug 5, 2022
@pospi pospi changed the title Return previous revision metadata with records to allow UIs to recurse backwards through record versions Retrieve author metadata in all record responses Aug 5, 2022
@Connoropolous
Copy link
Contributor

Nice, love the work breakdown.

@Connoropolous Connoropolous changed the title Retrieve author metadata in all record responses Retrieve full revision history in all record responses Aug 5, 2022
@pospi
Copy link
Member Author

pospi commented Aug 5, 2022

With the remaining two issues (#347 & #348), because they are quite heavy in terms of DHT ops it will be best to implement them as optional return data. read_revision_metadata_abbreviated should include #346 and #40 only.

To support these changes, and read_revision_metadata_full should be further split apart to support independently loading original & latest revisions where the data is already known, and all fields of RecordMeta except for retrieved_revision should be made optional. The struct can then be returned efficiently on a per-record basis, as follows:

  • For create_*, we can prefill the metadata with original_revision and latest_revision as the same revisions as the current one.
  • For update_*, we can prefill the metadata with latest_revision as the new one being created and determine original_revision as for reads.

We can also further optimise updates by returning the previous revision's metadata from hdk_records::records::update_record and prefilling that as previous_revision, but we may not need to since it's read internally in the update logic anyway and so probably would be read from the local cache in subsequent requests?

For read API requests (whether loading individual records or lists of them), we should add an extra parameter for determining whether to load full metadata for the record. This parameter should be set in the revision(revisionId: ID!) resolver for that type if the input record is queried with latestRevision, originalRevision, previousRevisionsCount or futureRevisionsCount set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request graphql-api Something at the layer of what is and isn't implemented that affects the graphql api
Projects
None yet
Development

No branches or pull requests

3 participants