-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support tombstone'd value states in the storage-layer #161
Conversation
…ients (not storage implementers)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, but let's have either Kevin or Jasleen approve 🙂 Thanks for working on this!
Quick question, I see that the PR is marked to close #130 - we are not yet taking care of the "deletion" of non-stale ValueStates
right?
@@ -187,6 +202,38 @@ impl Storage for AsyncInMemoryDatabase { | |||
Ok(map) | |||
} | |||
|
|||
async fn tombstone_value_states(&self, keys: &[ValueStateKey]) -> Result<(), StorageError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense for this API to accept a username and an epoch, where everything before the given epoch should be tombstoned? The caller does not need to query and figure out the existing value states beforehand, and the storage layer can implement a query + update that's optimized for the DB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could but that implies that all use cases for tombstone'ing is related to data retention. While that is "probably" accurate, there could be cases where you'd want to tombstone an intermediate value in non-key-transparency situations (i.e. We want the first and last value, and tombstone in the middle).
Additionally, constructing the ValueStateKey
s is relatively simple if we add a storage layer contract where you can say select a collection of value states between epochs (e.g. limited_key_history
). Does that all make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense! I was gunning for efficiency here and trying to avoid two round trips to the database (say with an UPDATE ... WHERE...
query). But I think it's an OK tradeoff for the ability to tombstone specific ValueStates
.
Hmm it does indeed seem like it's going to close the issue. I thought the term "Addressing" only linked and didn't close. There's some magic linking phrases in github, and now I can't seem to revert it (even deleting the statement) so it'll probably close and I'll re-open once the PR merges. |
akd/src/directory.rs
Outdated
)))); | ||
} | ||
|
||
// TODO: This call is quite expensive, since it's assumed most epochs do NOT have a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue was already solved in the original way key history was done. Essentially you want to get a user's version numbers at start_epoch
and end_epoch
and prove history proofs like below. Is there a reason you traverse over epochs as if they are version numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I understand what you're getting at, but it still pulled the entire history for an entry. Imagine the following scenario:
- User has account, does 10 key updates
- User deletes account, we tombstone 10 updates
- New user gets phone number of User in (1), they do 10 key updates
Now a key-history for the "new" user in (3) will show 20 entries. Now scale this to 100 updates, over 10 users, etc. Eventually we might not want all of the key history for a user, but say the history in the last X epochs. That's what this call is solving. So it's a "truncated" history, for a specific range of values. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment about limited_key_history
but otherwise this looks good. Thank you so much!
Also forcing key history proofs to be in max -> min, such that they go from highest version to lowest
@afterdusk or @Jasleen1 can you take a look to see if you're satisfied with my responses to your comments so we can get this merged? After this I think we're ready to release 0.5.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The revisions LGTM, let's merge this PR!
match u_guard.get(&username) { | ||
Some(old_states) => { | ||
let mut new_states = old_states.clone(); | ||
new_states.insert(value_state.epoch, value_state.clone()); | ||
u_guard.insert(username, new_states); | ||
} | ||
None => { | ||
let mut new_map = HashMap::new(); | ||
new_map.insert(value_state.epoch, value_state.clone()); | ||
u_guard.insert(username, new_map); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you've already tried this, but does entry()
and or_insert()
work here?
In realistic implementations of a key-directory, one will need to periodically prune values stored in the directory as per data-retention policies. (See #130 for more context). A method of pruning is converting protected values to "tombstones" which are a constant value identifying that the real value has been removed, but leaves the node in storage for posterity.
This PR poses a possibility of "tombstoning" data in the storage layer, meaning replacing it with said constant value. When validating a
KeyHistoryProof
, if a tombstoned state is encountered, the hash of the value is simply assumed to be correct since the raw data has been pruned. This means that records in the directory are left intact for a version history of updates.Additionally as a side-modification, the
AkdLabel
andAkdValue
were converted to byte-arrays from strings, as string encoding is a broad assumption and serialization should be left to the caller.Resolved #170