Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support tombstone'd value states in the storage-layer #161

Merged
merged 21 commits into from
Mar 15, 2022
Merged

Conversation

slawlor
Copy link
Contributor

@slawlor slawlor commented Mar 3, 2022

In realistic implementations of a key-directory, one will need to periodically prune values stored in the directory as per data-retention policies. (See #130 for more context). A method of pruning is converting protected values to "tombstones" which are a constant value identifying that the real value has been removed, but leaves the node in storage for posterity.

This PR poses a possibility of "tombstoning" data in the storage layer, meaning replacing it with said constant value. When validating a KeyHistoryProof, if a tombstoned state is encountered, the hash of the value is simply assumed to be correct since the raw data has been pruned. This means that records in the directory are left intact for a version history of updates.

Additionally as a side-modification, the AkdLabel and AkdValue were converted to byte-arrays from strings, as string encoding is a broad assumption and serialization should be left to the caller.

Resolved #170

@slawlor slawlor added the enhancement New feature or request label Mar 3, 2022
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 3, 2022
@slawlor slawlor marked this pull request as ready for review March 3, 2022 16:13
@slawlor slawlor linked an issue Mar 3, 2022 that may be closed by this pull request
Copy link
Member

@afterdusk afterdusk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, but let's have either Kevin or Jasleen approve 🙂 Thanks for working on this!

Quick question, I see that the PR is marked to close #130 - we are not yet taking care of the "deletion" of non-stale ValueStates right?

akd/src/client.rs Outdated Show resolved Hide resolved
akd/src/client.rs Outdated Show resolved Hide resolved
akd/src/client.rs Show resolved Hide resolved
akd/src/directory.rs Outdated Show resolved Hide resolved
akd/src/directory.rs Outdated Show resolved Hide resolved
akd/src/storage/memory.rs Show resolved Hide resolved
@@ -187,6 +202,38 @@ impl Storage for AsyncInMemoryDatabase {
Ok(map)
}

async fn tombstone_value_states(&self, keys: &[ValueStateKey]) -> Result<(), StorageError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for this API to accept a username and an epoch, where everything before the given epoch should be tombstoned? The caller does not need to query and figure out the existing value states beforehand, and the storage layer can implement a query + update that's optimized for the DB.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could but that implies that all use cases for tombstone'ing is related to data retention. While that is "probably" accurate, there could be cases where you'd want to tombstone an intermediate value in non-key-transparency situations (i.e. We want the first and last value, and tombstone in the middle).

Additionally, constructing the ValueStateKeys is relatively simple if we add a storage layer contract where you can say select a collection of value states between epochs (e.g. limited_key_history). Does that all make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense! I was gunning for efficiency here and trying to avoid two round trips to the database (say with an UPDATE ... WHERE... query). But I think it's an OK tradeoff for the ability to tombstone specific ValueStates.

akd/src/storage/memory.rs Outdated Show resolved Hide resolved
akd/src/storage/tests.rs Show resolved Hide resolved
akd/src/storage/tests.rs Outdated Show resolved Hide resolved
@slawlor
Copy link
Contributor Author

slawlor commented Mar 8, 2022

Quick question, I see that the PR is marked to close #130 - we are not yet taking care of the "deletion" of non-stale ValueStates right?

Hmm it does indeed seem like it's going to close the issue. I thought the term "Addressing" only linked and didn't close. There's some magic linking phrases in github, and now I can't seem to revert it (even deleting the statement) so it'll probably close and I'll re-open once the PR merges.

))));
}

// TODO: This call is quite expensive, since it's assumed most epochs do NOT have a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue was already solved in the original way key history was done. Essentially you want to get a user's version numbers at start_epoch and end_epoch and prove history proofs like below. Is there a reason you traverse over epochs as if they are version numbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I understand what you're getting at, but it still pulled the entire history for an entry. Imagine the following scenario:

  1. User has account, does 10 key updates
  2. User deletes account, we tombstone 10 updates
  3. New user gets phone number of User in (1), they do 10 key updates

Now a key-history for the "new" user in (3) will show 20 entries. Now scale this to 100 updates, over 10 users, etc. Eventually we might not want all of the key history for a user, but say the history in the last X epochs. That's what this call is solving. So it's a "truncated" history, for a specific range of values. Does that make sense?

Copy link
Contributor

@Jasleen1 Jasleen1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment about limited_key_history but otherwise this looks good. Thank you so much!

@slawlor slawlor linked an issue Mar 10, 2022 that may be closed by this pull request
@slawlor slawlor requested review from afterdusk and Jasleen1 March 14, 2022 14:14
@slawlor
Copy link
Contributor Author

slawlor commented Mar 14, 2022

@afterdusk or @Jasleen1 can you take a look to see if you're satisfied with my responses to your comments so we can get this merged? After this I think we're ready to release 0.5.0

Copy link
Member

@afterdusk afterdusk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The revisions LGTM, let's merge this PR!

Comment on lines +109 to +120
match u_guard.get(&username) {
Some(old_states) => {
let mut new_states = old_states.clone();
new_states.insert(value_state.epoch, value_state.clone());
u_guard.insert(username, new_states);
}
None => {
let mut new_map = HashMap::new();
new_map.insert(value_state.epoch, value_state.clone());
u_guard.insert(username, new_map);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you've already tried this, but does entry() and or_insert() work here?

@slawlor slawlor merged commit 3c5b89f into main Mar 15, 2022
@slawlor slawlor deleted the slawlor/tombstone branch March 15, 2022 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle absence of ValueState Allow limiting the key history proof to TOP(N) updates
4 participants