Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement optimistic locking #8 #42

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

danieljurek
Copy link
Contributor

  • No more bulk operations for CUD, using async tasks instead
  • Removing entities also respects version field
  • Update caching behavior to handle partial failure in optimistic locking scenario

* No more bulk operations for CUD, using async tasks instead
* Removing entities also respects version field
* Update caching behavior to handle partial failure in optimistic locking scenario
@danieljurek
Copy link
Contributor Author

danieljurek commented Mar 24, 2017

[Discussion belongs in the PR, not on the issue]
Andy and I work together and I'm helping with this PR.

The PR brings up a few things I want to make sure fit into the design moving forward:

  • If a CUD operation fails we'll get an AggregateException with a ConditionalCheckFailedException but no detail about the entity for which the check failed. We should probably create our own exception for this.
  • TableDefinitionWrapper has more complex add/update behavior.. We could probably put the responsibility of generating OperationConfigs in a different class.
  • We mutate an EntityWrapper's document in the CUD operation. There may be a better way to propagate document changes back to the entity.

[Test]
[ExpectedException(typeof(InvalidOperationException), ExpectedMessage = "cannot be added, because entity with that key already exists", MatchType = MessageMatch.Contains)]
[ExpectedException(typeof(AggregateException))]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm i'd probably want to catch+flatten any aggregate exceptions that get thrown internally.
unless a method returns a Task (is Async), a user should not expect that method to throw an AggregateException.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah just saw your comment in the PR description:

If a CUD operation fails we'll get an AggregateException with a ConditionalCheckFailedException but no detail about the entity for which the check failed. We should probably create our own exception for this.

yes. definitely want to catch and re-throw. AFAIK, the Dynamo API doesn't really tell you which expectation failed either, but we are only using conditions for document versioning, so we can assume it's a version mismatch, right?

}

[Test]
[ExpectedException(typeof(AggregateException))]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any way we can catch expected exceptions on the line where we actually expect them?
I'm not a huge fan of catching this at the [Test] level.

Especially in this case, where we want to make sure we throw from the contextB.SubmitChanges() and not the contextA call.

namespace Linq2DynamoDb.DataContext.Tests.EntityManagementTests.Versioning
{
[TestFixture]
class EntityVersioningTests : DataContextTestBase

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a test case for the scenario where you submit and THEN read and modify the entity again?
(in other words, the non-exceptional case. verify you can make 2 updates on a single entity and verify both changes as you make them).

@@ -140,7 +141,9 @@
<SubType>Designer</SubType>
</None>
</ItemGroup>
<ItemGroup />
<ItemGroup>
<Service Include="{82A7F48D-3B50-4B1E-B82E-3ADA8210C358}" />

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this an intentional change?

@@ -32,6 +32,9 @@ public class Book : EntityBase
[DynamoDBProperty(typeof(StringTimeSpanDictionaryConverter))]
public IDictionary<string, TimeSpan> FilmsBasedOnBook { get; set; }

[DynamoDBVersion]
public int? VersionNumber { get; set; }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation is off a bit

/// </summary>
private PropertyInfo EntityVersionNumberProperty {
get {
if (!_hasResolvedEntityVersionNumberProperty) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about the project's braces style, but looks like you have mixed foo { and foo\n{ in this file's changes.

@@ -14,6 +16,31 @@ internal class EntityWrapper : IEntityWrapper
private readonly IEntityKeyGetter _keyGetter;
private Document _doc, _newDoc;

private PropertyInfo _entityVersionNumberProperty;
private bool _hasResolvedEntityVersionNumberProperty;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be a Lazy<bool>

Copy link
Owner

@scale-tone scale-tone Apr 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid it wouldn't help either.
Technically, what is happening here now - is that the value for _entityVersionNumberProperty is calculated per each EntityWrapper instance. In effect, it's recalculated all the time (because EntityWrapper instances have very short lifes). Which is inefficient, and is exactly what we try to avoid throughout the library.

A typical approach to workaround this is demonstrated by TableDefinitionWrapper.ToDocumentConversionFunctor property. It contains a functor, that does a straightforward conversion from an entity instance to a Document. And that functor is a compiled Expression, so it works as fast, as if you manually wrote an entity-specific conversion code like

doc["Field1"] = entity.Field1;
doc["Field2"] = entity.Field2;
...

Another, less complicated, approach is used in EntityKeyGetter, where PropertyInfos are cached per TableDefinitionWrapper instance.

I would propose to use one of these two approaches, or just Memoize a version value getter per entity type.

.Where(property =>
property
.GetCustomAttributes(typeof(DynamoDBVersionAttribute), true)
.SingleOrDefault() != null

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.Any()?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe you WANT to throw an exception if an entity contains more than one Version property.
is that even legal in dynamo's object model?

either way.... if your intention is to validate Single versioned property, maybe that should be done somewhere else or at least throw a useful exception if entities violate that constraint.

property
.GetCustomAttributes(typeof(DynamoDBVersionAttribute), true)
.SingleOrDefault() != null
).SingleOrDefault();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can collapse that GetProperties(...).Where(predicate).SingleOrDefault() into GetProperties(...)SingleOrDefault(predicate)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also just realized there are 2 levels of validation you might want for an entity.

  1. should not have more than one [DynamoDbVersion] attribute on a single property (the inner SingleOrDefault). This actually should be enforced by the AttributeUsage of the attribute itself:
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property, Inherited = true, AllowMultiple = false)]
    public sealed class DynamoDBVersionAttribute : DynamoDBRenamableAttribute
    {
    }

so i don't think you need this inner check and can just use Any.

  1. should not have more than one property decorated with [DynamoDbVersion]. this is the outer SingleOrDefault and we should throw a nice exception if it's violated.

this.IsCommited = true;
}

/// <summary>
/// Sets the value of the Entity's propety that has the DynamoDBVersionAttribute to

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

propety -> property

@@ -126,19 +133,19 @@ public override void ClearModifications()
internal Action<Exception> ThingsToDoUponSubmit;

/// <summary>
/// Entities, loaded from DynamoDb
/// Entities that were loaded from DynamoDb or

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or....?

/// Performs create/update operations with DynamoDB, respecting the [DynamoDBVersion] attribute if present
/// </summary>
/// <remarks>
/// If the entity is new (Version property has not been set) set the version property 0. If

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the object model versioning start with 0 or 1?


putConfiguration.ConditionalExpression = new Expression()
{
ExpressionStatement = $"attribute_not_exists(#key)",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this not automatically verified internally by dynamo? what would happen if you removed this check and tried to submit 2 entries with same hash key property?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also what if the table uses both keys, and there's already an item with the same HashKey but different RangeKey?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Andy: When performing a PutItem operation on an entity in DynamoDB, DynamoDB does not verify whether an object with the same key is already present. If an item with the same key is present it is overwritten. The DataContext_AddEntity_DoesNotOverwrite_ExistingVersionedEntity test verifies this behavior.

Konstantin: From documentation and testing, attribute_not_exists is the way to go in both cases because the attribute_not_exists check is verifying against an object that may already be present in the table with the requested primary key (the primary key could use a hash key or a hash and range key). Since every object in a table is required to have at least a partition key, attribute_not_exists is checking to see if the object at the given primary key already exists in the table by checking the presence of a required attribute. My next update of the PR includes a check that demonstrates this case.

The document doesn't explicitly call this out: http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_PutItem.html

A more exacting explanation on StackOverflow: http://stackoverflow.com/questions/32833351/dynamodb-put-item-if-hash-or-hash-and-range-combination-doesnt-exist

ExpressionStatement = $"attribute_not_exists(#key)",
ExpressionAttributeNames = new Dictionary<string, string>
{
{ "#key", HashKeyPropetyName }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HashKeyPropetyName => HashKeyPropertyName

{ VersionPropertyName, entry[VersionPropertyName] }
});

// TODO: We assume that the Version is an int here but it could be

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woah. how would you even handle incrementing if this was not an integer??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lazy approach. The correct way to do this is to check the type of the object's versioning property, call the appropriate conversion method on the DynamoDBEntry (e.g. AsSingle() in the case of a float, AsInt() in the case of an int, etc.), then put the incremented value back in the Document object.

In the current implementation, if the version property were a float and the version were stored as 1.1 (by some external process) this code would overwrite that value to 2 instead of 2.1. This doesn't break optimistic locking behavior, but it could produce unexpected results for anyone who is checking the value of the version property.

.Where(property =>
property
.GetCustomAttributes(typeof(DynamoDBVersionAttribute), true)
.SingleOrDefault() != null

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comments apply here RE: attribute validation

? versionProperty.Name
: null;

_hasResolvedVersionPropertyName = true;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and yeah if the actual _versionPropertyName backing field was Lazy you wouldn't need the _hasResolved... field

@@ -7,5 +7,6 @@ public interface IEntityWrapper
object Entity { get; }
Document GetDocumentIfDirty();
void Commit();
Document AsDocument();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need in this method.

We only need to convert an entity into a Document, if that entity was modified or newly added. Simply because we only need to save modified Documents to storage, unchanged (and deleted) documents do not need to be saved. The GetDocumentIfDirty() method was specifically designed to underscore this.

If at some place of your code you found yourself wanting AsDocument() method, that rather signals a design flaw. And that's what we see in TableDefinitionWrapper, 230: no need to convert deleted entities into Documents, keys are enough for doing a delete.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For optimistic delete you also would need only keys and version field value, so no need in converting the whole entity even in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We only need the value of the version property (if optimistically locking) instead of the whole document.

{
var batch = this.TableDefinition.CreateBatchWrite();
var dynamoDbUpdateTasks = AddEntriesAsync(addedEntities)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought, we agreed on switching between two approaches (batches vs. individual tasks), depending on whether a Version field was defined or not...

Batches are much faster, so I wouldn't just get rid of them in favor of optimistic updates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must've misread. Batching would certainly be better in the case where we aren't using optimistic locking.

/// <summary>
/// THe name of the underlying entity's property that has the [DynamoDBHashKey] attribute
/// </summary>
internal string HashKeyPropetyName
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already TableDefinitionWrapperBase.KeyNames for that.

@danieljurek
Copy link
Contributor Author

Was away for a while. Will have an updated PR soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants