Support round-tripping unknown fields #104

Open
ReubenBond opened this issue Jan 21, 2021 · 0 comments

ReubenBond commented Jan 21, 2021

Currently, Hagar safely and predictably deserializes types whose payload contains unknown fields. When it encounters an unknown field in the bit stream, it marks the field and otherwise ignores it. If it later encounters a reference to that ignored field, it uses the mark to go back and deserialize the field, since it now knows what type to deserialize the previously unknown field as.

However, Hagar does not yet support re-serializing that object with full fidelity: the ignored fields are not serialized. To support scenarios where this is useful, Hagar should recognize objects which have a property or field marked with a [Hagar.ExtensionData] attribute. Initially, we can require that the member be declared as object, and potentially add an interface later which allows some small degree of introspection, as needed.

Example:

public class MyData
{
  // This could be public
  [Hagar.ExtensionData]
  private object _extensionData;

  [Hagar.Id(0)]
  public int MyValue { get; set; }
}

We can also define an interface which users can optionally implement instead of annotating a field themselves.

public interface IHasExtensionData
{
  [Hagar.ExtensionData]
  object ExtensionData { get; set; }
}

Code generation needs to be updated to support identifying extension data members, deserializing into them, and serializing from them. This is a substantial change, since it means that the existing, optimized routine for serialization cannot be used. Instead, the generated code will need to check for extension data before every known field whose id is preceded by a gap, and again after the last known field.

Given a type definition:

public class MyData
{
  [Id(1)] public int MyInt { get; set; }

  [Id(2)] public int MyInt2 { get; set; }

  [Id(44)] public int MyInt3 { get; set; }
}

The serialization order would be:

  • Serialize any unknown field with id 0
  • Serialize known field with id 1
  • Serialize known field with id 2 (no gaps between 1 and 2)
  • Serialize any unknown fields with ids between 2 and 44
  • Serialize known field with id 44
  • Serialize any unknown fields with ids greater than 44
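
The ordering above can be sketched as a merge of two ascending sequences: the known field ids and the ids held in the extension data. This is only an illustration under stated assumptions. ExtensionDataOrderSketch and EmitOrder are hypothetical names, the extension data is modelled as a SortedDictionary of raw payloads keyed by field id, and unknown ids are assumed never to collide with known ids. None of this is Hagar's actual generated code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: not part of Hagar.
public static class ExtensionDataOrderSketch
{
    // knownIds must be in ascending order. unknownFields models the extension
    // data as raw payloads keyed by field id (an assumption for illustration).
    public static List<string> EmitOrder(uint[] knownIds, SortedDictionary<uint, byte[]> unknownFields)
    {
        var emitted = new List<string>();

        // SortedDictionary enumerates its keys in ascending order.
        var pending = new Queue<uint>(unknownFields.Keys);

        foreach (uint knownId in knownIds)
        {
            // Flush unknown fields whose ids fall in the gap before this known field.
            while (pending.Count > 0 && pending.Peek() < knownId)
            {
                emitted.Add("unknown " + pending.Dequeue());
            }

            emitted.Add("known " + knownId);
        }

        // Flush unknown fields with ids greater than the last known id.
        while (pending.Count > 0)
        {
            emitted.Add("unknown " + pending.Dequeue());
        }

        return emitted;
    }
}
```

For known ids 1, 2 and 44 with unknown ids 0, 7 and 50 in the extension data, this yields: unknown 0, known 1, known 2, unknown 7, known 44, unknown 50. A real generator could skip the check between ids 1 and 2 entirely, since no unknown id can fall between adjacent known ids.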

The performance hit may be significant in some cases, so the code generator should decide whether to use the existing, optimized routine or the proposed routine based on whether the type (or a parent type) has an [ExtensionData] member.
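
As a rough illustration of that decision, the check for an [ExtensionData] member on the type or any parent type could look like the following. It uses runtime reflection purely for brevity; the real code generator would more likely inspect symbols at compile time. ExtensionDataAttribute here is a local stand-in for the proposed attribute, and RoutineSelectionSketch, BaseData, DerivedData and PlainData are hypothetical names.

```csharp
using System;
using System.Reflection;

// Local stand-in for the proposed [Hagar.ExtensionData] attribute (hypothetical).
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property)]
public sealed class ExtensionDataAttribute : Attribute { }

public static class RoutineSelectionSketch
{
    // Returns true if the type, or any of its base types, declares a field or
    // property marked with [ExtensionData].
    public static bool NeedsExtensionDataRoutine(Type type)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public
            | BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

        // Walk the hierarchy explicitly: private members of a base class are not
        // returned when reflecting over the derived type alone.
        for (Type current = type; current != null; current = current.BaseType)
        {
            foreach (MemberInfo member in current.GetMembers(flags))
            {
                if (member.IsDefined(typeof(ExtensionDataAttribute), false))
                {
                    return true;
                }
            }
        }

        return false;
    }
}

// Example types for illustration only.
public class BaseData
{
    [ExtensionData]
    private object _extensionData;
}

public class DerivedData : BaseData
{
    public int MyValue { get; set; }
}

public class PlainData
{
    public int MyValue { get; set; }
}
```

Here DerivedData would get the proposed routine because its parent declares an extension data member, while PlainData would keep the existing, optimized routine.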

Similarly, deserialization will need to change, but that change is likely not as involved: instead of ignoring unknown fields, it will need to place them into the extension data.
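
A minimal sketch of that change, assuming the payload has already been split into (field id, raw bytes) pairs, which is a simplification and not Hagar's actual wire format. MyDataSketch and DeserializeSketch are hypothetical names, and the SortedDictionary stored in the extension data member is just one possible representation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration; in the proposal the ExtensionData member
// would carry [Hagar.ExtensionData] and MyValue would carry [Hagar.Id(0)].
public class MyDataSketch
{
    public object ExtensionData;
    public int MyValue;
}

public static class DeserializeSketch
{
    public static MyDataSketch Read(IEnumerable<(uint Id, byte[] Payload)> fields)
    {
        var result = new MyDataSketch();
        SortedDictionary<uint, byte[]> unknown = null;

        foreach (var (id, payload) in fields)
        {
            switch (id)
            {
                case 0:
                    // Known field: decode it as usual.
                    result.MyValue = BitConverter.ToInt32(payload, 0);
                    break;
                default:
                    // Unknown field: keep the raw payload instead of ignoring it,
                    // so that it can be written back out on re-serialization.
                    if (unknown == null)
                    {
                        unknown = new SortedDictionary<uint, byte[]>();
                    }
                    unknown[id] = payload;
                    break;
            }
        }

        // Stays null when the payload contained no unknown fields.
        result.ExtensionData = unknown;
        return result;
    }
}
```

Allocating the dictionary lazily keeps the common case, where every field is known, free of extra allocations.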

Generated code can call static helper methods to help with that serialization and deserialization.

@ReubenBond ReubenBond added this to the 1.0 milestone Jan 21, 2021