Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abrandoned/decoding setters without copy #297

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

abrandoned
Copy link
Contributor

start moving towards per field decoding setters instead of using the standard setter during the decode routine; the standard setter does copies to keep from mutating an incoming value when a value is set, during decode we also make a copy as a result of pulling it off the stream, largely reduces a single copy and in cases where we can it prevents the acceptable? checks as a value that has been serialized and is tagged should be acceptable

also started added setters directly on the message object by tag instead of doing a field lookup and then running through the name of the field; reduces the amount of searching for field definitions and will eventually lead to us having a deserialization routine that only utilizes tags (instead of field lookup)

on MRI and JRuby these changes accounted for ~20% decrease in time for serialization/deserialization of the command
bx rake benchmark:profile_protobuf_serialize[100000,prof_out.txt]

@film42

…py and starts adding setters by tag so we do not need the get_field lookup when we have a tag
…field is Integer in enum before going through coerce! routine
@@ -234,6 +235,10 @@ def define_getter
::Protobuf.field_deprecator.deprecate_method(message_class, method_name) if field.deprecated?
end

def define_decode_setter
# empty for now
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a few variants of this method. Let's bring the duplicated common implementation here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably best, I wrote this while I was testing the string copy only and now need to refactor since a common pattern has emerged

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@abrandoned
Copy link
Contributor Author

@zachmargolis would love to get your thoughts on this path

@@ -52,6 +52,7 @@ def self.encode(value, use_cache = true)
#

def acceptable?(val)
return true if val.is_a?(Integer) && val >= 0 && val < INT32_MAX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we still want to compare to self.class.max right?

I understand wanting to avoid coerce! for performance reasons but maybe there's a better way to preserve the existing logic?

Just an idea

def acceptable?(val)
  int_val = val.is_a?(Integer)
    val
  else
    coerce!(val)
  end

  int_val >= self.class.min && int_val <= self.class.max
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do want to compare the max (which happens after this if it is larger than the smallest max); the smallest max is the INT32_MAX, so if it passes it (and is an Integer; which is the most common case, especially in deserialization) then we just set it instead of taking the long route (which in our usage is very uncommon)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess as a reader of this patch, I'm having a hard following what it's supposed to do. I didn't realize it would fall through to the old case. Maybe we could do something like this (comments included) for posterity?

def acceptable?(val)
  int_val = if val.is_a?(Integer)
    return true if val >= 0 && val < INT32_MAX # return quickly for smallest integer size, hot code path
    val
  else
    coerce!(val)
  end
  int_val >= self.class.min && int_val <= self.class.max
end

method_name = field.getter

message_class.class_eval do
define_method(method_name) do
@values[field.name] ||= ::Protobuf::Field::FieldArray.new(field)
@values[field_name] ||= ::Protobuf::Field::FieldArray.new(field)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heads up that a lot of this work will need to be moved to the []= and [] methods after PR #302 is merged. In fact a lot of the setter/getter code in this PR will conflict, I comment on just a little of it below.

@embark
Copy link
Contributor

embark commented Feb 23, 2016

👍 great optimization!

How much benefit do we get on creating the dynamic per-tag methods vs. having 1 method in the message where that requires a field lookup first?

Just curious if the optimization is worth the complexity tradeoff of a bunch of new dynamically generated methods

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants