Remove Protobuf Transitive Dependency #38

Draft
wants to merge 7 commits into main from remove-protobuf-3

Conversation

@sfc-gh-jopel (Collaborator) commented Sep 25, 2024

Splitting PRs

Since this PR is very large, it will be split into smaller PRs so that each component can be reviewed and merged individually:

  • Code generator: Add code generator for otel proto #39
  • TODO: Internalizing opentelemetry-exporter-otlp-proto-common code
  • TODO: Modifying opentelemetry-exporter-otlp-proto-common to use generated code instead of protobuf
  • TODO: Using the custom encoding logic instead of opentelemetry-exporter-otlp-proto-common (but with a try..except import to use old method if the dependency is already loaded in the runtime)

Changes:

  • Removes opentelemetry-exporter-otlp-proto-common as a runtime dependency (it pulls in protobuf) and internalizes its code for converting SDK objects to proto objects
  • Implements custom protobuf serialization in pure Python based on the opentelemetry-proto message definitions (see the sketch after this list)
  • Adds a protoc plugin to generate Python serialization objects from the opentelemetry-proto message definitions (adds dev dependencies on protobuf, Jinja2, grpcio-tools, black, isort)
  • Uses the fastest approach found in benchmarks: 1.2-1.3x faster serialization and over 4x lower peak memory usage. This serialization approach is subject to change in the future.
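To make the pure-Python serialization concrete, here is a minimal sketch, assuming nothing about the PR's internals beyond what is stated above: protobuf's wire format reduces to tag/length/payload byte strings that plain Python can emit without the protobuf runtime.

def encode_varint(value: int) -> bytes:
    # Encode a non-negative integer as a protobuf base-128 varint
    # (7 bits per byte, MSB set on every byte except the last).
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_len_field(field_number: int, payload: bytes) -> bytes:
    # Length-delimited field (wire type 2): tag, payload length, payload.
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload

# field 1 = b"hi" must serialize to b'\x0a\x02hi', byte-identical to what
# protobuf's generated code produces.
assert encode_len_field(1, b"hi") == b"\x0a\x02hi"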

Overview of files:

  • src/snowflake/telemetry/_internal/opentelemetry/proto/ is code generated by scripts/proto_codegen.sh to serialize the messages in the opentelemetry-proto spec (a hypothetical sketch of the generated shape follows this list)
  • compile/ is the custom protoc compiler plugin that generates the files above, used by the proto_codegen.sh script
  • snowflake/telemetry/_internal/opentelemetry/exporter/otlp/proto/common/ is copied from opentelemetry-exporter-otlp-proto-common and slightly modified to use the newly defined message types
  • snowflake/telemetry/_internal/serialize/ is the custom serialization logic for protobuf primitive types
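For orientation, here is a hypothetical sketch of the shape such generated code might take, inferred only from the write_to snippet quoted in the review thread below; the real generated classes and the ProtoSerializer helper names may differ.

from dataclasses import dataclass

@dataclass
class ScopeLogs:
    # Illustrative message only; the real generated files cover the full
    # opentelemetry-proto spec.
    schema_url: str = ""

    def write_to(self, out: "ProtoSerializer") -> None:
        # Each field writes itself through the shared serializer with its
        # field number; serialize_string is an assumed helper name.
        out.serialize_string(3, self.schema_url)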

TODO

  • TODO: Add "proxy" try/except import between opentelemetry.exporter.otlp.proto.common and snowflake.telemetry._internal.exporter.otlp.proto.common depending on if opentelemetry-exporter-otlp-proto-common==1.26.0 is available in the runtime. Will be done in a separate PR since it makes changes / tests harder to review
  • TODO: Release new version of snowflake-telemetry-python v0.6.0, will be done in separate PR
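A hedged sketch of what that proxy import could look like (module paths inferred from the description above; the follow-up PR may structure this differently):

try:
    # Use the upstream encoder if opentelemetry-exporter-otlp-proto-common
    # is already loaded in the runtime.
    from opentelemetry.exporter.otlp.proto.common.trace_encoder import encode_spans
except ImportError:
    # Otherwise fall back to the internalized, protobuf-free copy
    # (assumed mirror layout under snowflake.telemetry._internal).
    from snowflake.telemetry._internal.exporter.otlp.proto.common.trace_encoder import (
        encode_spans,
    )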

Testing:

  • Added opentelemetry-exporter-otlp-proto-common as a test dependency to test serialization correctness
  • Ensures that the custom serialization follows the same semantics as protobuf serialization (see the sketch after this list)
  • TODO: Add unit tests for the proto plugin / compilation
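A minimal sketch of what such a correctness check could look like, treating the custom serializer's entry point as a hypothetical hook (the repo's actual tests may differ):

from opentelemetry.exporter.otlp.proto.common._log_encoder import encode_logs  # upstream, pb2-based

def assert_logs_match_pb2(sdk_logs, custom_serialize):
    # custom_serialize is the pure-Python serializer under test
    # (hypothetical hook; the real tests may call it differently).
    expected = encode_logs(sdk_logs).SerializeToString()
    assert custom_serialize(sdk_logs) == expected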

Benchmarking:

  • Added benchmarks comparing protobuf serialization with the custom serialization: ~1.2x faster (results below)
  • TODO: Add benchmarks for memory usage (~4x improvement)
  • TODO: Add benchmarks for telemetry package load time on program startup when imported
-----------------------------------------------------------------------------
Benchmark                                   Time             CPU   Iterations
-----------------------------------------------------------------------------
test_bm_serialize_logs_data             78590 ns        78590 ns         8893
test_bm_pb2_serialize_logs_data         96043 ns        96043 ns         7277
test_bm_serialize_metrics_data         132482 ns       132482 ns         5285
test_bm_pb2_serialize_metrics_data     163929 ns       163931 ns         4270
test_bm_serialize_traces_data          103523 ns       103524 ns         6656
test_bm_pb2_serialize_traces_data      132048 ns       132048 ns         5294
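For reference, the ratios from this table work out to 96043 / 78590 ≈ 1.22x for logs, 163929 / 132482 ≈ 1.24x for metrics, and 132048 / 103523 ≈ 1.28x for traces, consistent with the 1.2-1.3x figure above.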

# We need to inspect the spans and group + structure them as:
#
# Resource
# Instrumentation Library
Review comment (Collaborator):
Instrumentation Library -> Scope

def write_to(self, out: "ProtoSerializer") -> None:
...

class ProtoSerializer:
Review comment (Collaborator):

One optimization that you can do is to pre-calculate the size of the binary array and pre-allocate the array.

Reply from the author (@sfc-gh-jopel):

I tried changing my serialize implementation to write to a right-sized buffer (using memoryview for zero-copy overwriting of slices). It was about 10-20% slower than my current implementation (you can see my changes here: https://github.com/sfc-gh-jopel/snowflake-telemetry-python/pull/1/files). This script illustrates it:

def test1():
    # Pre-allocated, right-sized buffer overwritten in place via memoryview.
    b = bytearray(len(b"hello world") * 10000)
    m = memoryview(b)
    i = 0
    l = len(b"hello world")
    for _ in range(10000):
        m[i:i+l] = b"hello world"
        i += l
    return b

def test2():
    # Growable buffer using bytearray.extend; chunks are written reversed
    # and the whole buffer reversed at the end, mirroring the current
    # implementation.
    b = bytearray()
    for _ in range(10000):
        b.extend(b"hello world"[::-1])
    return b[::-1]

def test3():
    # Same as test2, but using += (in-place concatenation) instead of extend.
    b = bytearray()
    for _ in range(10000):
        b += b"hello world"[::-1]
    return b[::-1]

if __name__ == "__main__":
    import timeit
    print(timeit.timeit("test1()", setup="from __main__ import test1", number=1000))
    print(timeit.timeit("test2()", setup="from __main__ import test2", number=1000))
    print(timeit.timeit("test3()", setup="from __main__ import test3", number=1000))

Output:

1.3387652660021558
1.4347571920079645
1.276998276996892

I had the extend approach reverse the bytes before writing, since that is what my current implementation does. The pre-allocated approach was only slightly faster than the extend approach, and slower than the += method (I should switch extend to += for a slight performance increase). Calculating the buffer size for our proto messages would be more complex than what is done in this script, and would likely make it the slowest of the three.

This result is puzzling to me because the memoryview approach should be more memory friendly. My guess is that the messages in my benchmark (and probably most telemetry data) fit in cache, so the resize operation happens infrequently and is fast enough that it still outperforms the bounds-checking overhead of the right-sized approach.

I am not sure what the best thing to do here is. I think memoryview will win for huge logs and traces, but I don't think that's the typical use case.
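For readers puzzled by the reversal: here is a minimal sketch of the back-to-front writing technique as I understand it from this thread (not the PR's actual ProtoSerializer; helper names are assumptions). Length-delimited protobuf fields need the payload size before the payload, so writing everything reversed lets each nested message be measured after its fields are already in the buffer, avoiding a size pre-pass.

def _varint(value: int) -> bytes:
    # Base-128 varint, as in protobuf.
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

class ReversedWriter:
    def __init__(self) -> None:
        self._buf = bytearray()  # message accumulates in reverse

    def write_bytes(self, tag: bytes, data: bytes) -> None:
        # Final layout is tag | len | data, so append the components reversed.
        self._buf += data[::-1]
        self._buf += _varint(len(data))[::-1]
        self._buf += tag[::-1]

    def write_message(self, tag: bytes, write_fields) -> None:
        start = len(self._buf)
        write_fields(self)             # child fields land in the buffer first
        size = len(self._buf) - start  # length is known only after writing
        self._buf += _varint(size)[::-1]
        self._buf += tag[::-1]

    def finish(self) -> bytes:
        return bytes(self._buf[::-1])

# Example: a message whose field 1 is a nested message with bytes field 1.
w = ReversedWriter()
w.write_message(b"\x0a", lambda s: s.write_bytes(b"\x0a", b"hi"))
assert w.finish() == b"\x0a\x04\x0a\x02hi"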

sfc-gh-jopel force-pushed the remove-protobuf-3 branch 3 times, most recently from 077f82b to fa8d31e on October 18, 2024 23:19
sfc-gh-jopel marked this pull request as ready for review October 21, 2024 16:31
sfc-gh-jopel requested a review from a team as a code owner October 21, 2024 16:31
sfc-gh-jopel marked this pull request as draft October 22, 2024 20:55