Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The future of the go-structform library #29694

Closed
fearful-symmetry opened this issue Jan 4, 2022 · 14 comments
Closed

The future of the go-structform library #29694

fearful-symmetry opened this issue Jan 4, 2022 · 14 comments
Labels

Comments

@fearful-symmetry
Copy link
Contributor

This is a continuation of an email chain between myself and @jlind23 about the future of the go-structform library: https://github.com/elastic/go-structform

After Steffan's departure, I'm not sure if anyone wants to, or knows how to, maintain this library. I'm not sure if anyone here understands how go-structform works on a deep level, but I certainly don't, and I helped tinker on it a few times. It's rather complex and surprisingly large, and most worryingly, it also has its fingers in a lot of places within libbeat, including:

./libbeat/common/transform/typeconv
./libbeat/esleg/eslegclient
./libbeat/outputs/codec/json
./libbeat/outputs/codec/
./libbeat/publisher/queue/diskqueue
./libbeat/statestore/backend/memlog

I've gone over a bit of this code, and in quite a few of these cases, it's using structform as a JSON encoder, and I'm not sure why we're using structform instead of golang's built-in json marshaller.

One thing that the go-structform library does for system metrics is implement an IsZero interface that allows us to remove "null" metrics in cases where given values don't exist on a platform, while still allowing us to use structs instead of reverting to less developer and performance-friendly MapStr types. I'm a bit worried about this, as the IsZero function is relatively new, and if we need to expand the feature or do bugfixes, I'm not sure anyone here has the knowledge to do so expediently.

Before the Holiday break, I prototyped a replacement for go-structform's IsZero implementation that's meant to somewhat emulate the behavior of Rust's Option enum type and allow us to distinguish between an "actual zero" and a non-existent metric. This ended up being rather complex to do in a generic fashion due to the limits of Go's JSON marshaller, and requires all structs to be passed through a wrapper that implements a custom MarshalJSON method that implements a bunch of reflection to manually nil-out non-existent fields. Still, the resulting code is about 100 lines, and can be applied to any struct or type that implements an IsZero interface. I'd like to do some performance benchmarking, since I suspect that both this implementation, and go-structform, has a good bit of overhead.

In this case, it's also worth noting that this IsZero functionality might expand to places outside the system module. This feature was originally added so we could deal with the heterogeneous nature of system metrics, as the values reported by the os can, obviously, vary quite a lot depending on the OS itself, the release, version, customer configuration, etc. However, with more linux distros and deployed docker versions running cgroups V2, we're going to run into this issue in the docker module, and I assume the k8s module, as the metrics reported by cgroups V2, and thus passed along by docker, differ in a number of areas from cgroups V1. Sooner or later, we'll need to add IsZero support to docker, at least.

So, there's two questions I want to ask here, although @jlind23 might have more:

  • How should we deal with go-structform going forward?
  • Should we try and replace it with the stdlib json marshaller where possible?
  • Does anyone understand this code on a level where they can take responsibility for it?
  • Should we try and replace structform's IsZero implementation with a simpler one that other developers will be able to easily understand and add on to?
@fearful-symmetry fearful-symmetry added the Team:Elastic-Agent Label for the Agent team label Jan 4, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@fearful-symmetry fearful-symmetry added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Jan 4, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@fearful-symmetry fearful-symmetry removed the Team:Elastic-Agent Label for the Agent team label Jan 4, 2022
@jlind23
Copy link
Collaborator

jlind23 commented Jan 5, 2022

ping @kvch as she may want to jump in.

@fearful-symmetry
Copy link
Contributor Author

Did a very brief and not entirely scientific set of benchmarks, Both my PoC and go-structform are about the same time-wise for encoding a small json object, and the stdlib json encoder is about twice the speed.

@ph
Copy link
Contributor

ph commented Jan 6, 2022

This is a lot of contexts and memories I would need to unpack, I've played in the past with go structform. If I remember correctly other than the speed advantages it had (at the time of writing) vs the build-in library. I think it was also better citizen for memory usages and reducing copies.

But last time I've messed around with struct form it was a few years ago so takes my comment with a grain of salt.

Just to be complete, the "raison d'être" of this library is because Golang was so new when Beats was initially created a lot of std library was inefficient or non-exiting.

If we can remove code and relies on std library, I believe it would be a good move. But we need to have a plan to evaluate that change.

@faec
Copy link
Contributor

faec commented Jan 6, 2022

+1 to this... my general perception (being responsible for at least one of the uses above 😅) is that go-structform was at one time unambiguously better than the stdlib, and became the default in new Beats code by inertia. As the years pass we're starting to suspect that their positions have reversed, but we haven't put in the (substantial) work yet to verify either way. But whichever choice is best at this point, the library represents a lot of technical debt, and if we can clarify that and start moving some of our use cases back to stdlib I think that'd be a great win.

@ruflin
Copy link
Contributor

ruflin commented Jan 11, 2022

go-structform has served us well over time but as mentioned above, it is time to reevaluate. There are likely parts today that we can replace by stdlib and if we can, we should. But it is important to understand what we change and why.

There are special cases that go-structform handles for us. I would challenge if replacing this custom implementation by a new custom implementation will really benefit us and is worth the effort. Assuming go-strucform does not have bugs or major issues, it is likely much quicker and reliable to dig into the code and fully understand it. Think of it like any other third party library.

I remember a discussion with @axw of the apm team that is also working on some libraries around a similar problem. Maybe worth to team up?

Most important part for me is that if we invest time in removing go-structform is understanding what the problem is we solve with it and the benefits we get. In an ideal scenario from my perspective is that we are heading in a direction of not requiring the serialising and deserialising at all which makes these libraries obsolete.

@axw
Copy link
Member

axw commented Jan 12, 2022

I remember a discussion with @axw of the apm team that is also working on some libraries around a similar problem. Maybe worth to team up?

In apm-server we are mostly using stdlib, with a smattering of https://github.com/json-iterator/go and https://github.com/elastic/go-fastjson. We're currently relying on fastjson for JSON encoding events, but still using reflection. We do have a pain point with encoding "nullable" values, as @fearful-symmetry describes above. We currently have a step which translates our structs to MapStrs, omitting zero values, for this.

No matter which library we use, if it involves reflection it will be sub-optimal. What we're planning to do is generate JSON encoding code, similar to what is done in the APM Go agent (see model/marshal_fastjson.go) using https://github.com/elastic/go-fastjson. This can be done with zero allocations when reusing buffers, and is very fast.

@fearful-symmetry
Copy link
Contributor Author

@axw

We currently have a step which translates our structs to MapStrs, omitting zero values, for this.

How does APM do this? I wasn't aware anyone besides myself was tackling that.

I would challenge if replacing this custom implementation by a new custom implementation will really benefit us and is worth the effort.

I'm kind of conflicted about. I definitely agree, but all else (usability, performance, etc) being equal, we might want to prefer an implementation that would be easier to maintain or for other users to understand. We may also want to revisit this once go generics become GA, since last I dug into it, a considerable portion of the complexity in structform comes from autogenerated implementations for a wide array of types.

@axw
Copy link
Member

axw commented Jan 13, 2022

We currently have a step which translates our structs to MapStrs, omitting zero values, for this.

How does APM do this? I wasn't aware anyone besides myself was tackling that.

It's currently handwritten translation code. Here's an example, the process ECS field set: https://github.com/elastic/apm-server/blob/279e053c5e944359097226cc5d1cdbb9e7736f66/model/process.go#L24-L48. The pid field is not set if the struct field is zero.

Later on I'm hoping we can auto-generate this kind of logic, using tags/annotations on the model types.

@fearful-symmetry
Copy link
Contributor Author

@axw That's interesting. I assume that fields() is called manually, and it's not implementing an interface that's called by some later marshaller or type conversion process?

@axw
Copy link
Member

axw commented Jan 14, 2022

@fearful-symmetry correct, all manual and no interfaces involved in this part.

@fearful-symmetry
Copy link
Contributor Author

Honestly, not sure if we should close this or move it to discussion. We might also want to wait to see how the dust settles with the V2 stuff and then see how structform and the other transformation libs are actually being used.

@botelastic
Copy link

botelastic bot commented Apr 6, 2023

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Apr 6, 2023
@botelastic botelastic bot closed this as completed Oct 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants