reorder struct fields for compactness#23
Conversation
…ic/betteralign had to duplicate flow/flow.go to tmp.go to get the tool to pick up the structs: /Users/rtice/src/gopath/libkflow/tmp.go:3:11: struct of size 520 could be 512 /Users/rtice/src/gopath/libkflow/tmp.go:73:13: struct of size 104 could be 88 /Users/rtice/src/gopath/libkflow/config.go:19:13: struct with 168 pointer bytes could be 136 /Users/rtice/src/gopath/libkflow/send.go:21:13: struct with 104 pointer bytes could be 72
aaronfuj
left a comment
There was a problem hiding this comment.
I think we should keep this change focused on just the Flow and Custom struct, given that is the one that is within the hot path. For the other structs, we are a lot less concerned about their size vs readability and it would be better to keep them as is to preserve the history imo.
aaronfuj
left a comment
There was a problem hiding this comment.
Looks good, let's make it explicit what we've done for this micro optimization so we're kind to future developers that may undo what we've done
|
Seems reasonable. I started doing the napkin math based on number of devices I had seen floating around and was on the fence as to how negligible or not that would be in the aggregate. |
| SrcBgpCommunity string | ||
| SrcFlowTags string | ||
| DstFlowTags string | ||
| FlowTags string | ||
| DstBgpAsPath string | ||
| DstBgpCommunity string | ||
| SrcBgpAsPath string | ||
| Ipv6SrcAddr []byte | ||
| Ipv6SrcRoutePrefix []byte | ||
| Customs []Custom | ||
| Ipv6DstRoutePrefix []byte | ||
| Ipv6DstAddr []byte | ||
| Ipv6SrcNextHop []byte | ||
| Ipv6DstNextHop []byte |
There was a problem hiding this comment.
Wouldn't it be better to move all of these variable length items to the end of the struct? If they come earlier, it seems like they can go against struct padding as opposed to the fixed length fields?
There was a problem hiding this comment.
This analyzer find structs that can be rearranged to use less memory, and provides
a suggested edit with the most compact order.
Note that there are two different diagnostics reported. One checks struct size,
and the other reports "pointer bytes" used. Pointer bytes is how many bytes of the
object that the garbage collector has to potentially scan for pointers, for example:
struct { uint32; string }
have 16 pointer bytes because the garbage collector has to scan up through the string's
inner pointer.
struct { string; *uint32 }
has 24 pointer bytes because it has to scan further through the *uint32.
struct { string; uint32 }
has 8 because it can stop immediately after the string pointer.
Be aware that the most compact order is not always the most efficient.
In rare cases it may cause two variables each updated by its own goroutine
to occupy the same CPU cache line, inducing a form of memory contention
known as "false sharing" that slows down both goroutines.
I guess it's the difference between pointer bytes and struct size. It would probably be good to try out the different options internally via some actual data and see how they compare before bringing it back up here unless it is well known that this current strategy works better. I assume based on the libraries used that it should be good, but if we're doing this optimization we might as well be sure.
reorder struct fields for compactness using https://github.com/dkorunic/betteralign
had to duplicate flow/flow.go to tmp.go to get the tool to pick up the structs:
/Users/rtice/src/gopath/libkflow/tmp.go:3:11: struct of size 520 could be 512
/Users/rtice/src/gopath/libkflow/tmp.go:73:13: struct of size 104 could be 88
/Users/rtice/src/gopath/libkflow/config.go:19:13: struct with 168 pointer bytes could be 136
/Users/rtice/src/gopath/libkflow/send.go:21:13: struct with 104 pointer bytes could be 72
closes #21