WIP: first attempt at #776 / #941 #942
Conversation
Force-pushed from 350d7b4 to 03ad88e
```ruby
  end

  def server_pipelined_get(server, request)
    buffer_size = server.socket_sndbuf
```
I had been thinking about doing it by batching the number of keys, but I think this approach of looking at it from the perspective of the number of bytes is a better way to address the issue. It also allows us to give the end user direction about how to control the behavior (by tuning sndbuf).
I started out with batching the keys, but I got the idea to batch by bytes instead when I struggled to come up with a good default value for the batch size; it seemed to me that the original problem was caused by the response size rather than by the number of keys requested.
I don't know if sndbuf is a good value, though; maybe it is more efficient to write batches larger than the buffer.
It will also not allow users of the library to tweak the send buffer and the chunk size independently (I do not know if that makes sense).
It might not matter at all, as this uses nonblocking IO for the writes, which could mean that there would just be partial writes (which are already handled correctly, AFAIK) if we try to write everything at once. Maybe I will do some experiments, but I think there are other areas I should polish first (basic correctness being the most important one).
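If I'm reading the draft correctly, the byte-based approach boils down to something like this sketch (illustrative only: the real code uses nonblocking writes with partial-write handling, which a plain `IO#write` loop glosses over):

```ruby
# Slice the pipelined request into sndbuf-sized chunks instead of
# batching by key count. Illustrative sketch, not the actual diff.
def write_in_chunks(socket, request, buffer_size)
  offset = 0
  while offset < request.bytesize
    chunk = request.byteslice(offset, buffer_size)
    offset += socket.write(chunk) # IO#write returns the bytes written
  end
end
```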
I think we might do something like this (sketched below):
- Add an explicit chunking parameter that can be set on the client
- Have that chunking parameter fall back to the sndbuf size if it isn't explicitly set
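A minimal sketch, assuming a hypothetical `:pipelined_get_chunk_size` client option (the name is invented for illustration):

```ruby
# Fall back to the socket send buffer size when the client does not set
# an explicit chunk size. :pipelined_get_chunk_size is a hypothetical option.
def pipelined_get_chunk_size(server)
  @options[:pipelined_get_chunk_size] || server.socket_sndbuf
end
```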
```ruby
    start_time = Time.now
    servers = requests.keys

    # FIXME: this was executed before the finish request was sent. Why?
```
I have to look at this in a little detail. The behavior in the case of no servers available is supposed to be returning []
with no error. But if I recall correctly, there are cases where a server is not connected that can trigger raising a Dalli::Error
without this check. I'll find some time to look.
Any hints on the intended original behavior of the multi-server code are highly appreciated. I haven't been able to fully wrap my head around it yet, and I guess it is hardest to get right when things fail in between reading/writing chunks. It is hard to test correctly as well.
lib/dalli/protocol/base.rb (outdated)
```
@@ -143,6 +145,14 @@ def quiet?
  end
  alias multi? quiet?

  def pipelined_get_request(keys)
    req = +''
```
I know this line was just copied and moved from pre-existing code, but looking at it, we can almost certainly reduce allocations by pre-allocating req
to a reasonable size. Something like `keys.size * 24 + keys.map(&:size).sum`
for binary? And something like `keys.size * 18 + keys.map(&:size).sum`
for meta.
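A sketch of what that could look like, using `String.new(capacity:)`; the per-key overhead is the estimate above (not a verified constant), and `quiet_get_request` stands in for the real per-key encoder:

```ruby
# Pre-allocate the request buffer so appending each key's request does
# not trigger repeated reallocations as the string grows.
def pipelined_get_request(keys)
  per_key_overhead = 24 # estimate for binary; ~18 (or 17?) for meta
  req = String.new(capacity: keys.size * per_key_overhead + keys.sum(&:size))
  keys.each { |key| req << quiet_get_request(key) } # hypothetical encoder
  req
end
```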
Will do. But it seems that it should be 17 for meta; at least it was in my tests (non-base64, which will completely break this calculation).
It does not seem to make a significant difference in speed, though.
I have another idea for a performance optimization: start off by generating only as many bytes as are needed for the first chunk and send it, then pre-calculate upcoming bytes whenever the select blocks (or the socket is writable and no pre-calculated data is available).
This means the server can start generating the response earlier, Dalli would use less memory, and time that is "lost" waiting for I/O could be used for computation.
But it will be a bit harder to do elegantly without a library like async or eventmachine, and I don't know if it is worth the added complexity. Anyway, I would put this in an extra PR once this one is finished. A rough sketch of the idea follows.
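Very roughly, and only to make the idea concrete (`encode_get` is a hypothetical per-key encoder; the real implementation would live inside the existing nonblocking write loop):

```ruby
require 'io/wait'

# Sketch: encode only enough bytes for the next chunk, and use the time
# spent waiting for a writable socket to pre-encode the following one.
def write_request_lazily(socket, keys, chunk_size)
  queue   = keys.dup
  pending = +'' # encoded bytes not yet written
  until queue.empty? && pending.empty?
    # Top up the buffer to one chunk before attempting a write.
    pending << encode_get(queue.shift) while pending.bytesize < chunk_size && !queue.empty?
    written = socket.write_nonblock(pending, exception: false)
    if written == :wait_writable
      # Socket is busy: pre-encode the next chunk while we wait.
      pending << encode_get(queue.shift) while pending.bytesize < 2 * chunk_size && !queue.empty?
      socket.wait_writable
    else
      pending = pending.byteslice(written, pending.bytesize - written)
    end
  end
end
```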
It's very possible I got the number wrong for meta (I did the calculation quickly), but just make sure you're including the separator.
I think the benefit of that optimization is unlikely to be worth the tradeoff, but we can discuss.
I think that's a neat idea for an optimization, but yeah, it feels like it could be an improvement for a future PR.
The `.map(&:size).sum` is faster in my little benchmark, but only by about 10 nanoseconds per key.
> The `.map(&:size).sum` is faster in my little benchmark, but only by about 10 nanoseconds per key.

I had the (naive) assumption that the cost of allocating an array should be avoided, not knowing that it is outweighed by doing the sum in C instead of in Ruby. I should have remembered Knuth.
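For anyone who wants to repeat the measurement, a quick stdlib comparison along these lines (numbers will vary with Ruby version and key sizes; note that `keys.sum(&:size)` avoids the intermediate array entirely):

```ruby
require 'benchmark'

keys = Array.new(1_000) { |i| "user:#{i}" }
N = 1_000

Benchmark.bm(10) do |x|
  x.report('map + sum') { N.times { keys.map(&:size).sum } }
  x.report('sum(&blk)') { N.times { keys.sum(&:size) } }
  x.report('each loop') do
    N.times do
      total = 0
      keys.each { |k| total += k.size }
    end
  end
end
```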
There's a bit of "devil in the details", but directionally this looks correct. Thanks for drafting, @marvinthepa. With a little back and forth we can almost certainly get this in.
Happy to give back and spend some company time on a library we have been using for a few years now.
Force-pushed from 03ad88e to dac656d
WIP, as it contains a few FIXMEs that need to be addressed (especially error handling).
Uploaded anyway as a basis for discussion.